Decoding DBA_TAB_COLUMNS high_value and low_value
13/11/2015 5 Comments
When performance tuning, it can be important to understand the statistics in the database. It is worth noting that every column records it lowest (low_value) and highest (high_value) values in DBA_TAB_COLUMNS when you gather stats, like a mini-histogram. If the optimizer believes it knows the range of values within a column, plus the number of different values (cardinality), it can make guesses about the suitability and cost of index access when a predicate references that column.
This is especially important when looking at values outside of the low_value and high_value range, as Oracle crudely calculates the probability that such a value exists. It does this by identifying the total range of values (high_value – low_value) and lowering the calculated cost for predicates increasingly further outside of the known range until it decides that there’s no realistic chance of hitting the value you have asked for.
For example:
If low_value was 40 and high_value was 50, this gives us a range of 10. Therefore, any predicate selecting outside of the values 40-to-50 will have it’s cost lowered by a percentage of the range (10). If we assume an access cost of 1000, this is how the index access cost may change.
where X = 40 ; cost = 1000 - within low-to-high value where X = 50 ; cost = 1000 - within low-to-high value where X = 55 ; cost = 500 - outsize of low-to-high value by 50% of the range where X = 32 ; cost = 200 - outsize of low-to-high value by 80% of the range where X = 61 ; cost = 1 - outside of low-to-high value by more than 100% of the range where X = 29 ; cost = 1 - outside of low-to-high value by more than 100% of the range
This is called statistical decay – the calculated lowering of the cost because Oracle doesn’t think it will find the value in the index.
Querying the LOW_VALUE and HIGH_VALUE
The low_value and high_value in DBA_TAB_COLUMNS is stored in an Oracle internal format, so you need to use DBMS_STATS.CONVERT_RAW_VALUE to turn it into a human-readable format. Here’s an example:
Table Definition:
08:08:24 SQL> desc nc1 Name Null? Type ----- ----- ---- C1 DATE C2 TIMESTAMP(6) C3 VARCHAR2(10) C4 NUMBER C5 BINARY_FLOAT C6 BINARY_DOUBLE
Select all of the data out of the table so we can see what we have
C1 C2 C3 C4 C5 C6 ------------------- ----------------------------- ---------- ---------- ---------- ---------- 21/07/0001 01:02:03 21/07/0001 01:02:03.456789 Lower 1 1.0E+000 1.0E+000 21/07/9999 01:02:03 21/07/9999 01:02:03.456789 Upper 99 9.9E+001 9.9E+001 31/12/2015 00:00:00 12/11/2015 08:08:24.586342 Now 50 5.0E+001 5.0E+001 31/12/2015 00:00:00 12/11/2015 08:08:24.585173 Now 50 5.0E+001 5.0E+001 31/12/2015 00:00:00 12/11/2015 08:08:24.579977 Now 50 5.0E+001 5.0E+001 31/12/2015 00:00:00 12/11/2015 08:08:24.585896 Now 50 5.0E+001 5.0E+001
Stats have been gathered, so we can look at DBA_TAB_COLUMNS (or USER_TAB_COLUMNS), and decode the high and low values
TABLE COLUM DATA_TYPE LOW_VALUE HIGH_VALUE LOW_DECODE HIGH_DECODE ----- ----- --------------- ------------------------------ ------------------------------ ------------------------------ ------------------------------ NC1 C1 DATE 64650715020304 C7C70715020304 0001-07-21 01:02:03 9999-07-21 01:02:03 NC1 C2 TIMESTAMP(6) 646507150203041B3A0C08 C7C707150203041B3A0C08 0001-07-21 01:02:03 9999-07-21 01:02:03 NC1 C3 VARCHAR2 4C6F776572 5570706572 Lower Upper NC1 C4 NUMBER C102 C164 1 99 NC1 C5 BINARY_FLOAT BF800000 C2C60000 1.0E+000 9.9E+001 NC1 C6 BINARY_DOUBLE BFF0000000000000 C058C00000000000 1.0E+000 9.9E+001
So how did I decode all of that? I used the 12C capability of putting in-line functions in a SQL statement using subquery refactoring. Pre-12C, you could create the functions as objects [UPDATE: Code modified to add decode for partitions]
WITH FUNCTION raw_to_date(i_var in raw) return date as o_var date; begin dbms_stats.convert_raw_value(i_var,o_var); return o_var; end; FUNCTION raw_to_number(i_var in raw) return number as o_var number; begin dbms_stats.convert_raw_value(i_var,o_var); return o_var; end; FUNCTION raw_to_varchar2(i_var in raw) return varchar2 as o_var varchar2(32767); begin dbms_stats.convert_raw_value(i_var,o_var); return o_var; end; FUNCTION raw_to_float(i_var in raw) return binary_float as o_var binary_float; begin dbms_stats.convert_raw_value(i_var,o_var); return o_var; end; FUNCTION raw_to_double(i_var in raw) return binary_double as o_var binary_double; begin dbms_stats.convert_raw_value(i_var,o_var); return o_var; end; select utc.table_name,utc.column_name,utc.data_type,upcs.partition_name, decode(substr(utc.data_type,1,9), 'DATE' , to_char(raw_to_date(utc.low_value),'YYYY-MM-DD HH24:MI:SS') , 'TIMESTAMP', to_char(raw_to_date(utc.low_value),'YYYY-MM-DD HH24:MI:SS') , 'NUMBER', to_char(raw_to_number(utc.low_value)), 'VARCHAR2', to_char(raw_to_varchar2(utc.low_value)), 'BINARY_FL', to_char(raw_to_float(utc.low_value)), 'BINARY_DO', to_char(raw_to_double(utc.low_value)),'Unknown') low_decode, decode(substr(utc.data_type,1,9), 'DATE' , to_char(raw_to_date(utc.high_value),'YYYY-MM-DD HH24:MI:SS') , 'TIMESTAMP', to_char(raw_to_date(utc.high_value),'YYYY-MM-DD HH24:MI:SS') , 'NUMBER', to_char(raw_to_number(utc.high_value)), 'VARCHAR2', to_char(raw_to_varchar2(utc.high_value)), 'BINARY_FL', to_char(raw_to_float(utc.high_value)), 'BINARY_DO', to_char(raw_to_double(utc.high_value)),'Unknown') high_decode, decode(substr(utc.data_type,1,9), 'DATE' , to_char(raw_to_date(upcs.low_value),'YYYY-MM-DD HH24:MI:SS') , 'TIMESTAMP', to_char(raw_to_date(upcs.low_value),'YYYY-MM-DD HH24:MI:SS') , 'NUMBER', to_char(raw_to_number(upcs.low_value)), 'VARCHAR2', to_char(raw_to_varchar2(upcs.low_value)), 'BINARY_FL', to_char(raw_to_float(upcs.low_value)), 'BINARY_DO', to_char(raw_to_double(upcs.low_value)),'Unknown') part_low_decode, decode(substr(utc.data_type,1,9), 'DATE' , to_char(raw_to_date(upcs.high_value),'YYYY-MM-DD HH24:MI:SS') , 'TIMESTAMP', to_char(raw_to_date(upcs.high_value),'YYYY-MM-DD HH24:MI:SS') , 'NUMBER', to_char(raw_to_number(upcs.high_value)), 'VARCHAR2', to_char(raw_to_varchar2(upcs.high_value)), 'BINARY_FL', to_char(raw_to_float(upcs.high_value)), 'BINARY_DO', to_char(raw_to_double(upcs.high_value)),'Unknown') part_high_decode FROM user_tab_columns utc LEFT OUTER JOIN user_part_col_statistics upcs on (utc.table_name = upcs.table_name AND utc.column_name = upcs.column_name) order by utc.table_name,utc.column_name,upcs.partition_name;
NOTE: There is no function to decode TIMESTAMPs. You can use them as input to the DATE convert (as above) but you will lose precision, or you can convert them manually:
select column_name, low_value, lpad(to_char(100*to_number(substr(low_value,1,2),'XX') +to_number(substr(low_value,3,2),'XX')) - 10100,4,'0') ||'-' ||lpad( to_number(substr(low_value, 5,2),'XX'),2,'0')||'-' ||lpad( to_number(substr(low_value, 7,2),'XX'),2,'0')||' ' ||lpad((to_number(substr(low_value, 9,2),'XX')-1),2,'0')||':' ||lpad((to_number(substr(low_value,11,2),'XX')-1),2,'0')||':' ||lpad((to_number(substr(low_value,13,2),'XX')-1),2,'0')||'.' ||lpad((to_number(substr(low_value,15,8),'XXXXXXXX')-1),8,'0') low_decode, high_value, lpad(to_char(100*to_number(substr(high_value,1,2),'XX') +to_number(substr(high_value,3,2),'XX')) - 10100,4,'0') ||'-' ||lpad( to_number(substr(high_value, 5,2),'XX'),2,'0')||'-' ||lpad( to_number(substr(high_value, 7,2),'XX'),2,'0')||' ' ||lpad((to_number(substr(high_value, 9,2),'XX')-1),2,'0')||':' ||lpad((to_number(substr(high_value,11,2),'XX')-1),2,'0')||':' ||lpad((to_number(substr(high_value,13,2),'XX')-1),2,'0')||'.' ||lpad((to_number(substr(high_value,15,8),'XXXXXXXX')-1),8,'0') high_decode from user_tab_columns where data_type like 'TIMESTAMP%' / COLUM LOW_VALUE LOW_DECODE HIGH_VALUE HIGH_DECODE ----- ------------------------------ ------------------------------ ------------------------------ ------------------------------ C2 646507150203041B3A0C08 0001-07-21 01:02:03.45678899 C7C707150203041B3A0C08 9999-07-21 01:02:03.45678899
Neil,
a nice idea to use PL/SQL in the with clause in 12c. In earlier versions (and if I am allowed to create my own functions in the database) I think I would prefer not to create a lot of single type functions but rather use Greg Rahn’s display_raw: http://structureddata.org/2007/10/16/how-to-display-high_valuelow_value-columns-from-user_tab_col_statistics/.
Regards
Martin
LikeLike
Getting ‘DBA’ objects created is sometimes close to impossible, especially at short notice when things are going wrong and you’re trying to analyze a performance problem.
Gregs code is good – never seen it before, or I might not have written this 🙂
LikeLike
Hello Neil
this is all very good, but how do you decode them low and high values in 11g?
LikeLike
For dates and timestamps, use the code at the end of the blog post (remove the last decode if necessary).
For most of the other you can use UTL_RAW to resolve the values, something like…
Neil
LikeLike
thanks for routine. have modified it with a WHERE clause to target values and ordering it by the column order rather than name.
FROM dba_tab_columns utc LEFT OUTER JOIN dba_part_col_statistics upcs
ON (utc.table_name = upcs.table_name
AND utc.column_name = upcs.column_name)
WHERE UTC.OWNER = ”
AND utc.table_name = ”
–AND upcs.COLUMN_NAME = ”
–AND upcs.COLUMN_NAME LIKE ‘%’
–AND upcs.PARTITION_NAME = ”
–AND upcs.PARTITION_NAME LIKE ‘%’
order by utc.table_name,
UTC.COLUMN_ID,
— utc.column_name,
upcs.partition_name;
LikeLike