* This page is the English translation of the original Italian blog post I wrote in the ICTeam blog at the following link
https://www.icteam.it/blog/online-index-statistics-gathering-for-bulk-loads-pitfall-on-oracle-19c
From the Oracle 9i version when a new index is created, statistics are collected also on the new created object in a silent and completely automatic way. The same principle is used starting from the Oracle 12c version on the direct path operations as CTAS (Create Table AS Select) and IAS (Insert AS Select) on empty tables.
With Oracle 19c this functionality has been extended to indexes that will benefit of the “on the fly” re-gathered statistics with CTAS and IAS operations.
A new row source has been added in execution plan
Indications on the Online Statistics Gathering are also provided in execution plan, a new row source has been added (OPTIMIZER STATISTICS GATHERING ):
SQL> EXPLAIN PLAN FOR 2 CREATE TABLE t (id1, id2, id3, str1, CONSTRAINT t_pk PRIMARY KEY (id1)) AS SELECT rownum id1, rownum id2, rownum id3, dbms_random.string('a', 10) str1 FROM dual connect by level <= 100000; SQL> SELECT * FROM table(dbms_xplan.display(format=>'basic')); ------------------------------------------------- | Id | Operation | Name | ------------------------------------------------- | 0 | CREATE TABLE STATEMENT | | | 1 | LOAD AS SELECT | T | | 2 | OPTIMIZER STATISTICS GATHERING | | | 3 | COUNT | | | 4 | CONNECT BY WITHOUT FILTERING | | | 5 | FAST DUAL | | -------------------------------------------------
The presence of the new row source in the execution plan doesn’t give any garantee on the runtime online statistics executions.
Are online statistics collected on any type of index?
Why not, I thought. I imagine that Online Statistics Gathering in 19c does something like a call to dbms_stats.gather_table_stats with cascade=> true parameter so that the statistics can also be calculate on indexes. Running some test I found that Online Statistics Gathering doesn’t run on all type of available indexes.
The following test run a CTAS with three column (id1,id2 e id3) : on field id1 will be built the PK index, on field id2 will be created one UNIQUE index UNIQUE and on field id3 will be created an index with duplicate values. The Oracle database version used is 19.5:
SQL> select banner_full from v$version; BANNER_FULL ------------------------------------------------------------------------ Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production Version 19.5.0.0.0
I have created a table T:
SQL> CREATE TABLE t (id1, id2, id3, str1, CONSTRAINT t_pk PRIMARY KEY (id1)) AS SELECT rownum id1, rownum id2, rownum id3, dbms_random.string('a', 10) str1 FROM dual connect by level <= 100000;
And I look at the statistics recalculation time plus other reference values:
SQL> SELECT table_name, num_rows, blocks, last_analyzed, stale_stats FROM USER_TAB_STATISTICS;
TABLE_NAME NUM_ROWS BLOCKS LAST_ANALYZED STALE_STATS ----------- ---------- ---------- ---------------------- ----------- T 100000 441 13-nov-2019 15:45:32 NO
Next I created a UNIQUE index and a normal B-Tree index; at the end, the table T will have three indexes:
SQL> CREATE UNIQUE INDEX t_unique on t(id2); SQL> CREATE INDEX t_no_unique on t(id3); SQL> SELECT index_name, last_analyzed, stale_stats FROM USER_IND_STATISTICS WHERE table_name='T'; INDEX_NAME LAST_ANALYZED STALE_STATS -------------- ---------------------- ----------- T_PK 13-nov-2019 15:45:32 NO T_UNIQUE 13-nov-2019 15:45:32 NO T_NO_UNIQUE 13-nov-2019 15:45:32 NO
One of the Online Statistics Gathering prerequisites on the direct path mode operations is to have any data in the table, the table must be empty as indicated in the Oracle documentation:
“Specifically, bulk loads do not gather statistics automatically when any of the following conditions applies to the target table, partition, or subpartition:
The object contains data. Bulk loads only gather online statistics automatically when the object is empty. “
If you want to see all the other conditions, refer to the following link:
I then execute a TRUNCATE TABLE on table T to put myself in the indicated conditions, the table must not contain data:
SQL> truncate table t;
After executing the TRUNCATE statement the statistics on the table and on the indexes become STALE:
SQL> SELECT table_name, num_rows, blocks, last_analyzed, stale_stats FROM USER_TAB_STATISTICS; TABLE_NAME NUM_ROWS BLOCKS LAST_ANALYZED STALE_STATS ------------ ---------- ---------- ---------------------- ----------- T 100000 441 13-nov-2019 15:45:32 YES SQL> SELECT index_name, last_analyzed, stale_stats FROM USER_IND_STATISTICS WHERE table_name='T'; INDEX_NAME LAST_ANALYZED STALE_STATS ------------ ---------------------- ----------- T_PK 13-nov-2019 15:45:32 YES T_UNIQUE 13-nov-2019 15:45:32 YES T_NO_UNIQUE 13-nov-2019 15:45:32 YES
I execute an insert in direct path:
SQL> INSERT /*+ append */ INTO t SELECT rownum id1, rownum id2, rownum id3, dbms_random.string('a', 10) str1 FROM dual connect by level <= 100000; 100000 righe create. SQL> commit; Commit completato SQL> SELECT table_name, num_rows, blocks, last_analyzed, stale_stats FROM USER_TAB_STATISTICS; TABLE_NAME NUM_ROWS BLOCKS LAST_ANALYZED STALE_STATS ------------ ---------- ---------- --------------------- ----------- T 100000 441 13-nov-2019 15:45:50 NO SQL> SELECT index_name, last_analyzed, stale_stats FROM USER_IND_STATISTICS WHERE table_name='T'; INDEX_NAME LAST_ANALYZED STALE_STATS -------------- ----------------------- ----------- T_PK 13-nov-2019 15:45:50 NO T_UNIQUE 13-nov-2019 15:45:50 NO T_NO_UNIQUE 13-nov-2019 15:45:32 YES
after executing the insert in direct path, I found an “anomaly”: online statistics were performed as expected on the table, on the primary key and on the UNIQUE index, but they were not performed on the B-tree index, that admits duplicate values (T_NO_UNIQUE index) .
The statistics on the T_NO_UNIQUE index remain STALE, Online Statistics Gathering were not performed on this index, the time indicated in the last_analyzed column remained unchanged.
Tracing dbms_stats on online statistics gathering
I then verified the execution of online statistics by executing a trace dbms_stats:
modify the file name and open the trace
alter session set tracefile_identifier = 'online_stats_trace'; exec dbms_stats.set_global_prefs('TRACE',4+8+16+32);
run a direct path insert
INSERT /*+ append */ INTO t SELECT rownum id1, rownum id2, dbms_random.string('a', 10) str1 FROM DBA_OBJECTS; commit;
close the trace
exec dbms_stats.set_global_prefs('TRACE',0);
In the output of the trace file I find the indication that the Online Statistics Gathering for bulk load have been executed, the following sentence found in the trace file confirms it:
“DBMS_STATS: postprocess online optimizer stats gathering for SETTEMBRINO.T”
below is the extract of the trace file:
Tracing online statistics with 10046 event
In the trace file generated by activating 10046 event I then identified a call to two undocumented DBMS_STATS procedures which should then be the two procedures that deal with running Online Statistics Gathering:
dbms_stats.postprocess_stats(:owner, :tabname, :tobj, :fobjn, :flags, :rawstats, :selmap, :clist, null, null);
and
dbms_stats.postprocess_indstats(:ilist, :flags);
I don’t think I’m wrong by saying that postprocess_stats is the DBMS_STATS procedure that performs online statistics on the table, while the postprocess_indstats is the procedure that deals with executing them on the indexes. To confirm what I just said, I identified two postprocess_stats calls, one immediately after the CTAS and one after the IAS and a postprocess_indstats call immediately after the IAS.
Summary
From the tests it would seem therefore that Online Statistics Gathering for Bulk Loads are not executed on B-tree indexes that allow duplicate values. I can’t find plausible reasons to justify the behavior found in the tests, so I have to assume that it is a bug. Moreover, Online Statistics Gathering are not performed even on bitmap and constraint deferrable indexes.
Any comments and/or corrections they are as always welcome:)
Ciao Donatello,
I just noticed that in 20c it works as expected. I.e. object statistics are gathered also for non-unique indexes.
Cheers,
Chris
Ciao Chris,
Thanks for te info.
Se you soon.
Ciao,
Donatello