Friday, 5 January 2018

WRI$_OPTSTAT_HISTHEAD_HISTORY and SM/OPTSTAT in SYSAUX Tablespace. Why is getting full and how to reorg it.

The SYSAUX Tablespace can get full due to many reasons but in my case it was due SM/OPSTATS segments(WRI$_OPTSTAT_HISTHEAD_HISTORY and their indexes) which occupied most of the space. How to fix the issue and avoid adding more and more space to the tablespace?. This post shows the root cause and how to repair it and it's based on real production datawarehouse(30TB) running on Exadata X4-2. DB Version is 11.2.0.4 BP20.

Let start with standard query checking the "occupants" in SYSAUX tablespace:

set linesize 120
set pagesize 100
COLUMN "Item" FORMAT A25
COLUMN "Space Used (GB)" FORMAT 999.99
COLUMN "Schema" FORMAT A25
COLUMN "Move Procedure" FORMAT A40
SELECT  occupant_name "Item",
    space_usage_kbytes/1048576 "Space Used (GB)",
    schema_name "Schema",
    move_procedure "Move Procedure"
FROM v$sysaux_occupants
ORDER BY 2 desc
/

Item                      Space Used (GB) Schema                    Move Procedure
------------------------- --------------- ------------------ ----------------------------------------
SM/OPTSTAT                         132.57 SYS
SM/AWR                              47.35 SYS
JOB_SCHEDULER                        7.79 SYS
SM/ADVISOR                           2.47 SYS
XDB                                   .06 XDB                XDB.DBMS_XDB.MOVEXDB_TABLESPACE
...
...
AUDIT_TABLES                          .00 SYS                DBMS_AUDIT_MGMT.move_dbaudit_tables
ORDIM                                 .00 ORDSYS             ordsys.ord_admin.move_ordim_tblspc

Almost 132GB is occupied by SM/OPSTATS and WRI$_OPTSTAT_HISTHEAD_HISTORY table with two indexes on it:

select segment_name,segment_type,trunc(bytes/1024/1024/1024) gb from dba_segments where segment_name in ('WRI$_OPTSTAT_HISTHEAD_HISTORY','I_WRI$_OPTSTAT_HH_OBJ_ICOL_ST','I_WRI$_OPTSTAT_HH_ST');

SEGMENT_NAME                     SEGMENT_TYPE               GB
-------------------------------- ------------------ ----------
WRI$_OPTSTAT_HISTHEAD_HISTORY    TABLE                      57
I_WRI$_OPTSTAT_HH_OBJ_ICOL_ST    INDEX                      37
I_WRI$_OPTSTAT_HH_ST             INDEX                      27

Next we check the retention of statistic history, how long should they be stored in the database(default is 1 month):

select dbms_stats.get_stats_history_retention from dual;
GET_STATS_HISTORY_RETENTION
---------------------------
                         31


So we have here default value of 1 month but in my database we have history of one year:

SQL> select /*+full(t) parallel(8)*/ trunc(sysdate)-trunc(min(savtime)) from sys.WRI$_OPTSTAT_HISTHEAD_HISTORY t;
TRUNC(SYSDATE)-TRUNC(MIN(SAVTIME))
----------------------------------
                               350

Why? The default job which purge this table is limited to 5 minutes. And if it could not finish on time, then you would not get any information in alertlog about aborted job and the space consumption in SYSAUX will grow.

In internal table wri$_optstat_opr which store the information about gathering/restoring/purging stats, we don's see anything:

select count(*) from sys.wri$_optstat_opr where operation='purge_stats';

  COUNT(*)
----------
         0

Bare in mind that WRI$_OPTSTAT_HISTHEAD_HISTORY table stores history information about column statistics per table/partition/subpartitions, so for example for partitioned table with 100 partition with 15 columns one run of gather_table_stats will generate = 100 x 15 +1x15 = 1515 rows!

I've checked this with simple test on in my VirtaulBox lab:

SQL> exec dbms_stats.purge_stats(sysdate);
PL/SQL procedure successfully completed.

SQL> select count(*) from sys.WRI$_OPTSTAT_HISTHEAD_HISTORY t;


  COUNT(*)
----------
         0


SQL> create table table_purge partition by hash(object_id) partitions 100 as select * from dba_objects;
Table created.

SQL> select count(*) from dba_tab_columns where table_name='TABLE_PURGE';
  COUNT(*)
----------
        15


SQL> exec dbms_stats.gather_table_stats('SYS','TABLE_PURGE');
PL/SQL procedure successfully completed.

SQL> select count(*) from sys.WRI$_OPTSTAT_HISTHEAD_HISTORY t;
  COUNT(*)
----------
      1515

Imagine that is pretty common(like in my production datawarehouse) that after some application release+bunch of batch loading jobs database gets some new columns and thousands of partition/subpartition. After it, lets gather the statistics for all your tables and you have problem:) Purge job will not finish within the limit to "DELETE FROM".

So what is the next step? We use a simple loop to remove old entries. This step can take lot of time as under the hood a simple delete from ... is running.

begin
for i in reverse 31..360
loop
dbms_application_info.set_action(sysdate-i);
dbms_stats.purge_stats(sysdate-i);
end loop;
end;
/

Once it finish, you can check how many empty block the table has:

set serveroutput on
exec dbms_output.enable(1000000);

declare
   unf number;
   unfb number;
   fs1 number;
   fs1b number;
   fs2 number;
   fs2b number;
   fs3 number;
   fs3b number;
   fs4 number;
   fs4b number;
   full number;
   fullb number;

   own dba_tables.owner%type;
   tab dba_tables.table_name%type;

   yesno varchar2(3);

   type parts is table of dba_tab_partitions%rowtype;
   partlist parts;

   type cursor_ref is ref cursor;
   c_cur cursor_ref;

begin

   own:=upper('&owner');
   tab:=upper('&table_name');
   dbms_output.put_line('--------------------------------------------------------------------------------');

   open c_cur for select partitioned from dba_tables
 where owner=own and table_name=tab;
   fetch c_cur into yesno;
   close c_cur;

   dbms_output.put_line('Owner:     '||own);
   dbms_output.put_line('Table:     '||tab);

   dbms_output.put_line('------------------------------------------------');

   if yesno='NO'
   then
     dbms_space.space_usage(own,tab,'TABLE',unf,unfb,fs1,fs1b,fs2,fs2b,fs3,fs3b,fs4,fs4b,full,fullb);
     dbms_output.put_line('unf: '||unf||' fs1: '||fs1||' fs2: '||fs2||' fs3: '||fs3||' fs4: '||fs4||' full: '||full);

   else
     open c_cur for select * from dba_tab_partitions
       where table_owner=own and table_name=tab;
     fetch c_cur bulk collect into partlist;
     close c_cur;
    
     for i in partlist.first .. partlist.last
     loop
       dbms_space.space_usage(partlist(i).table_owner,partlist(i).table_name,'TABLE PARTITION',unf,unfb,fs1,fs1b,fs2,fs2b,fs3,fs3b,fs4,fs4b,full,fullb,partlist(i).partition_name);
       dbms_output.put_line('Partition: '||partlist(i).partition_name);
       dbms_output.put_line('unf: '||unf||' fs1: '||fs1||' fs2: '||fs2||' fs3: '||fs3||' fs4: '||fs4||' full: '||full);
     end loop;

   end if;
   dbms_output.put_line('--------------------------------------------------------------------------------');

end;
/
--------------------------------------------------------------------------------
Owner:     SYS
Table:     WRI$_OPTSTAT_HISTHEAD_HISTORY
------------------------------------------------
unf: 17204 fs1: 109 fs2: 152 fs3: 180 fs4: 6830726 full: 708823
--------------------------------------------------------------------------------

fs4 means empty blocks. We have 6830726 of them after purging.
As the most of space is empty lets reorganize the table and rebuild Function-Based Indexes:

alter index "SYS"."I_WRI$_OPTSTAT_HH_ST" unusable;
alter index "SYS"."I_WRI$_OPTSTAT_HH_OBJ_ICOL_ST" unusable;
alter table "SYS"."WRI$_OPTSTAT_HISTHEAD_HISTORY" move parallel 8;
alter index "SYS"."I_WRI$_OPTSTAT_HH_ST" rebuild online parallel 8;
alter index "SYS"."I_WRI$_OPTSTAT_HH_OBJ_ICOL_ST" rebuild online parallel 8;
alter index "SYS"."I_WRI$_OPTSTAT_HH_ST" noparallel;
alter index "SYS"."I_WRI$_OPTSTAT_HH_OBJ_ICOL_ST" noparallel;

At the end we can check the size of this 3 segments(table+2 indexes) and check the block allocation from WRI$_OPTSTAT_HISTHEAD_HISTORY table:

select segment_name,segment_type,trunc(bytes/1024/1024/1024) gb from dba_segments where segment_name in ('WRI$_OPTSTAT_HISTHEAD_HISTORY','I_WRI$_OPTSTAT_HH_OBJ_ICOL_ST','I_WRI$_OPTSTAT_HH_ST');

SEGMENT_NAME                    SEGMENT_TYPE               GB
------------------------------- ------------------ ----------
WRI$_OPTSTAT_HISTHEAD_HISTORY   TABLE                       5
I_WRI$_OPTSTAT_HH_OBJ_ICOL_ST   INDEX                       2
I_WRI$_OPTSTAT_HH_ST            INDEX                       1


--------------------------------------------------------------------------------
Owner:     SYS
Table:     WRI$_OPTSTAT_HISTHEAD_HISTORY
------------------------------------------------
unf: 0 fs1: 0 fs2: 0 fs3: 0 fs4: 0 full: 712575

--------------------------------------------------------------------------------
Before 132GB and after 8GB. Not bad.


Item                      Space Used (GB) Schema                    Move Procedure
------------------------- --------------- ------------------------- ----------------------------------------
SM/AWR                              47.93 SYS
SM/OPTSTAT                          19.62 SYS
JOB_SCHEDULER                        7.79 SYS
SM/ADVISOR                           2.49 SYS
...
...
AUDIT_TABLES                          .00 SYS                       DBMS_AUDIT_MGMT.move_dbaudit_tables
ORDIM                                 .00 ORDSYS                    ordsys.ord_admin.move_ordim_tblspc