Preserving Statistics During Export / Import
Some database migrations are still performed using export / import (either datapump or the original export / import); for both of these the optimizer statistics are preserved during the export / import but there are a few subtleties to be aware of. For example, if you run dbms_stats.gather_schema_stats shortly after an import you will typically lose most of the existing column histograms, obviously if the histograms are important their loss will cause problems.
Let us discuss (briefly) why / how this could happen. The default statistics gathering routine in Oracle (since version 10) is for the database to run a scheduled job during a maintenance window. In 10g you can see this job via
select owner, job_name, enabled, state, last_start_date from dba_scheduler_jobs where job_name = ‘GATHER_STATS_JOB’
in 11g via
select task_name, status, last_good_date from dba_autotask_task where task_name = ‘gather_stats_prog’
This job (effectively) calls dbms_stats with the default of Oracle recommended setting for the ‘method_opt’ parameter which is ‘for all columns size auto’ which according to the documentation “this setting means that DBMS_STATS decides which columns to add histogram to where it believes that they may help to produce a better plan.” Although it is not documented in great detail the ‘decides’ part of this depends upon two things
a) If Oracle detects ’skew’ in the values in a column
b) If that column is used in the WHERE clause
In other words and here is the key point, by default, Oracle will decide what columns should have histograms based on both the skew and the workload and immediately after an initial import the workload history is missing. This ‘workload history’ is actually recorded in the table sys.col_usage$, you can inspect it using the query
select O.OBJECT_NAME, C.NAME, CU.* from SYS.COL$ C, SYS.COL_USAGE$ CU, DBA_OBJECTS O where C.OBJ# = CU.OBJ# AND C.INTCOL# = CU.INTCOL# AND C.OBJ# = O.OBJECT_ID AND O.OWNER = 'YOUR_USER' order by 1,2
Given this here are some suggestions, the examples use a single schema but obviously could be extended to multiple schemas easily enough.
Prior to export save away the existing statistics using the commands
You should not need to restore these statistics but it is better to have them available if required
Prior to export, record the number of histograms that currently exist using
where histogram <> ‘NONE’ and
owner = ‘YOUR_USER’
After completion of the import, check the same histograms exist i.e. run the same query as in 2 on the imported database. If the number of histograms differs significantly or you encountered other issues with the statistics you can import the statistics explicity via
under most conditions you will not need to do this.
If the number of histograms match then (optionally) re-gather the statistics keeping the existing histograms using the ‘repeat’ option for method_opt i.e.
METHOD_OPT=>’FOR ALL COLUMNS SIZE REPEAT’);
(Obviously you should adjust the estimate_percent depending upon your data volume).
Confirm that you have the same histograms as before using the query in 2.
You will now have updated statistics but have preserved the previously generated histograms. Over time Oracle will fill in the col_usage$ information based on the workload which should ensure the histograms that are valuable are preserved.