Knowledge increases exponentially. Today, you probably own more books than great universities of times past—Cambridge University owned less than two hundred books in the fifteenth century. First came the invention of writing, then alphabets, then paper, then the printing press, then mechanization. Each step caused an exponential increase in the collective human knowledge. In our generation, Al Gore invented the internet and the last barriers to the spread of knowledge have been broken. Today, everybody has the ability to contribute, communicate, and collaborate. We are all caught up in a tsunami, an avalanche, a conflagration, a veritable explosion of knowledge for the betterment of humankind. This is the blog of the good folks at Database Specialists, a brave band of Oracle database administrators from the great state of California. We bid you greeting, traveler. We hope you find something of value on these pages and we wish you good fortune in your journey.

Resizing datafiles and understanding the High Water Mark

Resizing datafiles is something that I seem to do a lot of.  Usually, I am increasing them to allow for data growth but sometimes I need to shrink them to reclaim unused space.  When shrinking a datafile, you cannot make it smaller than the highest allocated extent in the datafile, aka the High Water Mark (HWM).  If you do, you will get a message like this:

alter database datafile '/oradata/DB10201/test.dbf' resize 120m;
alter database datafile '/oradata/DB10201/test.dbf' resize 120m
*
ERROR at line 1:
ORA-03297: file contains used data beyond requested RESIZE value

Below is a quick demo that shows how this works and how to determine exactly how small you can make your datafile, eliminating the trial-and-error approach.  You can also determine exactly which objects are preventing you from shrinking the datafile. First, I will create a 500MB tablespace and a couple of tables.

 create tablespace test2 datafile '/oradata/DB10201/test.dbf' size 500m uniform size 1m;
 create table test (
COL1 NUMBER,
COL2 VARCHAR2(2000),
COL3 VARCHAR2(2000),
COL4 VARCHAR2(2000)) tablespace test2;
declare
v_string varchar2(2000);|
begin
for i in 1..1000 loop
v_string := v_string||'XX';
end loop;
for i in 1..10000 loop
      insert into test values (i,v_string,v_string,v_string);     
end loop;
commit;
end;
/

 how big is the TEST table?

select sum(bytes)/1024/1024 from dba_segments where segment_name='TEST'
and tablespace_name='TEST2'; 
SUM(BYTES)/1024/1024
--------------------
                  80

What is the HWM for TEST?

select
max((block_id + blocks-1)*8192))/1024/1024 "HWM (MB)"
from dba_extents  where owner='SYS' and file_id=6
and segment_name='TEST';
HWM (MB)
----------
   81.0625

(multiply by 8192 because that is the block size.  Divide by 1024/1024 to convert to MB)

 This makes sense.  The table is 80MB and the HWM is 81MB (with a bit of overhead).  Now I will create another table in the same tablespace, moving the HWM towards the end of the datafile.

 create table TEST2 tablespace TEST2 as select * from test;

 Check the HWM again

select (max((block_id + blocks-1)*8192))/1024/1024 "HWM (MB)"
from dba_extents where file_id=6;
  HWM (MB)
----------
  160.0625

 Again, this makes sense.  Now I will drop the first table and check the HWM and the total segment size in the tablespace.

drop table test purge;
Table dropped.
select sum(bytes)/1024/1024 "Segment Size (MB)" from dba_segments
where tablespace_name='TEST2';
Segment Size (MB)
-----------------
               80
select (max((block_id + blocks-1)*8192))/1024/1024 "HWM (MB)"
from dba_extents
where file_id=6;
  HWM (MB)
----------
  160.0625

So, at this point there is only 80MB worth of data in the tablespace but the HWM is 160MB.  If you came into this situation not knowing the history, it would reasonable to think you could shrink the datafile to something a little more than 80MB.  You would be incorrect.  Because the HWM is 160MB, you won’t be able to shrink it lower than that unless you re-org the table.

alter database datafile '/oradata/DB10201/test.dbf' resize 159m;
alter database datafile '/oradata/DB10201/test.dbf' resize 159m
*
ERROR at line 1:
ORA-03297: file contains used data beyond requested RESIZE value
alter database datafile '/oradata/DB10201/test.dbf' resize 161m;
Database altered.

Now lets look at it from a different angle.  Suppose I try to shrink the datafile to 100MB and get the ORA-03297 error.  You can use this query to find out exactly which objects are preventing it from happening.

select segment_name, segment_owner from dba_extents
where file_id=6
and ((block_id + blocks-1)*8192 > 104857600;

Of course, we already know the answer because this is just a silly demo.  There are various ways to reset the HWM for a datafile but I will save that for another blog.

Use Database Profiles to help secure your database

I have seen many times over the years where a production database will experience performance problems that are caused by Adhoc queries running and consuming too many resources.  These will often be from developers trying to debug a problem, or someone in Operations trying to get a custom report, or from some other source.  In extreme cases, this can be, in effect, a classic Denial of Service attack.

One way to solve the problem is to simply restrict access to the production database, particularly if you have a copy of production available for debugging/reporting purposes.  Alternatively, you can implement Database Profiles to restrict the amount of resources any one user is allowed to consume.  They are very easy to implement and can go a long way towards securing your database against an unintentional (or possibly intentional) Denial of Service attack. 

 In order for profiles to work, you must first set the init.ora parameter resource_limit to TRUE.  You can then create profiles and assign them to users.

SQL> alter system set resource_limit=TRUE;
SQL> create profile LIMITED_PROFILE limit sessions_per_user 2;
SQL> create user PROFILE_TEST identified by abcd profile LIMITED_PROFILE;
SQL> grant create session to profile_test;

Now I connect as PROFILE_TEST user, the first two connections are fine, but the third gets:

oracle@mdlinux ~]$ sqlplus profile_test/abcd
ERROR:
ORA-02391: exceeded simultaneous SESSIONS_PER_USER limit

Now that the user can only have two sessions at a given time, I will further crank down the screws by limiting the number of Logical Reads allowed per session.  In a real world scenario, you will need to come up with a reasonable limit but for testing I will set it to 10000.

SQL> alter profile LIMITED_PROFILE limit logical_reads_per_session 10000;

And when they run some crazy query, it will stop them in their tracks

SQL> select count(*) from dba_objects a, dba_objects b;
select count(*) from dba_objects a, dba_objects b
                     *
ERROR at line 1:
ORA-02394: exceeded session limit on IO usage, you are being logged off

To see what you can limit with a profile, run this query:

SQL> select unique RESOURCE_TYPE,RESOURCE_NAME from dba_profiles
SQL> order by 1,2 asc;
 RESOURCE RESOURCE_NAME
------- --------------------------------
KERNEL   COMPOSITE_LIMIT
KERNEL   CONNECT_TIME
KERNEL   CPU_PER_CALL
KERNEL   CPU_PER_SESSION
KERNEL   IDLE_TIME
KERNEL   LOGICAL_READS_PER_CALL
KERNEL   LOGICAL_READS_PER_SESSION
KERNEL   PRIVATE_SGA
KERNEL   SESSIONS_PER_USER
PASSWORD FAILED_LOGIN_ATTEMPTS
PASSWORD PASSWORD_GRACE_TIME
PASSWORD PASSWORD_LIFE_TIME
PASSWORD PASSWORD_LOCK_TIME
PASSWORD PASSWORD_REUSE_MAX
PASSWORD PASSWORD_REUSE_TIME
PASSWORD PASSWORD_VERIFY_FUNCTION
16 rows selected.

The PASSWORD resources are used to manage passwords.  The KERNEL resources are used to manage system resources.  For security reasons, you may want to set the IDLE_TIME limit.  This will prevent users from staying logged into the database for extended periods of time with no activity.  To see which users have been assigned profiles, you can query the dba_users view:

select username, profile from dba_users;

So, if you get a chance, check out Database Profiles.  They are quick, easy and can possibly help protect your database from an unpleasant situation.

crsstat in Oracle 11g doesn’t show the instances

If you are familiar with Oracle Real Application Clusters (RAC) on 10g, you have likely seen output from crsstat that looks something like this.  This is from a 10.2.0.4 two node RAC cluster.  Notice that each instance is listed (RACDB1 and RACDB3) as well as the database itself (RACDB).

 

[oracle@node15 ~]$ crsstat
HA Resource                                   Target     State
-----------                                   ------     -----
ora.RACDB.RACDB1.inst                          ONLINE     ONLINE on node15
ora.RACDB.RACDB3.inst                          ONLINE     ONLINE on node16
ora.RACDB.db                                   ONLINE     ONLINE on node16
ora.node15.LISTENER_node15.lsnr                ONLINE     ONLINE on node15
ora.node15.gsd                                 ONLINE     ONLINE on node15
ora.node15.ons                                 ONLINE     ONLINE on node15
ora.node15.vip                                 ONLINE     ONLINE on node15
ora.node16.LISTENER_node16.lsnr                ONLINE     ONLINE on node16
ora.node16.gsd                                 ONLINE     ONLINE on node16
ora.node16.ons                                 ONLINE     ONLINE on node16
ora.node16.vip                                 ONLINE     ONLINE on node16

Run the same command on an 11.2.0.2 cluster and you will see that the Database resource is shown but not the instances. 

This is from a 11.2.0.2 two node RAC cluster.

[oracle@mdlinux-vm1 ~]$ crsstat
HA Resource                                   Target     State
-----------                                   ------     -----
ora.LISTENER.lsnr                             ONLINE     ONLINE on mdlinux-vm1
ora.LISTENER_SCAN1.lsnr                       ONLINE     ONLINE on mdlinux-vm2
ora.LISTENER_SCAN2.lsnr                       ONLINE     ONLINE on mdlinux-vm1
ora.LISTENER_SCAN3.lsnr                       ONLINE     ONLINE on mdlinux-vm1
ora.asm                                       ONLINE     ONLINE on mdlinux-vm1
ora.cvu                                       ONLINE     ONLINE on mdlinux-vm1
ora.gns                                       ONLINE     OFFLINE
ora.gns.vip                                   ONLINE     ONLINE on mdlinux-vm1
ora.gsd                                       OFFLINE    OFFLINE
ora.mdlinux-vm1.ASM2.asm                      ONLINE     ONLINE on mdlinux-vm1
ora.mdlinux-vm1.LISTENER_MDLINUX-VM1.lsnr     ONLINE     ONLINE on mdlinux-vm1
ora.mdlinux-vm1.gsd                           OFFLINE    OFFLINE
ora.mdlinux-vm1.ons                           ONLINE     ONLINE on mdlinux-vm1
ora.mdlinux-vm1.vip                           ONLINE     ONLINE on mdlinux-vm1
ora.mdlinux-vm2.ASM1.asm                      ONLINE     ONLINE on mdlinux-vm2
ora.mdlinux-vm2.LISTENER_MDLINUX-VM2.lsnr     ONLINE     ONLINE on mdlinux-vm2
ora.mdlinux-vm2.gsd                           OFFLINE    OFFLINE
ora.mdlinux-vm2.ons                           ONLINE     ONLINE on mdlinux-vm2
ora.mdlinux-vm2.vip                           ONLINE     ONLINE on mdlinux-vm2
ora.mdracdb.db                                ONLINE     ONLINE on mdlinux-vm1
ora.net1.network                              ONLINE     ONLINE on mdlinux-vm1
ora.oc4j                                      ONLINE     ONLINE on mdlinux-vm1
ora.ons                                       ONLINE     ONLINE on mdlinux-vm1
ora.scan1.vip                                 ONLINE     ONLINE on mdlinux-vm2
ora.scan2.vip                                 ONLINE     ONLINE on mdlinux-vm1
ora.scan3.vip                                 ONLINE     ONLINE on mdlinux-vm1

If you want to check the status of the instances, you can run the srvctl command.

[oracle@mdlinux-vm1 ~]$ srvctl status database -d mdracdb
Instance mdracdb2 is running on node mdlinux-vm1
Instance mdracdb1 is running on node mdlinux-vm2

So if I were to shutdown one of the instances

[oracle@mdlinux-vm1 ~]$ srvctl stop instance -i mdracdb2 -d mdracdb
[oracle@mdlinux-vm1 ~]$ srvctl status database -d mdracdb
Instance mdracdb2 is not running on node mdlinux-vm1
Instance mdracdb1 is running on node mdlinux-vm2
[oracle@mdlinux-vm1 ~]$ crsstat
HA Resource                                   Target     State
-----------                                   ------     -----
ora.LISTENER.lsnr                             ONLINE     ONLINE on mdlinux-vm1
ora.LISTENER_SCAN1.lsnr                       ONLINE     ONLINE on mdlinux-vm2
ora.LISTENER_SCAN2.lsnr                       ONLINE     ONLINE on mdlinux-vm1
ora.LISTENER_SCAN3.lsnr                       ONLINE     ONLINE on mdlinux-vm1
ora.asm                                       ONLINE     ONLINE on mdlinux-vm1
ora.cvu                                       ONLINE     ONLINE on mdlinux-vm1
ora.gns                                       ONLINE     OFFLINE
ora.gns.vip                                   ONLINE     ONLINE on mdlinux-vm1
ora.gsd                                       OFFLINE    OFFLINE
ora.mdlinux-vm1.ASM2.asm                      ONLINE     ONLINE on mdlinux-vm1
ora.mdlinux-vm1.LISTENER_MDLINUX-VM1.lsnr     ONLINE     ONLINE on mdlinux-vm1
ora.mdlinux-vm1.gsd                           OFFLINE    OFFLINE
ora.mdlinux-vm1.ons                           ONLINE     ONLINE on mdlinux-vm1
ora.mdlinux-vm1.vip                           ONLINE     ONLINE on mdlinux-vm1
ora.mdlinux-vm2.ASM1.asm                      ONLINE     ONLINE on mdlinux-vm2
ora.mdlinux-vm2.LISTENER_MDLINUX-VM2.lsnr     ONLINE     ONLINE on mdlinux-vm2
ora.mdlinux-vm2.gsd                           OFFLINE    OFFLINE
ora.mdlinux-vm2.ons                           ONLINE     ONLINE on mdlinux-vm2
ora.mdlinux-vm2.vip                           ONLINE     ONLINE on mdlinux-vm2
ora.mdracdb.db                                ONLINE     ONLINE on mdlinux-vm2
ora.net1.network                              ONLINE     ONLINE on mdlinux-vm1
ora.oc4j                                      ONLINE     ONLINE on mdlinux-vm1
ora.ons                                       ONLINE     ONLINE on mdlinux-vm1
ora.scan1.vip                                 ONLINE     ONLINE on mdlinux-vm2
ora.scan2.vip                                 ONLINE     ONLINE on mdlinux-vm1
ora.scan3.vip                                 ONLINE     ONLINE on mdlinux-vm1

 

And run crsstat again it will still show the database ONLINE but in this case it shows that it is on mdlinux-vm2 rather than mdlinux-vm1.  Other than that, you would have no idea that anything has changed.  Of course, if you have some sort of monitoring tool in place (like Database Rx) you would be notified in this type of situation.

Use SQLT to diagnose difficult SQL performance issues

 SQLT, also known as SQLTXPLAIN is a tool provided by Oracle Server Technologies Center of Expertise that can be used to diagnose why a particular SQL statement is performing poorly.  It is not like AWR or Statspack which provide a system-wide view of performance.  SQLT is very focused and works on one SQL statement at a time.  SQLT inputs one SQL statement and outputs a set of diagnostics files. These files can be used to diagnose SQL statements performing poorly.  In this blog, I will give an overview of SQLT and how to get started with it. 

In my work as a remote DBA, I am often asked to step into an unfamiliar environment to solve one problem or another.  If that problem is a complicated SQL statement that needs tuning, it can take a lot of time to manually gather all the information you may need.  Questions I would typically start asking myself in that situation:

How many and which tables are involved? 
How big are the objects involved?
How many, what type and what columns are indexed?
How accurate are the optimizer statistics? 
Are there histograms on the columns?
Are there any non-default init.ora parameters that could be causing problems?
What is the execution plan? 
Has it changed recently?

 To gather all of this information and lots, lots more and present it in an easy-to-use format, SQLT is the tool to use.  SQLT connects to the database and collects execution plans, Cost-based Optimizer CBO statistics, schema objects metadata, performance statistics, configuration parameters, and similar elements that influence the performance of the SQL being analyzed.  It collects so much information that I honestly don’t know what some of it even means.   Here is a screen shot that shows the very beginning of a SQLT report. 

 SQLT screenshot

 

The “Observations” section often has some really good information.  Go down further to get more details.

SQLT provides 5 main methods that generate diagnostics details for one SQL statement:

XTRACT Method
Use this method if you know the SQL_ID or the HASH_VALUE of the SQL to be analyzed, else use XECUTE. The SQL_ID can be found on an AWR report, and the HASH_VALUE on any SQL Trace (above the SQL text and identified by the “hv=” token).  If the SQL is still in memory, or it has been captured by AWR, then XTRACT finds it and provides a set of diagnostics files, else XTRACT errors out.

XECUTE Method
This method provides more detail than XTRACT. As the name XECUTE implies, it executes the SQL being analyzed, then it produces a set of diagnostics files. Its major drawback is that if the SQL being analyzed takes long to execute, this method will also take long.  Before you can use this XECUTE method, you have to create a text file that contains your SQL text. If the SQL includes bind variables, your file must contain the bind variable declaration and assignment. Use sqlt/input/sample/script1.sql as an example. Your SQL should contain the token /* ^^unique_id */ which is highly recommended.

XTRXEC Method
This is my preferred way to execute SQLT.  It combines the XTRACT and XECUTE methods and executes them consecutively. The XTRACT phase generates a script that contains the extracted SQL together with the binds declaration and assignment for an expensive plan found for the requested SQL statement. XTRXEC then executes the XECUTE phase using the script created by the first.  The selection of the values of the bind variables used by XTRACT to create the script is based on the peeked values at the moment the most expensive plans in memory were generated. Expensive plans are selected according to their average elapsed time.

XTRSBY Method
Use this method if you need to analyze a SQL executed on a Data Guard or stand-by read-only database. You need to know the SQL_ID or the HASH_VALUE of the SQL to be analyzed. 

XPLAIN Method
This method is based on the EXPLAIN PLAN FOR command, therefore it is blind to bind variables referenced by your SQL statement. Use this method only if XTRACT or XECUTE are not possible. 

Besides the bind peeking limitation on XPLAIN all 5 main methods provide enough diagnostics details to make an initial assessment of a SQL performing poorly. If the SQL still resides in memory or in the Automatic Workload Repository AWR use XTRACT or XTRXEC, else use XECUTE. For Data Guard or standby read-only databases use XTRSBY. Use XPLAIN only if the other methods are not feasible.

In order to get started with SQLT:

Step 1) Download SQLT.
SQLT is attached as a zip file to the Oracle Support Document “SQLT (SQLTXPLAIN) - Tool that helps to diagnose a SQL statement performing poorly [ID 215187.1]“.  Download it to any directory and unzip it

Step 2) Install SQLT:
Execute installation script sqlt/install/sqcreate.sql connected as SYS.  It will prompt you for some information and will create a new schema called SQLTXPLAIN in the database.  If you wish to run SQLT from a remote client, provide the connect string when asked.  Otherwise leave that blank.  You can optionally specify the user that executed the SQL you are trying to diagnose and whether you are licensed for the Tuning Pack and/or Diagnostic Pack.  If you are not licensed for either Diagnostics or Tuning, then SQLT installs with limited functionality.  Still very useful, but limited.  After installing SQLT, you can read the full set of instructions in the sqlt/sqlt_instructions.html file.

Step 3) Execute SQLT:
In this case I assume that I know the SQL ID of the statement that I want to investigate and it is either in memory still or has been captured by AWR.  SQLT is intended to be run as the user that originally executed the problematic SQL but you can actually log in as any user that is able to reproduce the problem.

[oracle@mdlinux run]$ sqlplus mike
SQL> start sqltxtrxec 31gappf8f7kwm

 Step 4) Review the output:
Depending on the method that you used, the output will be zipped up into one or more files.  When I used the XTRXEC method, the filenames were sqlt_s34012_sqlt_s34014_xtrxec.zip and sqlt_s34012_xtract_31gappf8f7kwm.zip.  Unzip those and you will find the html reports: sqlt_s34012_main.html and sqlt_s34014_main.html.  Open these files and you will find more information about your SQL statement than you ever knew existed!  There is also a “lite” version of the html report with a slimmed down analysis of your statements.

So I encourage you to get familiar with SQLT.  In my experience, many SQL performance problems don’t require the level of detail that SQLT produces but it is a very convenient way to gather all of the relevant information.  Also, if you open a Service Request with Oracle Support for a poorly performing SQL statement they will very likely ask you to run it.

Using v$rman_output to review your backups

In my work as a Remote DBA when I implement RMAN backups for customers, I always have the output written to a logfile.  This provides a way to review the results and keep a long-term record of your backups.  But there is also a way to get this information straight out of the database by querying the v$rman_output and v$rman_status views.  The v$rman_output view is memory-based so it will not persist thru a database restart.  The v$rman_status gets its data from the control file so it will persist thru a database restart.  The v$rman_output view provides the exact output from rman as it would appear on the screen.

In the example below, I ran 3 separate commands within a single rman session:  backup controlfile, backup datafile and backup database.  You can see the complete output of the most recent rman session with this query:

select output
from v$rman_output
where session_recid = (select max(session_recid) from v$rman_status)
order by recid ;
connected to target database: DB10201 (DBID=3004633712)
using target database control file instead of recovery catalog
Starting backup at 15-JUN-12
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=141 devtype=DISK
channel ORA_DISK_1: starting full datafile backupset
channel ORA_DISK_1: specifying datafile(s) in backupset
including current control file in backupset
channel ORA_DISK_1: starting piece 1 at 15-JUN-12
channel ORA_DISK_1: finished piece 1 at 15-JUN-12
piece handle=/tmp/0bndj6b5_1_1.ctl tag=TAG20120615T084837 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:02
Finished backup at 15-JUN-12
Starting backup at 15-JUN-12
using channel ORA_DISK_1
channel ORA_DISK_1: starting full datafile backupset
channel ORA_DISK_1: specifying datafile(s) in backupset
input datafile fno=00004 name=/oradata/DB10201/users01.dbf
channel ORA_DISK_1: starting piece 1 at 15-JUN-12
channel ORA_DISK_1: finished piece 1 at 15-JUN-12
piece handle=/tmp/0cndj6sh.dbf tag=TAG20120615T085753 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:03
Finished backup at 15-JUN-12
Starting backup at 15-JUN-12
using channel ORA_DISK_1
channel ORA_DISK_1: starting compressed full datafile backupset
channel ORA_DISK_1: specifying datafile(s) in backupset
input datafile fno=00001 name=/oradata/DB10201/system01.dbf
input datafile fno=00002 name=/oradata/DB10201/undotbs01.dbf
input datafile fno=00003 name=/oradata/DB10201/sysaux01.dbf
input datafile fno=00004 name=/oradata/DB10201/users01.dbf
channel ORA_DISK_1: starting piece 1 at 15-JUN-12
channel ORA_DISK_1: finished piece 1 at 15-JUN-12
piece handle=/tmp/c0dndj74j_1_1.dbf tag=TAG20120615T090211 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:01:15
channel ORA_DISK_1: starting compressed full datafile backupset
channel ORA_DISK_1: specifying datafile(s) in backupset
including current control file in backupset
including current SPFILE in backupset
channel ORA_DISK_1: starting piece 1 at 15-JUN-12
channel ORA_DISK_1: finished piece 1 at 15-JUN-12
piece handle=/tmp/c0endj76u_1_1.dbf tag=TAG20120615T090211 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:03
Finished backup at 15-JUN-12

54 rows selected.

Or you can review the last couple of days worth of rman output:

select output
from v$rman_output
where session_recid in (select session_recid from v$rman_status
where start_time > sysdate-2)
order by recid ;
 

And you can see some more detailed information about each of the backups in v$rman_status,
including the start/end time and the size of the backup

col object_type format a15
alter session set nls_date_format='DD-MON-YYYY hh24:mi:ss';
select object_type,mbytes_processed, start_time, end_time,status
from v$rman_status
where session_recid = (select max(session_recid) from v$rman_status)
and operation !='RMAN'
order by recid;
OBJECT_TYPE     MBYTES_PROCESSED START_TIME           END_TIME             STATUS
--------------- ---------------- -------------------- -------------------- -----------------------
CONTROLFILE                    7 15-JUN-2012 08:48:36 15-JUN-2012 08:48:42 COMPLETED
DATAFILE FULL                 24 15-JUN-2012 08:57:52 15-JUN-2012 08:57:57 COMPLETED
DB FULL                     2311 15-JUN-2012 09:02:10 15-JUN-2012 09:03:30 COMPLETED

Because the data in v$rman_output goes away with a restart, I think it is still a good idea to maintain log files but this view definitely can come in handy.  It is particularly useful to troubleshoot rman problems after-the-fact when there is no log file available.

Using the Remote Diagnostic Agent (RDA)

The Remote Diagnostic Agent (RDA) is an Oracle-provided diagnostic tool that captures data about your database environment and packages it into a web page report. When submitting a Service Request to Oracle Support, you will often be asked to run the Remote Diagnostic Agent (RDA) and upload the results.  This allows the support analyst to review a lot of information about your system with very little effort on your part.  It saves time by eliminating a lot of back and forth questions.

 

Even when not dealing with Oracle Support, I find RDA is a great tool for understanding an environment and troubleshooting issues.  In my job as a Remote DBA, I often get involved in unfamiliar environments and RDA can allow me to quickly get up to speed. It provides a wealth of information in a format that is easy to use and (mostly) easy to understand.

 

Here are some simple instructions for installing and using RDA

 

On the Oracle Support web site, search for document Id 314422.1 Remote Diagnostic Agent (RDA) 4 – Getting Started

Click on Download RDA and select the appropriate Zip file for your platform

Unzip the file to any location and make it executable (chmod +x)

 

To run RDA, just execute rda.sh from the unzipped location.  Initially. it will prompt for you for information, such as the name of the database and a username that can connect to it.  Follow the prompts and they are mostly self-explantory.

 

[oracle@mdlinux rda]$ ./rda.sh
-------------------------------------------------------------------------------
RDA Data Collection Started 06-Jun-2012 15:00:33
-------------------------------------------------------------------------------
Processing Initialization module ...
Processing OCM module ...
Processing CFG module ...
Processing PERF module ...
Processing OS module ...
Processing PROF module ...
Processing NET module ...
Processing ONET module ...
Listener checks may take a few minutes. please be patient...
Processing Oracle installation module ...
Processing RDBMS module ...
Processing HCVE tests ...
.
.
.
.
.
Processing RDBMS Memory module ...
Processing LOG module ...
Processing RDSP module ...
Processing LOAD module ...
Processing End module ...
-------------------------------------------------------------------------------
RDA Data Collection Ended 06-Jun-2012 15:02:12
 

When completed, you will be able to access your report by opening the <install location>/rda/output/RDA__start.htm in a web browser.   

 

RDA is particularly useful for understanding and troubleshooting RAC problems.  It will gather all relevant log files, init files and diagnostic files together and will document your network configuration, cluster configuration and database(s) all in one place.  It really makes quick(er) work of understanding a complex RAC cluster.  Use it as baseline documentation for your cluster – run it when everything is working as it should be and then when something breaks, you run it again and compare the results.

 

To run RDA for RAC, first configure it. 

[oracle@mdlinux-vm1 rda]$ ./rda.sh -vX RDA::Remote setup_cluster 

You will be asked several questions, including the names of any databases that you want included in the collection.  If all goes well you should see something like this indicating that it is ready to be run on the local and remote node(s). 

RAC Setup Summary
-----------------------------------------------------------------------
Nodes:
. NOD001  mdlinux-vm2/mdracdb1
. NOD002  mdlinux-vm1/mdracdb2
2 nodes found

Then run the data collection

 

 [oracle@mdlinux-vm1 rda]$ ./rda.sh -v -e REMOTE_TRACE=1
    Collecting diagnostic data ...
-------------------------------------------------------------------------------
RDA Data Collection Started 06-Jun-2012 14:24:21
-------------------------------------------------------------------------------
Processing Initialization module ...
Processing OCM module ...
Processing PERF module ...
Processing CFG module ...
Processing Sampling module ...
Processing OS module ...
Processing PROF module ...
Processing NET module ...
Processing ONET module ...
Listener checks may take a few minutes. please be patient...
  Processing listener LISTENER
  Processing listener LISTENER_SCAN1
Processing Oracle installation module ...
Processing Cluster module ...
Processing REXE module ...
NOD002>         Setting up ...
NOD001>         Setting up ...
NOD002>         Collecting diagnostic data ...
.
.
. 

This continues for a while…When completed, you will be able to access your report by opening the <install location>/rda/output/RDA__start.htm in a web browser.  There is an incredible amount of useful information contained in the report.  I encourage you to become familiar with this tool and be prepared to run it (almost) every time you open a Service Request with Oracle.

RMAN on Linux made easy with our downloadable scripts

RMAN is the obvious choice for performing hot backups and managing the flash recovery area but the plethora of options go well beyond a basic “backup database” command.  Rather than starting from scratch with scripting, adding options one at a time, and going through several cycles of needless debugging, why not use our Linux bash script and accompanying cross-platform RMAN command file complete with error handling and comments.  The RMAN command file works like a template, making it easy to activate the options you want and remove the one you don’t.  The embedded comments help to clarify your backup strategy and serve as a training tool.  If you’re not totally comfortable with RMAN yet these scripts will get you there in a jiffy.

Installing 11g Grid Control on Linux

I recently installed Oracle Grid Control 11g on my linux workstation. Below are the steps that I went through to get it working.  Prior to installing Control, I had already installed Grid Infrastructure 11.2.0.2 and Oracle Database 11.2.0.2 and created a database to be used as the Grid Control repository.

Step 1) Download the Grid Control software:
- go to technet.oracle.com, click Download and select Enterprise Manager
- Select Enterprise Manager 11g Grid Control Release 1 (11.1.0.1.0)
- select Linux x86-64 and download all 3 files:
GridControl_11.1.0.1.0_Linux_x86-64_1of3.zip
GridControl_11.1.0.1.0_Linux_x86-64_2of3.zip
GridControl_11.1.0.1.0_Linux_x86-64_3of3.zip
- unzip all 3 files into a staging location

Step 2) Download and Install Oracle Web Logic Server

- go to technet.oracle.com, click Download and select Fusion Middleware (incl. Weblogic Server)
- Under “Oracle Weblogic Server 10.3.6 and Previous Releases”, click “see all releases”
- select Oracle Weblogic Server 11gR1 (10.3.2) - Package Installer and download the file under Linux (32-bit JVM).

The filename is wls1032_linux32.bin and is a self-extracting executable so to launch the installation, just run ./wls1032_linux32.bin. This will bring up a GUI installation program. I accepted the defaults and installed into /opt/oracle/product/Middleware.  I was prompted to specify 2 passwords during the install: weblogic and nodemanager.

Step 3) Install Grid Control

From the staging location where you unzipped the files, execute ./runInstaller to bring up the Grid Control Installation GUI.  The first screen prompts you to enter your Oracle Support credentials so you can download and install the updates. If you have a valid Oracle Support contract, I suggest you take this option as it saves you the trouble of separately installing any recommended/mandatory patches.  After selecting this option, the Installer connected to Oracle Support, downloaded several patches and proceeded with the main installation.  During the installation, I ran into a few problems that required attention before it would complete:

  • compat-db-4.1.25 was not installed so I installed it from my Linux software distribution CD
  • Database Control had been configured in my repository database and had to be removed: 
    emca -deconfig dbcontrol db -repos drop -SYS_PWD <sys password> -SYSMAN_PWD <sysman password>
  • Incorrect database settings:
    session_cached_cursors >=200
    log_buffer >=10485760
    processes >=500
    UNDO tablespace >=200M

After fixing each of these issues, I proceeded with the installation and Grid Control 11g was successfully installed.  I was prompted to specify 2 passwords during the installation: sysman and the agent registration password

Step 4) Install Grid Control Agent on each node that you want to manage

 Before Grid Control can manage a database, you must first install the Grid Control agent on the database server.  The agent installation software installs with Grid Control so you can copy it from there to each database server and then install it.  The installers are platform specific, so in my case, I copied $GC_HOME/sysman/agent_download/11.1.0.1.0/linux_x64/agentDownload.linux_x64 to a staging location on each of my RAC nodes.  I then installed the agent with the following command, specifying the installation directory as well as the names of my cluster and nodes.  The “-y” at the end will bypass the software updates portion of the installation.

 ./agentDownload.linux_x64 -b /u01/app/11.2.0/agent11g -n mdlinux-cluster -c "mdlinux-vm1,mdlinux-vm2" -y

Because I had created a registration password during the installation of Grid Control, I also need to secure the agent:

emctl secure agent
emctl start agent
emctl upload agent

Step 5) Discover the nodes in Grid Control

After you configure the Grid Control agent and it uploads its data, the hosts should become visible in Grid Control and you can then add the databases.

Preserving Statistics During Export / Import

Some database migrations are still performed using export / import (either datapump or the original export / import); for both of these the optimizer statistics are preserved during the export / import but there are a few subtleties to be aware of. For example, if you run dbms_stats.gather_schema_stats shortly after an import you will typically lose most of the existing column histograms, obviously if the histograms are important their loss will cause problems.

Let us discuss (briefly) why / how this could happen. The default statistics gathering routine in Oracle (since version 10) is for the database to run a scheduled job during a maintenance window. In 10g you can see this job via

select owner, job_name, enabled, state, last_start_date from dba_scheduler_jobs where job_name = ‘GATHER_STATS_JOB’

in 11g via

select task_name, status, last_good_date from dba_autotask_task where task_name = ‘gather_stats_prog’

This job (effectively) calls dbms_stats with the default of Oracle recommended setting for the ‘method_opt’ parameter which is ‘for all columns size auto’ which according to the documentation “this setting means that DBMS_STATS decides which columns to add histogram to where it believes that they may help to produce a better plan.” Although it is not documented in great detail the ‘decides’ part of this depends upon two things

a) If Oracle detects ’skew’ in the values in a column
b) If that column is used in the WHERE clause

In other words and here is the key point, by default, Oracle will decide what columns should have histograms based on both the skew and the workload and immediately after an initial import the workload history is missing. This ‘workload history’ is actually recorded in the table sys.col_usage$, you can inspect it using the query

select O.OBJECT_NAME, C.NAME, CU.*
from SYS.COL$ C,
     SYS.COL_USAGE$ CU,
     DBA_OBJECTS O
where C.OBJ# = CU.OBJ# AND
      C.INTCOL# = CU.INTCOL# AND
      C.OBJ# = O.OBJECT_ID AND
      O.OWNER = 'YOUR_USER'
order by 1,2
OBJECT_NAME NAME EQUALITY_PREDS EQUIJOIN_PREDS NONEQUIJOIN_PREDS RANGE_PREDS LIKE_PREDS NULL_PREDS
SAMPLES DATE_PURGED 0 0 0 6 0 1127
SAMPLES INSTANCE_ID 55378 40025 0 0 0 0
SAMPLES SAMPLE_ID 447227 535631 0 26717 0 1
SAMPLES SAMPLE_TYPE 434057 267260 0 0 0 116439

Given this here are some suggestions, the examples use a single schema but obviously could be extended to multiple schemas easily enough.

1.
Prior to export save away the existing statistics using the commands
execute DBMS_STATS.CREATE_STAT_TABLE(’YOUR_USER’,'EXP_STATS’);
execute DBMS_STATS.EXPORT_SCHEMA_STATS(’YOUR_USER’,'EXP_STATS’);
You should not need to restore these statistics but it is better to have them available if required

2.
Prior to export, record the number of histograms that currently exist using

select count(*)
from dba_tab_col_statistics
where histogram <> ‘NONE’ and
owner = ‘YOUR_USER’

3.
After completion of the import, check the same histograms exist i.e. run the same query as in 2 on the imported database. If the number of histograms differs significantly or you encountered other issues with the statistics you can import the statistics explicity via
execute DBMS_STATS.IMPORT_SCHEMA_STATS(’YOUR_USER’,'EXP_STATS’);
under most conditions you will not need to do this.

4.
If the number of histograms match then (optionally) re-gather the statistics keeping the existing histograms using the ‘repeat’ option for method_opt i.e.

execute dbms_stats.gather_schema_stats(ownname=>’YOUR_USER’,ESTIMATE_PERCENT=>100,
METHOD_OPT=>’FOR ALL COLUMNS SIZE REPEAT’);

(Obviously you should adjust the estimate_percent depending upon your data volume).

5.
Confirm that you have the same histograms as before using the query in 2.

You will now have updated statistics but have preserved the previously generated histograms. Over time Oracle will fill in the col_usage$ information based on the workload which should ensure the histograms that are valuable are preserved.

Overview of Oracle Restart commands

Oracle Restart is a new feature of Oracle 11 that will automatically restart certain Oracle processes in the event of a failure.  A manual shutdown will not cause it to restart, as Oracle assumes you intend for it to be down.  This is a great feature and can really help minimize downtime in certain cases.  Imagine your production database crashes in the middle of the night.  By the time your pager wakes you up and you login, your database can already have restarted itself.  Of course, you would need to do a post-mortem to figure out why it crashed but it can save precious minutes when high availability is important. It is also responsible for starting Oracle when your server is rebooted.  This takes the place of the dbora start/stop scripts that have been around for many years. 

Oracle Restart (aka “High Availability Services” aka “has”) is controlled at a high level by the crsctl command.  If you have administered a RAC cluster, then you are already familiar with this command as it is used to manage Cluster Ready Services (crs).  Use “enable” and “disable” to have HAS start automatically when the server is rebooted.  Use “stop” and “start” to start or stop HAS manually.

$ crsctl config has
CRS-4621: Oracle High Availability Services autostart is disabled.
$ crsctl enable has
CRS-4622: Oracle High Availability Services autostart is enabled.
$ crsctl disable has
CRS-4621: Oracle High Availability Services autostart is disabled.
$ srvctl config database -d DB11202
PRCD-1027 : Failed to retrieve database DB11202
PRCR-1115 : Failed to find entities of type resource that match filters
((NAME == ora.db11202.db) && (TYPE == ora.database.type)) and contain attributes
VERSION,ORACLE_HOME,DATABASE_TYPE
Cannot communicate with crsd

In this case, HAS was not running so I receive the error.  I can use “crsctl start has” to start HAS as well as all of the services that it controls.

$ crsctl start has
$ srvctl status database -d DB11202
Database is running.

Now that HAS is running, I can use it to manage resources with srvctl: 

$ srvctl config database -d DB11202
Database unique name: DB11202
Database name: DB11202
Oracle home: /opt/oracle/product/database/11.2.0.2
Oracle user: oracle
Spfile:
Domain:
Start options: open
Stop options: immediate
Database role: PRIMARY
Management policy: AUTOMATIC
Disk Groups:
Services:
$ srvctl start database -d DB11202
$ srvctl status database -d DB11202
Database is running.
$ srvctl stop database -d DB11202
$ srvctl status database -d DB11202
Database is not running.

You can do the same with the listener and your ASM instance:

$ srvctl status listener
Listener LISTENER is enabled
Listener LISTENER is not running
$ srvctl start listener
$ srvctl status listener
Listener LISTENER is enabled
Listener LISTENER is running on node(s): mdlinux
$ srvctl status asm
ASM is running on mdlinux
$ srvctl stop asm
$ srvctl start asm