Beware of the bug 16299065 when doing RMAN duplicate – you might end up with restoring on SOURCE database

This issue has happened on one of our customers environments where RMAN duplicate is a common practice for cloning databases. Later on I have successfully reproduced the same scenario on my virtual machine.

Background

A delivery had been applied to database, including some structural changes like creation of new tablespace with multiple datafiles. During testing someone had put one of the new datafiles offline. After some time customer did a request of additional database with the same release applied and RMAN duplicate from backup approach was used for that purpose.

Database version: 11.2.0.3 with 11.2.0.3.6 PSU applied on top

The problem

It was not possible to duplicate database due to error for one of the tablespaces that couldn’t be found:

RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of Duplicate Db command at 09/25/2013 15:17:58
RMAN-05501: aborting duplication of target database
RMAN-03015: error occurred in stored script Memory Script
RMAN-03009: failure of catalog command on clone_default channel at 09/25/2013 15:17:58
ORA-19625: error identifying file /u01/oradata/*****/datafile/****_%u_.dbf
ORA-27037: unable to obtain file status
Linux-x86_64 Error: 2: No such file or directory
Additional information: 3

But the most horrifying was the fact that SOURCE database has crashed with the following:

ORA-01122: database file * failed verification check
ORA-01110: data file *: '/u01/oradata/*****/datafile/*****_943dn0db_.dbf'
ORA-01206: file is not part of this database - wrong database id

It is obvious that something had happened with DBID for problematic datafile. Let’s try to find it out by setting up a test case.

Test case

1) Create a test database (steps not shown for obvious reasons)

2) Create a tablespace with two datafiles

SQL> CREATE TABLESPACE EXAMPLE DATAFILE SIZE 10M;

Tablespace created.

SQL> ALTER TABLESPACE EXAMPLE ADD DATAFILE SIZE 10M;

Tablespace altered.

3) Make one of the EXAMPLE tablespace datafiles offline

SQL> COLUMN NAME FORMAT A60
SQL> SELECT * FROM V$DBFILE;

FILE# NAME
---------- ------------------------------------------------------------
 4 /u01/oradata/TESTDG/datafile/o1_mf_users_933cyyrl_.dbf
 3 /u01/oradata/TESTDG/datafile/o1_mf_undotbs1_933cyyr4_.dbf
 2 /u01/oradata/TESTDG/datafile/o1_mf_sysaux_933cyyqw_.dbf
 1 /u01/oradata/TESTDG/datafile/o1_mf_system_933cyypn_.dbf
 5 /u01/oradata/TESTDG/datafile/o1_mf_example_945t6b0w_.dbf
 6 /u01/oradata/TESTDG/datafile/o1_mf_example_945t6mwl_.dbf

6 rows selected.

SQL> ALTER DATABASE DATAFILE '/u01/oradata/TESTDG/datafile/o1_mf_example_945t6mwl_.dbf' OFFLINE;

Database altered.

! Notice that we have made datafile 6 offline.
4) Backup database (backup database plus archivelog)

5) You can now bring the offline’d datafile back online (optional)

SQL> RECOVER DATAFILE 6;
Media recovery complete.
SQL> ALTER DATABASE DATAFILE '/u01/oradata/TESTDG/datafile/o1_mf_example_945t6mwl_.dbf' ONLINE;

Database altered.

6) Duplicate database from taken backup (duplicate preparation steps are not shown here as well as rman output has been cut)

[oracle@ocm-rac1 oradata]$ rman auxiliary /

Recovery Manager: Release 11.2.0.3.0 - Production on Wed Sep 25 17:01:29 2013

Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved.

connected to auxiliary database: TESTDG2 (not mounted)

RMAN> DUPLICATE DATABASE TO TESTDG2 BACKUP LOCATION '/u01/app/oracle/fast_recovery_area/TESTDG/backupset/2013_09_25';

Starting Duplicate Db at 25-SEP-13

contents of Memory Script:
{
 sql clone "create spfile from memory";
}
executing Memory Script

sql statement: create spfile from memory

contents of Memory Script:
{
 shutdown clone immediate;
 startup clone nomount;
}
executing Memory Script

Oracle instance shut down

connected to auxiliary database (not started)
Oracle instance started

..........

cataloged datafile copy
datafile copy file name=/u01/oradata/TESTDG2/datafile/o1_mf_sysaux_945v7qwn_.dbf RECID=1 STAMP=827082187
cataloged datafile copy
datafile copy file name=/u01/oradata/TESTDG2/datafile/o1_mf_undotbs1_945v7qwo_.dbf RECID=2 STAMP=827082187
cataloged datafile copy
datafile copy file name=/u01/oradata/TESTDG2/datafile/o1_mf_users_945v7qwy_.dbf RECID=3 STAMP=827082187
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of Duplicate Db command at 09/25/2013 17:03:07
RMAN-05501: aborting duplication of target database
RMAN-03015: error occurred in stored script Memory Script
RMAN-03009: failure of catalog command on clone_default channel at 09/25/2013 17:03:07
ORA-19625: error identifying file /u01/oradata/TESTDG2/datafile/o1_mf_example_%u_.dbf
ORA-27037: unable to obtain file status
Linux-x86_64 Error: 2: No such file or directory
Additional information: 3

7) Check the SOURCE database. If that has not crashed, try to shut it down

SQL> shu immediate;
ORA-01122: database file 5 failed verification check
ORA-01110: data file 5: '/u01/oradata/TESTDG/datafile/o1_mf_example_945t6b0w_.dbf'
ORA-01206: file is not part of this database - wrong database id

Here it goes! RMAN Duplicate has wiped out DBID of the other datafile that originally was not OFFLINE (datafile 5). It has obviously mixed up the datafiles. Let’s query the DBID for them:

SQL> select HXFIL file#, FHDBI dbid from x$kcvfh;

     FILE#	 DBID
---------- ----------
	 1 2758795775
	 2 2758795775
	 3 2758795775
	 4 2758795775
	 5	    0
	 6 2758795775

6 rows selected.

Now what? DBID zeroed out! Restore and recover the datafile and acknowledge another bug from Oracle 😉 As per MOS, it is fixed in 11.2.0.4 and 12.2 releases, however you can apply 16299065 a one-off patch on top of 11.2.0.3/11.2.0.3.6 already now.

Advertisement

One thought on “Beware of the bug 16299065 when doing RMAN duplicate – you might end up with restoring on SOURCE database

  1. Hi Andrejs,

    Pretty interesting case !
    It makes sense datafile 6 to be offline at the AUX instance (which was kept offline during the backup). You haven’t said anything about the set until SCN used for the backup based (pull method) rman duplicate ?
    I will try to reproduce in my lab environment => (point of interest is, if some normal offline checkpoint information is kept in tablespace file #1 in your case file 5 for file 6). I have encountered such cases with TSPITR and manually managed auxiliary instance (when I forgot to online some datafiles before the recovery phase).

    BR
    Ognyan Istatkov

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s