Tuesday, October 6, 2009

An introduction to rman with examples

For beginners and non, rman is one of the most obscure Oracle tools, but in reality it is based on quite simple principles. It is true, though, that a restoration can be distructive for a database.
Why is rman dangerous?
rman is dangerous because it works at the database physical level; to understand why, let's say that rman is not "too" different from WinZip. Since everybody knows WinZip, it will probably be easier to understand rman.
Is WinZip dangerous?
The creation of a zip archive is not dangerous, unless you choose the same name for the output file, or the archive is so big that you fill a device or the temporary directory etc. The same thing can be said about rman: taking an rman backup cannot be dangerous (unless you stop the database, or close it intentionally) and you don't overwrite previous backups (you can prevent it).
rman creates a compressed backup of the physical database files, including controlfiles, datafiles, archived logs and stores them somewhere. This somewhere can be a disk (like WinZip) or a tape.
Therefore, the first important point to remember is:

rman creates compressed backups of the physical database; if you prefer, rman zips the database, entirely or not.
Is Unzip dangerous?
Unzip, on the other hand, can be very dangerous. The risk is overwriting files that should not be overwritten and, therefore, losing important content. The very same applies to rman restores: since they overwrite database files, you should know whether this is the right decision because, otherwise, you may lose your data. This is the main difference between an import and an rman restoration: an rman restoration (restore) overwrites one or more datafiles; this means WATCH OUT.
Our first rman backup
The only requirements for taking simple, useful rman backups are two:
·    You database is in archivelog mode
·    You have enough disk space somewhere
On our Windows 2000 workstation we open a CMD window and type the following few lines:
E:\>rman target /

Recovery Manager: Release 9.2.0.1.0 - Production

Copyright (c) 1995, 2002, Oracle Corporation. All rights reserved.

connected to target database: ARK9201 (DBID=3564144589)

run {
 allocate channel t1 type disk;
 backup database format 'E:\RMAN_BACKUPS\ARK9201\%d_%u_%s';
 release channel t1;
}

using target database controlfile instead of recovery catalog
allocated channel: t1
channel t1: sid=9 devtype=DISK

Starting backup at 10-DEC-04
channel t1: starting full datafile backupset
channel t1: specifying datafile(s) in backupset
including current SPFILE in backupset
including current controlfile in backupset
released channel: t1
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03009: failure of backup command on t1 channel at 12/10/2004 14:28:50
ORA-19602: cannot backup or copy active file in NOARCHIVELOG mode
As we said, the database must be in archive mode if we want to backup open databases. A separate articles explains why it must be so, because it is not so obvious.

SQL> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup mount
ORACLE instance started.

Total System Global Area 135338868 bytes
Fixed Size 453492 bytes
Variable Size 109051904 bytes
Database Buffers 25165824 bytes
Redo Buffers 667648 bytes
Database mounted.
SQL> alter database archivelog;

Database altered.

SQL> alter database open;

Database altered.
Let's now repeat the operation having put the database in archive mode Choosing C:\Temp as directory for the backup is, of course, a very bad idea. Backups are gold and shoud be kept on a dedicated device; a good name could be E:\RMAN_BACKUPS\SID
E:\>rman target /

Recovery Manager: Release 9.2.0.1.0 - Production

Copyright (c) 1995, 2002, Oracle Corporation. All rights reserved.

connected to target database: ARK9201 (DBID=3564144589)

RMAN> run {
2> allocate channel t1 type disk;
3> backup database format 'E:\RMAN_BACKUPS\ARK9201\%d_%u_%s';
4> release channel t1;
5> }

allocated channel: t1
channel t1: sid=12 devtype=DISK

Starting backup at 26-DEC-04
channel t1: starting full datafile backupset
channel t1: specifying datafile(s) in backupset
including current SPFILE in backupset
including current controlfile in backupset
input datafile fno=00001 name=E:\ORACLE\ORADATA\ARK920\SYSTEM01.DBF
input datafile fno=00002 name=E:\ORACLE\ORADATA\ARK920\UNDOTBS01.DBF
input datafile fno=00004 name=E:\ORACLE\ORADATA\ARK920\EXAMPLE01.DBF
input datafile fno=00009 name=E:\ORACLE\ORADATA\ARK920\XDB01.DBF
input datafile fno=00005 name=E:\ORACLE\ORADATA\ARK920\INDX01.DBF
input datafile fno=00008 name=E:\ORACLE\ORADATA\ARK920\USERS01.DBF
input datafile fno=00003 name=E:\ORACLE\ORADATA\ARK920\DRSYS01.DBF
input datafile fno=00006 name=E:\ORACLE\ORADATA\ARK920\ODM01.DBF
input datafile fno=00007 name=E:\ORACLE\ORADATA\ARK920\TOOLS01.DBF
channel t1: starting piece 1 at 26-DEC-04
channel t1: finished piece 1 at 26-DEC-04
piece handle=E:\RMAN_BACKUPS\ARK9201\ARK920_03G8KC6G_3 comment=NONE
channel t1: backup set complete, elapsed time: 00:25:58
Finished backup at 26-DEC-04

released channel: t1
The resulting backup file(s) are in our case
dir c:\rman_backups\ark9201
 Volume in drive C is 80-01-14A2
 Volume Serial Number is 64DA-0BF7

 Directory of c:\rman_backups\ark9201

12/26/2004 02:29p  .
12/26/2004 02:29p  ..
12/26/2004 02:44p 418,471,936 ARK920_03G8KC6G_3
Our first archive backup
Taking only the database backup is not enough, for reasons that will be clear later. We must take a backup of the archived log as well, and the command is very similar
RMAN> run {
2> allocate channel t1 type disk;
3> backup archivelog all delete input format 'C:\RMAN_BACKUPS\ARK9201\arch_%d_%u_%s';
4> release channel t1;
5> }

allocated channel: t1
channel t1: sid=12 devtype=DISK

Starting backup at 26-DEC-04
current log archived
channel t1: starting archive log backupset
channel t1: specifying archive log(s) in backup set
input archive log thread=1 sequence=77 recid=1 stamp=545945036
input archive log thread=1 sequence=78 recid=2 stamp=545945039
input archive log thread=1 sequence=79 recid=3 stamp=545945099
channel t1: starting piece 1 at 26-DEC-04
channel t1: finished piece 1 at 26-DEC-04
piece handle=C:\RMAN_BACKUPS\ARK9201\ARCH_ARK920_04G8KTGL_4 comment=NONE
channel t1: backup set complete, elapsed time: 00:00:37
channel t1: deleting archive log(s)
archive log filename=E:\ORA920\RDBMS\ARC00077.001 recid=1 stamp=545945036
archive log filename=E:\ORA920\RDBMS\ARC00078.001 recid=2 stamp=545945039
archive log filename=E:\ORA920\RDBMS\ARC00079.001 recid=3 stamp=545945099
Finished backup at 26-DEC-04

released channel: t1

RMAN>
rman on UNIX
rman works the same way on a UNIX server, using an appropriate directory format:

RMAN>  run { allocate channel t1 type disk;
2> backup database format '/u01/oracle/archive/%d_%u_%s';
3> backup current controlfile format '/u01/oracle/archive/%d_%u_%s';
4> release channel t1;}

using target database controlfile instead of recovery catalog
allocated channel: t1
channel t1: sid=26 devtype=DISK

Starting backup at 14:29:22 18-jan-2005
channel t1: starting full datafile backupset
channel t1: specifying datafile(s) in backupset
input datafile fno=00001 name=/u01/oradata/ORA920/system01.dbf
input datafile fno=00002 name=/u02/oradata/ORA920/undo01.dbf
input datafile fno=00004 name=/u01/oradata/OMF_DATA_ORA920/o1_mf_tbs_data_0ybcd28z_.dbf
input datafile fno=00003 name=/u01/oradata/OMF_DATA_ORA920/o1_mf_tools_0ybbwkp7_.dbf
input datafile fno=00005 name=/u01/oradata/ORA920/oem_repository.dbf
channel t1: starting piece 1 at 14:29:24 18-jan-2005
channel t1: finished piece 1 at 14:31:10 18-jan-2005
piece handle=/u01/oracle/archive/ORA920_01gah0q3_1 comment=NONE
channel t1: backup set complete, elapsed time: 00:01:48
Finished backup at 14:31:11 18-jan-2005

Starting backup at 14:31:11 18-jan-2005
channel t1: starting full datafile backupset
channel t1: specifying datafile(s) in backupset
including current controlfile in backupset
channel t1: starting piece 1 at 14:31:15 18-jan-2005
channel t1: finished piece 1 at 14:31:18 18-jan-2005
piece handle=/u01/oracle/archive/ORA920_02gah0th_2 comment=NONE
channel t1: backup set complete, elapsed time: 00:00:05
Finished backup at 14:31:18 18-jan-2005

Starting Control File Autobackup at 14:31:18 18-jan-2005
piece handle=/optware/oracle/9.2.0.1/dbs/c-2554142925-20050118-00 comment=NONE
Finished Control File Autobackup at 14:31:22 18-jan-2005

released channel: t1

RMAN> host ;

oracle@edc-se[on pts/4] ls -ltr /u01/oracle/archive/
total 562128
-rw-r-----   1 oracle   dba      282476544 Jan 18 14:31 ORA920_01gah0q3_1
-rw-r-----   1 oracle   dba      5177344 Jan 18 14:31 ORA920_02gah0th_2
oracle@edc-se[on pts/4] exit
host command complete
Backup the controlfile
RMAN>  run { allocate channel t1 type disk;
2> backup current controlfile format '/u01/oracle/archive/controlf_%d_%u_%s';
3>  release channel t1;}

allocated channel: t1
channel t1: sid=26 devtype=DISK

Starting backup at 14:33:59 18-jan-2005
channel t1: starting full datafile backupset
channel t1: specifying datafile(s) in backupset
including current controlfile in backupset
channel t1: starting piece 1 at 14:34:00 18-jan-2005
channel t1: finished piece 1 at 14:34:03 18-jan-2005
piece handle=/u01/oracle/archive/controlf_ORA920_04gah12n_4 comment=NONE
channel t1: backup set complete, elapsed time: 00:00:04
Finished backup at 14:34:03 18-jan-2005

Starting Control File Autobackup at 14:34:03 18-jan-2005
piece handle=/optware/oracle/9.2.0.1/dbs/c-2554142925-20050118-01 comment=NONE
Finished Control File Autobackup at 14:34:07 18-jan-2005

released channel: t1
Here is a backup of archived log files

RMAN>  run { allocate channel t1 type disk;
2> backup archivelog all format '/u01/oracle/archive/archive_%d_%u_%s' delete input;
3> release channel t1;}

allocated channel: t1
channel t1: sid=26 devtype=DISK

Starting backup at 14:42:00 18-jan-2005
current log archived
channel t1: starting archive log backupset
channel t1: specifying archive log(s) in backup set
input archive log thread=1 sequence=9 recid=1 stamp=547477014
input archive log thread=1 sequence=10 recid=2 stamp=547562627
input archive log thread=1 sequence=11 recid=3 stamp=547562680
input archive log thread=1 sequence=12 recid=4 stamp=547580133
input archive log thread=1 sequence=13 recid=5 stamp=547623518
input archive log thread=1 sequence=14 recid=6 stamp=547744760
input archive log thread=1 sequence=15 recid=7 stamp=547865707
input archive log thread=1 sequence=16 recid=8 stamp=547915324
channel t1: starting piece 1 at 14:42:08 18-jan-2005
channel t1: finished piece 1 at 14:43:03 18-jan-2005
piece handle=/u01/oracle/archive/archive_ORA920_06gah1hv_6 comment=NONE
channel t1: backup set complete, elapsed time: 00:00:56
channel t1: deleting archive log(s)
archive log filename=/optware/oracle/admin/ORA920/arch/ORA920_1_9.arc recid=1 stamp=547477014
archive log filename=/optware/oracle/admin/ORA920/arch/ORA920_1_10.arc recid=2 stamp=547562627
archive log filename=/optware/oracle/admin/ORA920/arch/ORA920_1_11.arc recid=3 stamp=547562680
archive log filename=/optware/oracle/admin/ORA920/arch/ORA920_1_12.arc recid=4 stamp=547580133
archive log filename=/optware/oracle/admin/ORA920/arch/ORA920_1_13.arc recid=5 stamp=547623518
archive log filename=/optware/oracle/admin/ORA920/arch/ORA920_1_14.arc recid=6 stamp=547744760
archive log filename=/optware/oracle/admin/ORA920/arch/ORA920_1_15.arc recid=7 stamp=547865707
archive log filename=/optware/oracle/admin/ORA920/arch/ORA920_1_16.arc recid=8 stamp=547915324
Finished backup at 14:43:04 18-jan-2005

Starting Control File Autobackup at 14:43:04 18-jan-2005
piece handle=/optware/oracle/9.2.0.1/dbs/c-2554142925-20050118-02 comment=NONE
Finished Control File Autobackup at 14:43:09 18-jan-2005

released channel: t1
The rman clauses from logseq and until logseq
Sometimes you will want to backup only certain archived logs and not all. This is the case, for example, when you ran out of space and you were forced to move the archived logs on disk onto another directory. In this case the clause "archivelog all" would give an error, because rman will not be able to find the files.

run {
     allocate channel dev1 type 'sbt_tape';
     backup archivelog
          from logseq 12 until logseq 15 thread 1
          delete input;
}
Keeping an eye on the list above, the command would back up the archived redo logs from ORA920_1_12.arc to ORA920_1_15.arc Remember that the name depends on the parameter ... in your init.ora In our case the values is ...
RMAN> show
2> all;

RMAN configuration parameters are:
CONFIGURE RETENTION POLICY TO REDUNDANCY 1; # default
CONFIGURE BACKUP OPTIMIZATION OFF; # default
CONFIGURE DEFAULT DEVICE TYPE TO DISK; # default
CONFIGURE CONTROLFILE AUTOBACKUP ON;
CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE DISK TO '%F'; # default
CONFIGURE DEVICE TYPE DISK PARALLELISM 1; # default
CONFIGURE DATAFILE BACKUP COPIES FOR DEVICE TYPE DISK TO 1; # default
CONFIGURE ARCHIVELOG BACKUP COPIES FOR DEVICE TYPE DISK TO 1; # default
CONFIGURE MAXSETSIZE TO UNLIMITED; # default
CONFIGURE SNAPSHOT CONTROLFILE NAME TO '/optware/oracle/9.2.0.1/dbs/snapcf_ORA920.f'; # default
RMAN> list backupset ;
List of Backup Sets
===================

BS Key  Type LV Size       Device Type Elapsed Time Completion Time
------- ---- -- ---------- ----------- ------------ --------------------
1       Full    269M       DISK        00:01:41     14:31:04 18-jan-2005
        BP Key: 1   Status: AVAILABLE   Tag: TAG20050118T142923
        Piece Name: /u01/oracle/archive/ORA920_01gah0q3_1
  List of Datafiles in backup set 1
  File LV Type Ckp SCN    Ckp Time             Name
  ---- -- ---- ---------- -------------------- ----
  1       Full 562946     14:29:24 18-jan-2005 /u01/oradata/ORA920/system01.dbf
  2       Full 562946     14:29:24 18-jan-2005 /u02/oradata/ORA920/undo01.dbf
  3       Full 562946     14:29:24 18-jan-2005 /u01/oradata/OMF_DATA_ORA920/o1_mf_tools_0ybbwkp7_.dbf
  4       Full 562946     14:29:24 18-jan-2005 /u01/oradata/OMF_DATA_ORA920/o1_mf_tbs_data_0ybcd28z_.dbf
  5       Full 562946     14:29:24 18-jan-2005 /u01/oradata/ORA920/oem_repository.dbf

BS Key  Type LV Size       Device Type Elapsed Time Completion Time
------- ---- -- ---------- ----------- ------------ --------------------
2       Full    4M         DISK        00:00:03     14:31:16 18-jan-2005
        BP Key: 2   Status: AVAILABLE   Tag: TAG20050118T143112
        Piece Name: /u01/oracle/archive/ORA920_02gah0th_2
  Controlfile Included: Ckp SCN: 563054       Ckp time: 14:31:13 18-jan-2005

BS Key  Type LV Size       Device Type Elapsed Time Completion Time
------- ---- -- ---------- ----------- ------------ --------------------
3       Full    4M         DISK        00:00:02     14:31:20 18-jan-2005
        BP Key: 3   Status: AVAILABLE   Tag:
        Piece Name: /optware/oracle/9.2.0.1/dbs/c-2554142925-20050118-00
  Controlfile Included: Ckp SCN: 563071       Ckp time: 14:31:18 18-jan-2005

BS Key  Type LV Size       Device Type Elapsed Time Completion Time
------- ---- -- ---------- ----------- ------------ --------------------
4       Full    4M         DISK        00:00:03     14:34:02 18-jan-2005
        BP Key: 4   Status: AVAILABLE   Tag: TAG20050118T143359
        Piece Name: /u01/oracle/archive/controlf_ORA920_04gah12n_4
  Controlfile Included: Ckp SCN: 563233       Ckp time: 14:33:59 18-jan-2005

BS Key  Type LV Size       Device Type Elapsed Time Completion Time
------- ---- -- ---------- ----------- ------------ --------------------
5       Full    4M         DISK        00:00:02     14:34:06 18-jan-2005
        BP Key: 5   Status: AVAILABLE   Tag:
        Piece Name: /optware/oracle/9.2.0.1/dbs/c-2554142925-20050118-01
  Controlfile Included: Ckp SCN: 563233       Ckp time: 14:33:59 18-jan-2005

BS Key  Size       Device Type Elapsed Time Completion Time
------- ---------- ----------- ------------ --------------------
6       203M       DISK        00:00:56     14:43:03 18-jan-2005
        BP Key: 6   Status: AVAILABLE   Tag: TAG20050118T144206
        Piece Name: /u01/oracle/archive/archive_ORA920_06gah1hv_6

  List of Archived Logs in backup set 6
  Thrd Seq     Low SCN    Low Time             Next SCN   Next Time
  ---- ------- ---------- -------------------- ---------- ---------
  1    9       118838     12:18:25 13-jan-2005 124198     12:56:51 13-jan-2005
  1    10      124198     12:56:51 13-jan-2005 208951     12:43:38 14-jan-2005
  1    11      208951     12:43:38 14-jan-2005 209145     12:44:30 14-jan-2005
  1    12      209145     12:44:30 14-jan-2005 227914     17:35:21 14-jan-2005
  1    13      227914     17:35:21 14-jan-2005 271564     05:38:29 15-jan-2005
  1    14      271564     05:38:29 15-jan-2005 392830     15:19:13 16-jan-2005
  1    15      392830     15:19:13 16-jan-2005 514138     00:55:00 18-jan-2005
  1    16      514138     00:55:00 18-jan-2005 563727     14:42:00 18-jan-2005

BS Key  Type LV Size       Device Type Elapsed Time Completion Time
------- ---- -- ---------- ----------- ------------ --------------------
7       Full    4M         DISK        00:00:02     14:43:07 18-jan-2005
        BP Key: 7   Status: AVAILABLE   Tag:
        Piece Name: /optware/oracle/9.2.0.1/dbs/c-2554142925-20050118-02
  Controlfile Included: Ckp SCN: 563788       Ckp time: 14:43:03 18-jan-2005


  RMAN>
Are we sure that our backups do not contain corrupted blocks
There is, of course, little point in taking backups if we don't know whether they are physically usable, i.e. all their blocks are clean. rman offer the command validate
RMAN> validate backupset 6;

allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=26 devtype=DISK
channel ORA_DISK_1: starting validation of archive log backupset
channel ORA_DISK_1: restored backup piece 1
piece handle=/u01/oracle/archive/archive_ORA920_06gah1hv_6 tag=TAG20050118T144206 params=NULL
channel ORA_DISK_1: validation complete

RMAN> validate backupset 5;

using channel ORA_DISK_1
channel ORA_DISK_1: starting validation of datafile backupset
channel ORA_DISK_1: restored backup piece 1
piece handle=/optware/oracle/9.2.0.1/dbs/c-2554142925-20050118-01 tag=null params=NULL
channel ORA_DISK_1: validation complete

RMAN>  validate backupset 4;

using channel ORA_DISK_1
channel ORA_DISK_1: starting validation of datafile backupset
channel ORA_DISK_1: restored backup piece 1
piece handle=/u01/oracle/archive/controlf_ORA920_04gah12n_4 tag=TAG20050118T143359 params=NULL
channel ORA_DISK_1: validation complete

Clause not backed up since time
There are cases when backups fail and the stand becomes unclear. To avoid any risk the clause not backed up since time can be used. This commands makes rman save only the datafiles (or the archived logs) that haven't be backed up for a certain span of time and skip the rest.
RMAN> run { allocate channel t1 type disk;
2>  backup database format '/u01/oracle/archive/%d_%u_%s' not backed up since time 'sysdate -5';
3> release channel t1;}

allocated channel: t1
channel t1: sid=25 devtype=DISK

Starting backup at 15:41:31 21-jan-2005
skipping datafile 1; already backed up on 20:38:51 18-jan-2005
skipping datafile 2; already backed up on 20:38:51 18-jan-2005
skipping datafile 3; already backed up on 20:38:51 18-jan-2005
skipping datafile 4; already backed up on 20:38:51 18-jan-2005
skipping datafile 5; already backed up on 20:38:51 18-jan-2005
Finished backup at 15:41:33 21-jan-2005

released channel: t1
An examples on how to restore a data file can be found on the first article about restoration

The basics of an rman restore


Care is needed for rman restores because they will normally overwrite existing datafiles; before taking any action it is imperative to understand why a restore is needed and how to proceed.
In our opinion it is better to explain basic rman concepts with understandable examples.
Example 1: a restoration is needed because a datafile is lost.
1. In order to restore a database or a data file some kind of backup must have been taken; in our exaple, let's suppose that our database consists of three datafiles and that a backup is taken on Mar 10th 2005 in the afternoon.  ">

The stand of the datafiles can differ because in general it is not guaranteed that the very last changes in the database have actually been written to disk. This is why the timestamps 15:09 and 15:15 have been used in the picture.

2. During the week archive log backups are taken daily, for example to tape using the command:

run {
     allocate channel t1 type 'sbt_tape';
     backup archivelog all delete input;
     release channel t1;
    }


3. A week later a disk crashes and the database file number 3 is lost:



In this example, it must be understood that only the DBFile3 has to be restored, without overwriting the datafiles that have not been affected by the crash, or a week’s work would be lost.
The tablespace containing the damaged file should be put offline with the SQLPLUS command:
  SQLPLUS / AS SYSDBA
  ALTER TABLESPACE this_tbs OFFLINE IMMEDIATE;

  rman target / catalog rman@MYCATALOG
  rman> run {
    allocate channel t1 type 'sbt_tape';
    restore tablespace this_tbs;
    release channel t1;
  }
If we are sure that only the datafile 3 has been damaged and not the rest of the tablespace this_tbs, we could use instead:
rman> run {
    allocate channel t1 type 'sbt_tape';
    restore datafile 3;
    release channel t1;
  }


It is now obvious that we have a mish-mash of contents, with tablespace up-to-date and a tablespace my_tbs with contents as of one week before. In this situation, Oracle will refuse to bring the tablespace online, probably with the error message 'datafile 'DBFIle3' needs to be recovered'. How is it possible to bring the tablespace my_tbs at the same stand as for the other tablespaces? It is possible because we have the archive logs backups, that contain all the changes to the database made between Mar 10th and Mar 17th.
The command needed is:

run {
     allocate channel t1 type 'sbt_tape';
     recover database;
     release channel t1;
    }
During the execution of this command, rman will restore the archived redo logs and apply them to the damaged tablespace. At the end of the process the tablespace can be brought online and the database is back to normal.          



Identifying Host Names and IP Addresses
This article presents a mixed bag of Oracle functionality relating to the identification of host names and IP addresses for Oracle clients and servers.
·    UTL_INADDR
·    SYS_CONTEXT
·    V$INSTANCE
·    V$SESSION
UTL_INADDR
The UTL_INADDR package was introduced in Oracle 8.1.6 to provide a means of retrieving host names and IP addresses of remote hosts from PL/SQL.

The GET_HOST_ADDRESS function returns the IP address of the specified host name.
SQL> SELECT UTL_INADDR.get_host_address('bart') FROM dual;

UTL_INADDR.GET_HOST_ADDRESS('BART')
--------------------------------------------------------------------------------
192.168.2.4

SQL>
The IP address of the database server is returned if the specified host name is NULL or is omitted.
SQL> SELECT UTL_INADDR.get_host_address from dual;

GET_HOST_ADDRESS
--------------------------------------------------------------------------------
192.168.2.5

SQL>
An error is returned if the specified host name is not recognized.
SQL> SELECT UTL_INADDR.get_host_address('banana') from dual;
SELECT UTL_INADDR.get_host_address('banana') from dual
       *
ERROR at line 1:
ORA-29257: host banana unknown
ORA-06512: at "SYS.UTL_INADDR", line 19
ORA-06512: at "SYS.UTL_INADDR", line 40
ORA-06512: at line 1


SQL>
The GET_HOST_NAME function returns the host name of the specified IP address.
SQL> SELECT UTL_INADDR.get_host_name('192.168.2.4') FROM dual;

UTL_INADDR.GET_HOST_NAME('192.168.2.4')
--------------------------------------------------------------------------------
bart

SQL>
The host name of the database server is returned if the specified IP address is NULL or omitted.
SQL> SELECT UTL_INADDR.get_host_name FROM dual;

GET_HOST_NAME
--------------------------------------------------------------------------------
C4210gR2

1 row selected.

SQL>
An error is returned if the specified IP address is not recognized.
SQL> SELECT UTL_INADDR.get_host_name('1.1.1.1') FROM dual;
SELECT UTL_INADDR.get_host_name('1.1.1.1') FROM dual
       *
ERROR at line 1:
ORA-29257: host 1.1.1.1 unknown
ORA-06512: at "SYS.UTL_INADDR", line 4
ORA-06512: at "SYS.UTL_INADDR", line 35
ORA-06512: at line 1


SQL>
SYS_CONTEXT
The SYS_CONTEXT function is able to return the following host and IP address information for the current session:
·    TERMINAL - An operating system identifier for the current session. This is often the client machine name.
·    HOST - The host name of the client machine.
·    IP_ADDRESS - The IP address of the client machine.
·    SERVER_HOST - The host name of the server running the database instance.
The following examples show the typical output for each variant.
SQL> SELECT SYS_CONTEXT('USERENV','TERMINAL') FROM dual;

SYS_CONTEXT('USERENV','TERMINAL')
--------------------------------------------------------------------
marge

1 row selected.

SQL> SELECT SYS_CONTEXT('USERENV','HOST') FROM dual;

SYS_CONTEXT('USERENV','HOST')
--------------------------------------------------------------------
marge

1 row selected.

SQL> SELECT SYS_CONTEXT('USERENV','IP_ADDRESS') FROM dual;

SYS_CONTEXT('USERENV','IP_ADDRESS')
--------------------------------------------------------------------
192.168.2.3

1 row selected.

SQL> SELECT SYS_CONTEXT('USERENV','SERVER_HOST') FROM dual;

SYS_CONTEXT('USERENV','SERVER_HOST')
--------------------------------------------------------------------
C4210gr2

1 row selected.

SQL>
V$INSTANCE
The HOST_NAME column of the V$INSTANCE view contains the host name of the server running the instance.
SQL> SELECT host_name FROM v$instance;

HOST_NAME
------------------------------------------------
C4210gR2

1 row selected.

SQL>
V$SESSION
The V$SESSION view contains the following host information for all database sessions:
·    TERMINAL - The operating system terminal name for the client. This is often set to the client machine name.
·    MACHINE - The operating system name for the client machine. This may include the domain name if present.
The following examples show the typical output for each column.
SQL> SELECT terminal, machine FROM v$session WHERE username = 'TIM_HALL';

TERMINAL                       MACHINE
------------------------------ ----------------------------------------------------
MARGE                          ORACLE-BASE\MARGE

1 row selected.


Oracle Flashback

Wrong DML: what to do when data is deleted by mistake.
In Oracle the rolling back a transaction is possible before the final commit or up to a savepoint, but the database offers another chance, which is probably little known, and it is the Flashback.

The essential requirement is that the database is using UNDO tablespaces and that the undo_retention is high enough. Generally speaking, rolling back the changes is only possible if the correction is made not later than the number of seconds specified by the parameter undo_retention.
SQL> show parameter undo

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
undo_management                      string      AUTO
undo_retention                       integer     1800
undo_suppress_errors                 boolean     FALSE
undo_tablespace                      string      UNDOTBS1
In this example a table is created and populated

SQL> CREATE TABLE t_a (c1 number);

SQL> INSERT INTO t_a values (1);

SQL> INSERT INTO t_a values (2);

SQL> INSERT INTO t_a values (3);

1 row created.
SQL> INSERT INTO t_a values (4);
SQL> INSERT INTO t_a values (5);

1 row created.

SQL> COMMIT;

Commit complete.

SQL> ALTER SESSION SET NLS_DATE_FORMAT ='hh24:mi:ss dd-mon-yyyy';

Session altered.

SQL> SELECT SYSDATE FROM DUAL;

SYSDATE
--------------------
12:53:43 31-dec-2004


SQL> DELETE t_a;

5 rows deleted

SQL> COMMIT
Commit completed


SQL> SELECT COUNT(*) FROM t_a;

  COUNT(*)
----------
         0

The deletion was a blunder and the problem is realized 20 minutes later; since the undo_retention is 1800 seconds (30 minutes) it is probably still possible to rescue the data using the close "AS OF TIMESTAMP" and specifying a time interval in the past.

SQL> SELECT SYSDATE FROM DUAL;

SYSDATE
--------------------
13:11:20 31-dec-2004

SQL> SELECT * FROM t_a
  2  AS OF TIMESTAMP (sysdate - interval '30' minute);

        C1
----------
         1
         2
         3
         4 
         5

By specifying SELECT AS OF TIMESTAMP we are still able to access the records that have been deleted, even if the table is now empty.

SQL> SELECT COUNT(*) FROM t_a;

  COUNT(*)
----------
         0

SQL> CREATE TABLE t_b AS (SELECT * FROM t_a AS OF TIMESTAMP (SYSDATE - INTERVAL '30' MINUTE);

Table created.


SQL> SELECT COUNT(*) FROM t_b;

  COUNT(*)
----------
         5

SQL> SELECT * FROM t_b;

        C1
----------
         1
         2
         3
         4
         5

It was therefore possible to restore the contents as they were 20 minutes before. If the table is dropped, unfortunately no rollback is possible.
SQL>
SQL> DROP TABLE t_a;

Table dropped.

SQL>  CREATE TABLE t_c AS (SELECT * FROM t_a AS OF TIMESTAMP (SYSDATE - INTERVAL '40' MINUTE));
 CREATE TABLE t_c AS (SELECT * FROM t_a AS OF TIMESTAMP (SYSDATE - INTERVAL '40' MINUTE))
                                    *
ERROR at line 1:
ORA-00942: table or view does not exist
The Oracle Flashback is therefore one of the ways of restoring deleted or wrongly updated records; another tool is the Oracle LogMiner, which offers a graphical interface.


Understanding checkpoints, system change number SCN and recovery


How does Oracle know that a database needs recovering? What checks does it carry out when a database is opened? This article explains a couple of common cases of Oracle recovery and the role of the system change number SCN in the recovery.
What is the SCN system change number?

Every modification in the database (for example, an update) is given an unique system change number SCN, which is, therefore, an ever increasing integer. The latest SCN is immediately recorded in the controlfile.

The modifications are logged in the online redo log by the log writer LGWR, but not necessarily in the datafiles. Infact, there are times when the modified data is still in memory in a block that has become "dirty", meaning that it no longer contains the same data as the block in the datafile.

There are moments, though, when all modifications are written onto all the datafiles: this happens by a checkpoint, shutdown of the database, when a redo log is switched.
At that point the header of the datafiles is updated with the SCN of the system, which will be the same as recorded in the controlfile. The Oracle process CKPT (the checkpoint process) is responsible for this update. When the checkpoint is completed the database reaches a "consistent" state, meaning that it is clean, there is no contraddiction between the SCN and timestamps of the various components.
The V$DATAFILE_HEADER dynamic view
For the purpose of our discussion, it is important to familiarise ourselves with the dynamic view V$DATAFILE_HEADER, which contains the information that Oracle writes in the datafile headers.

SQL> descr V$DATAFILE_HEADER
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 FILE#                                              NUMBER
 STATUS                                             VARCHAR2(7)
 ERROR                                              VARCHAR2(18)
 FORMAT                                             NUMBER
 RECOVER                                            VARCHAR2(3)
 FUZZY                                              VARCHAR2(3)
 CREATION_CHANGE#                                   NUMBER
 CREATION_TIME                                      DATE
 TABLESPACE_NAME                                    VARCHAR2(30)
 TS#                                                NUMBER
 RFILE#                                             NUMBER
 RESETLOGS_CHANGE#                                  NUMBER
 RESETLOGS_TIME                                     DATE
 CHECKPOINT_CHANGE#                                 NUMBER
 CHECKPOINT_TIME                                    DATE
 CHECKPOINT_COUNT                                   NUMBER
 BYTES                                              NUMBER
 BLOCKS                                             NUMBER
 NAME                                               VARCHAR2(513)
SQL> column CHECKPOINT_CHANGE#  format 99999999999999
SQL> select FILE#, CHECKPOINT_TIME, CHECKPOINT_CHANGE# from V$DATAFILE_HEADER;

     FILE# CHECKPOINT_TIME      CHECKPOINT_CHANGE#
---------- -------------------- ------------------
         1 13:47:58 02-AUG-2004            6488359
         2 13:47:58 02-AUG-2004            6488359
         3 13:47:58 02-AUG-2004            6488359
         4 13:47:58 02-AUG-2004            6488359
         5 13:47:58 02-AUG-2004            6488359
         6 13:47:58 02-AUG-2004            6488359
         8 13:47:58 02-AUG-2004            6488359
         9 13:47:58 02-AUG-2004            6488359
8 rows selected.
The view v$database contains information that comes from the controlfile.
SQL> descr v$database
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 DBID                                               NUMBER
 NAME                                               VARCHAR2(9)
 CREATED                                            DATE
 RESETLOGS_CHANGE#                                  NUMBER
 RESETLOGS_TIME                                     DATE
 PRIOR_RESETLOGS_CHANGE#                            NUMBER
 PRIOR_RESETLOGS_TIME                               DATE
 LOG_MODE                                           VARCHAR2(12)
 CHECKPOINT_CHANGE#                                 NUMBER
 ARCHIVE_CHANGE#                                    NUMBER
 CONTROLFILE_TYPE                                   VARCHAR2(7)
 CONTROLFILE_CREATED                                DATE
 CONTROLFILE_SEQUENCE#                              NUMBER
 CONTROLFILE_CHANGE#                                NUMBER
 CONTROLFILE_TIME                                   DATE
 OPEN_RESETLOGS                                     VARCHAR2(11)
 VERSION_TIME                                       DATE
 OPEN_MODE                                          VARCHAR2(10)
 PROTECTION_MODE                                    VARCHAR2(20)
 PROTECTION_LEVEL                                   VARCHAR2(20)
 REMOTE_ARCHIVE                                     VARCHAR2(8)
 ACTIVATION#                                        NUMBER
 DATABASE_ROLE                                      VARCHAR2(16)
 ARCHIVELOG_CHANGE#                                 NUMBER
 SWITCHOVER_STATUS                                  VARCHAR2(18)
 DATAGUARD_BROKER                                   VARCHAR2(8)
 GUARD_STATUS                                       VARCHAR2(7)
 SUPPLEMENTAL_LOG_DATA_MIN                          VARCHAR2(3)
 SUPPLEMENTAL_LOG_DATA_PK                           VARCHAR2(3)
 SUPPLEMENTAL_LOG_DATA_UI                           VARCHAR2(3)
 FORCE_LOGGING                                      VARCHAR2(3)

SQL> select CONTROLFILE_CHANGE# from v$database;

CONTROLFILE_CHANGE#
-------------------
            6488361

SQL> select CHECKPOINT_CHANGE# from v$database;

CHECKPOINT_CHANGE#
------------------
           6488359
We notice that the CHECKPOINT_CHANGE# in the datafile headers is identical to the CHECKPOINT_CHANGE# in the controlfile(s) because when the queries were executed, the database had just been opened..


1 case: datafile that needs recovery
To understand how things work, let's make a couple of experiments. Warning: these are operation that should be avoided in real life. Their sole purpose here is to explain the concept, they are not recommended procedures.

Being on UNIX, it is possible to make copies of datafiles and controlfiles when the database is open. Since we don't want to break anything, our database is in archive mode, we have already taken a good rman backup and we are therefore ready to restore if something goes wrong.
These are the steps of our exercise to simulate the restore and recovery of a datafile:
1.    A copy of one of the datafiles is taken.
2.    Normal activity continues.
3.    The database is closed.
4.    The copy taken at step 1 is put back onto its original place; this file is therefore "older" than the others (even if the timestamp might be more recent)
5.    The database is open
Let's have a look at our test database:
  1* select file#, name from v$datafile

     FILE# NAME
---------- --------------------------------------------------------------------------------
         1 /u05/oradata/DEVDB/system01DEVDB.dbf
         2 /u04/oradata/DEVDB/rbs01DEVDB.dbf
         3 /u05/oradata/DEVDB/temp01DEVDB.dbf
         4 /u04/oradata/DEVDB/tools01DEVDB.dbf
         5 /u05/oradata/DEVDB/users01DEVDB.dbf
         6 /u05/oradata/DEVDB/data01DEVDB.dbf
         8 /u04/oradata/DEVDB/appidx01DEVDB.dbf
         9 /u05/oradata/DEVDB/appdata01DEVDB.dbf

8 rows selected.
We create a table t_test before copying the datafile and a table t_test2 after copying it:
SQL> create table t_test(c1 number) tablespace tools;

Table created.

SQL> select FILE#, CHECKPOINT_TIME, CHECKPOINT_CHANGE# from V$DATAFILE_HEADER;

     FILE# CHECKPOINT_TIME      CHECKPOINT_CHANGE#
---------- -------------------- ------------------
         1 13:47:58 02-AUG-2004            6488359
         2 13:47:58 02-AUG-2004            6488359
         3 13:47:58 02-AUG-2004            6488359
         4 13:47:58 02-AUG-2004            6488359
         5 13:47:58 02-AUG-2004            6488359
         6 13:47:58 02-AUG-2004            6488359
         8 13:47:58 02-AUG-2004            6488359
         9 13:47:58 02-AUG-2004            6488359

8 rows selected.

SQL> insert into t_test values(1);

1 row created.

SQL> /

1 row created.

SQL> commit;

Commit complete.

SQL> select FILE#, CHECKPOINT_TIME, CHECKPOINT_CHANGE# from V$DATAFILE_HEADER;

     FILE# CHECKPOINT_TIME      CHECKPOINT_CHANGE#
---------- -------------------- ------------------
         1 13:47:58 02-AUG-2004            6488359
         2 13:47:58 02-AUG-2004            6488359
         3 13:47:58 02-AUG-2004            6488359
         4 13:47:58 02-AUG-2004            6488359
         5 13:47:58 02-AUG-2004            6488359
         6 13:47:58 02-AUG-2004            6488359
         8 13:47:58 02-AUG-2004            6488359
         9 13:47:58 02-AUG-2004            6488359

8 rows selected.
The SCN hasn't changed yet. We now make the copy
SQL> !cp /u04/oradata/DEVDB/tools01DEVDB.dbf /tmp

SQL> alter system switch logfile;

System altered.

SQL> select FILE#, CHECKPOINT_TIME, CHECKPOINT_CHANGE# from V$DATAFILE_HEADER;

     FILE# CHECKPOINT_TIME      CHECKPOINT_CHANGE#
---------- -------------------- ------------------
         1 13:57:20 02-AUG-2004            6488566
         2 13:57:20 02-AUG-2004            6488566
         3 13:57:20 02-AUG-2004            6488566
         4 13:57:20 02-AUG-2004            6488566
         5 13:57:20 02-AUG-2004            6488566
         6 13:57:20 02-AUG-2004            6488566
         8 13:57:20 02-AUG-2004            6488566
         9 13:57:20 02-AUG-2004            6488566

8 rows selected.

SQL> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> exit
At this point as a safety measure we take a copy /u04/oradata/DEVDB/tools01DEVDB.dbf.save of the datafile 4 as it is now and copy back from /tmp:
oracle-localora@# cp /u04/oradata/DEVDB/tools01DEVDB.dbf /u04/oradata/DEVDB/tools01DEVDB.dbf.save

oracle-localora@# cp /tmp/tools01DEVDB.dbf /u04/oradata/DEVDB/tools01DEVDB.dbf

cp: overwrite /u04/oradata/DEVDB/tools01DEVDB.dbf (yes/no)? y
The database can now be opened: let's see what happens:
SQL> startup
ORACLE instance started.

Total System Global Area  189409824 bytes
Fixed Size                   731680 bytes
Variable Size             104857600 bytes
Database Buffers           81920000 bytes
Redo Buffers                1900544 bytes
Database mounted.
ORA-01113: file 4 needs media recovery
ORA-01110: data file 4: '/u04/oradata/DEVDB/tools01DEVDB.dbf'
So, why does Oracle (rightly) complain about file 4? Because the SCN in its header is different from the others:
SQL> select FILE#, CHECKPOINT_TIME, CHECKPOINT_CHANGE# from V$DATAFILE_HEADER;

     FILE# CHECKPOINT_TIME      CHECKPOINT_CHANGE#
---------- -------------------- ------------------
         1 13:57:57 02-AUG-2004            6488582
         2 13:57:57 02-AUG-2004            6488582
         3 13:57:57 02-AUG-2004            6488582
         4 13:47:58 02-AUG-2004            6488359
         5 13:57:57 02-AUG-2004            6488582
         6 13:57:57 02-AUG-2004            6488582
         8 13:57:57 02-AUG-2004            6488582
         9 13:57:57 02-AUG-2004            6488582
The datafile 4 has a "delay", since its SCN number is smaller than the one in the other datafiles and controlfile.
We can also find out which archived (or online) redo logs will be necessary for the recovery: they are the ones (or one) whose FIRST_CHANGE# is less or equal 6488359 AND next_change is greater
SQL> select NAME, FIRST_CHANGE# from v$archived_log where FIRST_CHANGE# >= 6488359;

NAME                                        FIRST_CHANGE#
--------------------------------------------------------------------------------
/opt/oracle/admin/DEVDB/arch/1_60.dbf   6488563

SQL> RECOVER DATABASE;

ORA-00279: change 6488359 generated at 08/02/2004 13:47:58 needed for thread 1
ORA-00289: suggestion : /opt/oracle/admin/DEVDB/arch/1_60.dbf
ORA-00280: change 6488359 for thread 1 is in sequence #60


Specify log: {=suggested | filename | AUTO | CANCEL}  [we press return]
Log applied.
Media recovery complete. 

SQL>ALTER DATABASE OPEN

The database is back to normal.
2. When is command "recover database using backup controlfile" required?
Let's take a copy of the controlfile before any DDL activity, such as creating a new table and populating it. This will simulate the restore of a controlfile:
SQL> !cp /u01/oradata/DEVDB/ctl1DEVDB.dbf /tmp

SQL> create table t_test2(c1 number) tablespace tools;
Table created.

SQL> insert into t_test2   values(10);
1 row created.

SQL> insert into t_test2   values(11);
1 row created.

SQL> insert into t_test2   values(12);
1 row created.

SQL> commit;

Commit complete.
We now shutdown the database; this will cause a checkpoint to happen, meaning that all datafiles headers will be updated with the SCN contained in the controlfile.
SQL> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> exit
Before any dangerous experiment, the wise DBA will always takes a copy of the controlfile(s)
oracle-localora@# cp /u01/oradata/DEVDB/ctl1DEVDB.dbf /u01/oradata/DEVDB/ctl1DEVDB.dbf.save
oracle-localora@# cp /u02/oradata/DEVDB/ctl2DEVDB.dbf /u02/oradata/DEVDB/ctl2DEVDB.dbf.save
oracle-localora@# cp /u03/oradata/DEVDB/ctl3DEVDB.dbf /u03/oradata/DEVDB/ctl3DEVDB.dbf.save
Let's now overwrite the controlfiles with the copy we created on the /tmp directory:
oracle-localora@# cp /tmp/ctl1DEVDB.dbf /u01/oradata/DEVDB/ctl1DEVDB.dbf
cp: overwrite /u01/oradata/DEVDB/ctl1DEVDB.dbf (yes/no)? y
oracle-localora@# cp /tmp/ctl1DEVDB.dbf /u02/oradata/DEVDB/ctl2DEVDB.dbf
cp: overwrite /u02/oradata/DEVDB/ctl2DEVDB.dbf (yes/no)? y
oracle-localora@# cp /tmp/ctl1DEVDB.dbf /u03/oradata/DEVDB/ctl3DEVDB.dbf
cp: overwrite /u03/oradata/DEVDB/ctl3DEVDB.dbf (yes/no)? y
oracle-localora@#
Let's try to open the database.
SQL> startup
ORACLE instance started.

Total System Global Area  189409824 bytes
Fixed Size                   731680 bytes
Variable Size             104857600 bytes
Database Buffers           81920000 bytes
Redo Buffers                1900544 bytes
Database mounted.
ORA-01122: database file 1 failed verification check
ORA-01110: data file 1: '/u05/oradata/DEVDB/system01DEVDB.dbf'
ORA-01207: file is more recent than controlfile - old controlfile

SQL> select FILE#, CHECKPOINT_TIME, CHECKPOINT_CHANGE# from V$DATAFILE_HEADER;

     FILE# CHECKPOINT_TIME      CHECKPOINT_CHANGE#
---------- -------------------- ------------------
         1 14:20:22 02-AUG-2004            6488634
         2 14:20:22 02-AUG-2004            6488634
         3 14:20:22 02-AUG-2004            6488634
         4 14:20:22 02-AUG-2004            6488634
         5 14:20:22 02-AUG-2004            6488634
         6 14:20:22 02-AUG-2004            6488634
         8 14:20:22 02-AUG-2004            6488634
         9 14:20:22 02-AUG-2004            6488634

8 rows selected.

SQL> select CHECKPOINT_CHANGE# from v$database;

CHECKPOINT_CHANGE#
------------------
           6488582
Oracle complains saying the the controlfile is an old copy judging from the SCN; during other kinds of recovery you will be prompted that the command "recover database using backup controlfile" is needed essentially for the same reason. Let's see what happens if we try a normal recovery:
SQL> recover database;
ORA-00283: recovery session canceled due to errors
ORA-01122: database file 1 failed verification check
ORA-01110: data file 1: '/u05/oradata/DEVDB/system01DEVDB.dbf'
ORA-01207: file is more recent than controlfile - old controlfile

Instead we use:

SQL> recover database using backup controlfile;
ORA-00279: change 6488582 generated at 08/02/2004 13:57:20 needed for thread 1
ORA-00289: suggestion : /opt/oracle/admin/DEVDB/arch/1_61.dbf
ORA-00280: change 6488582 for thread 1 is in sequence #61


Specify log: {=suggested | filename | AUTO | CANCEL}

ORA-00308: cannot open archived log '/opt/oracle/admin/DEVDB/arch/1_61.dbf'
ORA-27037: unable to obtain file status
SVR4 Error: 2: No such file or directory
Additional information: 3
This is an interesting case because the redo log needed for the recovery hasn't been archived yet and it is one of the online redo logs. We can see it because the latest archived log is sequence 60, but Oracle is looking for the sequence 61.
SQL> !ls -ltr /opt/oracle/admin/DEVDB/arch
total 10298
-rw-r-----   1 oracle   dba        18944 Aug  2 13:57 1_59.dbf
-rw-r-----   1 oracle   dba         1536 Aug  2 13:57 1_60.dbf 
SQL> select member from v$logfile;

MEMBER
--------------------------------------------------------------------------------
/u03/oradata/DEVDB/redog2m1DEVDB.dbf
/u04/oradata/DEVDB/redog2m2DEVDB.dbf
/u01/oradata/DEVDB/redog1m1DEVDB.dbf
/u02/oradata/DEVDB/redog1m2DEVDB.dbf

SQL> recover database using backup controlfile;
ORA-00279: change 6488582 generated at 08/02/2004 13:57:20 needed for thread 1
ORA-00289: suggestion : /opt/oracle/admin/DEVDB/arch/1_61.dbf
ORA-00280: change 6488582 for thread 1 is in sequence #61

Specify log: {=suggested | filename | AUTO | CANCEL} [we type the name of the online redo log]
/u03/oradata/DEVDB/redog2m1DEVDB.dbf
Log applied.
Media recovery complete.
We have now almost finished; the final remark is that when recovering with a backup controlfile, the database must be opened with the clause "resetlogs"
SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-01589: must use RESETLOGS or NORESETLOGS option for database open

SQL> alter database open noresetlogs;
alter database open noresetlogs
*
ERROR at line 1:
ORA-01588: must use RESETLOGS option for database open

SQL> alter database open resetlogs;
Database altered.

SQL> select * from t_test2;

        C1
----------
        10
        11
        12

SQL> select * from t_test;

        C1
----------
         1
         1
The contents of the database are as expected.



Error message: port already in use
Sometimes it happens that an installation fails with the error message "port already in use" (this can be the case with the Oracle snmpx daemon on UNIX systems).
If you don't know which TCP port is generating the error, the following UNIX command lists those currently being used.

oracle@:PROD1 >netstat -an | grep LISTEN | sort -k 1
      *.111                *.*                0      0 24576      0 LISTEN
      *.13                 *.*                0      0 24576      0 LISTEN
      *.161                *.*                0      0 24576      0 LISTEN
      *.19                 *.*                0      0 24576      0 LISTEN
      *.2049               *.*                0      0 24576      0 LISTEN
      *.21                 *.*                0      0 24576      0 LISTEN
      *.23                 *.*                0      0 24576      0 LISTEN
      *.2410               *.*                0      0 24576      0 LISTEN
... ...
      *.898                *.*                0      0 24576      0 LISTEN
      *.9                  *.*                0      0 24576      0 LISTEN
      *.9991               *.*                0      0 24576      0 LISTEN
179.146.111.76.1521        *.*                0      0 24576      0 LISTEN

This list shows the port 21 (ftp), 23 (telnet), 1521 (an Oracle listener) etc. If you suspect of a particular TCP port (for example 161) go for it.

netstat -an | grep 161

      *.161                *.*                0      0 24576      0 LISTEN


Find the process that is using the TCP port
It is now necessary to find out which process is actually using the TCP port. On the Internet an open source tool called lsof is available, but if you don't feel like installing another tool, the following short script can do the work.
The script goes through the process tables and finds out which one is already using a certain port. You need root privileges to run this script.

for PROC in /proc/*
do
  echo $PROC
  pfiles -F $PROC | grep port |grep 161
done

A sample output is

..
/proc/76
/proc/777
        sockname: AF_INET 0.0.0.0  port: 161
        sockname: AF_INET 179.146.111.98   port: 161
        sockname: AF_INET 179.146.111.101  port: 161
        sockname: AF_INET 179.146.111.102  port: 161
        sockname: AF_INET 179.146.111.100  port: 161
/proc/779
/proc/784
...

We have now got the process ID 777 and it is easy to determine what it is doing :

app@serverdaddy[on pts/1] ps -ef | grep 777
    root   777  717  0   Dec 05 ? 11:28 /opt/buw/bin/snmpd -f -x -s g_mon -l

Our investigation is therefore completed: the TCP port 161 is being used by the process /opt/buw/bin/snmpd.
A real case: Could not start agent. Initialization failure
Sometimes it happens, especially after an updrade, to come across the error:

Failed while initializing Collection Service
Error initializing subsystems
Agent exited at Thu Jan 27 11:00:54 MET 2005 with return value 55
Could not start agent. Initialization failure

oracle@zhru09 [oracle] # pwd
/opt/oracle/9.2.0.5/network/log
oracle@dbserver [oracle] # ls -ltr
total 6
-rw-rw----   1 oracle   dba          278 Jan 27 11:00 nmiconf.log
-rw-r-----   1 oracle   dba          267 Jan 27 11:00 dbsnmp.nohup
-rw-rw----   1 oracle   dba          585 Jan 27 11:00 dbsnmp.log
The contents of dbsnmp.nohup are:

oracle@dbserver [oracle] # cat dbsnmp.nohup

------------------------ Thu Jan 27 11:00:49 MET 2005 ------------------------
Failed while initializing Collection Service
Error initializing subsystems
Agent exited at Thu Jan 27 11:00:54 MET 2005 with return value 55
Could not start agent. Initialization failure
The contents of dbsnmp.log are:
oracle@dbserver [oracle] # cat dbsnmp.log

DBSNMP for Solaris: Version 9.2.0.5.0 - Production on 27-JAN-2005 11:00:52

Copyright (c) 2003 Oracle Corporation.  All rights reserved.

System parameter file is /var/opt/oracle/snmp_ro.ora
Log messages written to /opt/oracle/9.2.0.5/network/log/dbsnmp.log
Trace information written to /opt/oracle/9.2.0.5/network/trace/dbsnmp_6038.trc
Trace level is currently 0

NMS-00001: Warning: dbsnmp unable to connect to SNMP master agent
11:00:54 27/01/2005
ODG-05019: Error: failed to bind socket
11:00:54 27/01/2005
ODG-05083: Error: failed to allocate listening port/socket
The explanation in the Oracle Metalink for the error ODG-05083 is:

Error:    ODG-5083 Text:    Error: failed to allocate listening port/socket
  ---------------------------------------------------------------------------

Cause:    an error occurred while attempting to open the Data Gatherer socket
    
(1808). 
Action:   
ensure that no other process is using this socket and retry the operation
Therefore, withe script above, let's check whether some other process is already using the port 1808:


sysuser@dbserver[on pts/5] netstat -a |egrep -i 1808
dbserver.1808                *.*                0      0 24576      0 LISTEN

for PROC in /proc/*
do
  echo $PROC
  pfiles -F $PROC | grep port |grep 1808
done

...
/proc/1162
/proc/1165
/proc/3687
/proc/3689
/proc/3691
/proc/4429
        sockname: AF_INET 169.166.228.84  port: 1808
/proc/5632
... ...

root@dbserver[on pts/7]# ps -aef|grep -i 4429
    root  7961  6922  0 11:36:15 pts/7    0:00 grep -i 4429
  oracle  4429     1  0   Dec 09 ?       43:56 /u00/app/oracle/product/8.1.7/bin/vppdc
root@dbserver[on pts/7]# kill -3 4429
root@dbserver[on pts/7]# kill -5 4429
root@dbserver[on pts/7]# kill -9 4429
The problem, therefore, was that the data gatherer of version 8.1.7 was still running. The reason why sometimes we forget this process is that the agent in 9i is also the data gatherer.
Again, this is a real case.


What is a cluster?
This article gives some information about intalling and using the Oracle Real Application Cluster RAC. A definition of a cluster is : "A group of computers linked to provide fast and reliable service. Cluster technologies have evolved over the past 15 years to provide servers that are both highly available and scalable. Clustering is one of several approaches that exploits parallel processing — the use of multiple subsystems as building blocks".

Clusters have existed for quite a long time; the first solution was offered by DEC for its VMS operating system. Clusters on UNIX are more recent.

Clustering is therefore a solution for having different computers share common devices and jointly perform tasks; they exist with or without databases.
Several companies offer clustering software: Veritas, Sun, IBM, Microsoft, MySql etc. For example, the version for Windows NT can be downloaded from real application cluster for NT/2000 This means that the set of commands for managing the cluster itself depends on the specific product.
Shared storage
A cluster will share some storage. There will be some software in place to manage this storage, which could be the Veritas Volume Manager, or Sun Soltice, or UNIX itself if you are using raw devices. Even if it is likely that the clustering software and the storage manager will be provided by the same vendor, it is not necessarily so.
What is an Oracle Real Application Cluster?
The Oracle RAC is a solution offered by Oracle Corporation in order to build a shared database using the clustering technology; RAC and clustering are not the same thing: RAC comes on top of the clustering software, even if the two softwares are tightly interconnected.



Real Application Clusters requires that all each instance be able to access a set of devices on a shared disk subsystem. Veritas Storage Foundation (VSF) for RAC supports an Oracle-certified Cluster File System (CFS), allowing to store the files that Real Application Clusters requires directly on the cluster file system. A raw disk subsystem can be used instead if required.

Veritas states that the I/O performance of the VSF matches that of the raw devices. The CFS allows the use of normal Oracle commands such as "alter datafile resize 500M". With raw devices this is not possible, because the system manager must create the device with a certain size using the Volume Manager. Analogously, the archived redo logs can be created on the CFS.

Oracle Cluster can be installed also on a single node, but the cluster management must be installed in advance and running. The tool on WindowsNT/2000 and Linux is called oracm and is shipped with Oracle.
How many copies of Oracle binaries in a RAC?
The Oracle binaries can be installed once on the shared storage or on each server (for example, on the system disk). Both ways are possible and your decision should be consider few factors: easy of use and possible downtime.
·    One copy of the binaries introduces a single point of failure.
·    One copy of the binaries is more manageable; for example, a patch has to be applied only once and not on each server.
·    If you keep only one copy, than you have a certain outage if you have to apply a patch, because all the instances will be to be brought down. On the contrary, with binaries on each node only the instance on that particular node will have to be stopped.
Defining the instances on the two nodes of the Oracle cluster
If the cluster software is running, the installation with the Oracle Universal Installer is very simple. The installer will detect the two nodes and will ask on which node the software should be copied; if you are using a shared device for the $ORACLE_HOME the node doesn't matter.
When the installation is completed, you can try to start the instance on one of the two nodes without mounting the database, which of course hasn't yet been created.
The tool is the usual sqlplus, even if the new tool svrctl can be used.
RAC configuration files
How many copies of init.ora should you keep? Where to define TNS_ADMIN? The new cluster systax introduced a "dot" notation in the init.ora file to specify parameters for different instances. This allows to keep only one copy of init.ora for all instances.

To fix the ideas, let's say that the cluster nodes are called sercluster1 and sercluster2, the database name is EUROPE and that the instances names are EU1, EU2 etc. It the shared mountpoint is /orasoft, the ORACLE_BASE=/orasof/oracle, ORACLE_HOME=$ORACLE_BASE/9.2.0 One possible way to proceed is creating a tree $ORACLE_BASE/admin/EUROPE/... on the shared device. In particular, we will define an initEUROPE.ora on $ORACLE_BASE/admin/EUROPE/pfile
The soft links must now be created:

ln -s $ORACLE_BASE/admin/EUROPE/pfile/initEUROPE.ora $ORACLE_HOME/dbs/initEU1.ora
ln -s $ORACLE_BASE/admin/EUROPE/pfile/initEUROPE.ora $ORACLE_HOME/dbs/initEU2.ora
To set correctly the environment on the two nodes, a possible script would be
#!ksh
export ORACLE_BASE=/orasoft/oracle
export ORACLE_HOME=$ORACLE_BASE/9.2.0
myHost=`hostname`
if [ $myHost = "sercluster1" ]; then
  ORACLE_SID=EU1
else
  ORACLE_SID=EU2
fi
export ORACLE_SID
... ... ...
The immediate question is: how can you use the same init.ora if the instance names are different? The answer is that in a RAC environmnet it is now possible to specify a parameter with a "dot" notation. For example:

cluster_database=true  
cluster_database_instances=2
#
EU1.instance_name=EU1
EU1.instance_number=1
EU1.thread=1
EU1.undo_tablespace=UNDOEU1

EU2.instance_name=EU2
EU2.instance_number=2
EU2.thread=2
EU2.undo_tablespace=UNDOEU2

undo_management=auto

Creating the RAC database
At this stage $ORACLE_HOME and $ORACLE_SID must be defined in the environment and the init.ora file must be found on $ORACLE_HOME/dbs.

As we said before, on UNIX (Linux) the init.ora on $ORACLE_HOME/dbs is typically a soft link to $ORACLE_BASE/admin/$ORACLE_SID/pfile/init${ORACLE_SID}.ora

For this operation, the parameter cluster_database should be set to FALSE; in fact, the first part is not different from creating a database with a single instance and this setting avoids some unnecessary complexity at this stage.

cluster_database = FALSE

The database can now be created using normal scripts.

sqlplus '/ as sysdba'
sql> startup nomount
spool /orasoft/oracle/admin/EUROPE/create/logs/crDB_EUROPE_1.log
startup nomount pfile ='/orasoft/oracle/admin/EUROPE/pfile/initEUROPE.ora'
CREATE DATABASE EUROPE
   controlfile REUSE
   maxinstances 4
   maxlogfiles 64
   maxlogmembers 4
   maxdatafiles 1000
   maxloghistory 1000
   character set AL32UTF8
   national character set AL16UTF16
DATAFILE '/u00/oradata/EUROPE/EUROPE_system01.dbf' SIZE 1000M REUSE
EXTENT MANAGEMENT LOCAL
UNDO TABLESPACE UNDOEU1 DATAFILE
    '/u09/oradata/EUROPE/EUROPE_undo_T1_01.dbf' SIZE 5000M REUSE
DEFAULT TEMPORARY TABLESPACE TEMP TEMPFILE
    '/u08/oradata/EUROPE/EUROPE_temp01.dbf' SIZE 5000M REUSE
    EXTENT MANAGEMENT LOCAL UNIFORM SIZE 50M   
LOGFILE GROUP 1 ('/u05/oradata/EUROPE/EUROPE_redo1_1T1.dbf',

'/u06/oradata/EUROPE/EUROPE_redo1_2T1.dbf') SIZE 100M REUSE,
        GROUP 2 ('/u06/oradata/EUROPE/EUROPE_redo2_1T1.dbf',

        '/u07/oradata/EUROPE/EUROPE_redo2_2T1.dbf') SIZE 100M REUSE,
        GROUP 3 ('/u07/oradata/EUROPE/EUROPE_redo3_1T1.dbf',

        '/u05/oradata/EUROPE/EUROPE_redo3_2T1.dbf') SIZE 100M REUSE,
        GROUP 4 ('/u05/oradata/EUROPE/EUROPE_redo4_1T1.dbf',

        '/u06/oradata/EUROPE/EUROPE_redo4_2T1.dbf') SIZE 100M REUSE;
       
spool off

The second instance can start, but cannot mount nor open the database. Before this is possible, the redo logs and the UNDO tablespace for the second instance must be created

ALTER DATABASE ADD LOGFILE THREAD 2 GROUP 5
        ('/u04/oradata/EUROPE/EUROPE_redo5_1T2.dbf',

        '/u05/oradata/EUROPE/EUROPE_redo5_2T2.dbf') SIZE 100M;
ALTER DATABASE ADD LOGFILE THREAD 2 GROUP 6
        ('/u05/oradata/EUROPE/EUROPE_redo6_1T2.dbf',

        '/u06/oradata/EUROPE/EUROPE_redo6_2T2.dbf') SIZE 100M;
ALTER DATABASE ADD LOGFILE THREAD 2 GROUP 7
        ('/u06/oradata/EUROPE/EUROPE_redo7_1T2.dbf',

        '/u07/oradata/EUROPE/EUROPE_redo7_2T2.dbf') SIZE 100M;
ALTER DATABASE ADD LOGFILE THREAD 2 GROUP 8
        ('/u07/oradata/EUROPE/EUROPE_redo8_1T2.dbf',

        '/u08/oradata/EUROPE/EUROPE_redo8_2T2.dbf') SIZE 100M;

ALTER DATABASE ENABLE THREAD 2;


CREATE UNDO TABLESPACE UNDOEU2 DATAFILE
    '/u07/oradata/EUROPE/EUROPE_undo_T2_01.dbf' SIZE 5000M REUSE;

Shut down the two instances Now set cluster_database = TRUE

# LISTENER.ORA Network Configuration File: /optware/oracle/network/admin/listener.ora
# Generated by Oracle configuration tools.

LISTENER_EU1 =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = sercluster1)(PORT = 2002))
  )

SID_LIST_LISTENER_EU1 =
  (SID_LIST =
    (SID_DESC =
      (ORACLE_HOME = /optware/oracle/9.2.0)
      (SID_NAME = EU1)
    )
  )

LISTENER_EU2 =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = sercluster2)(PORT = 2002))
  )

SID_LIST_LISTENER_EU2 =
  (SID_LIST =
    (SID_DESC =
      (ORACLE_HOME = /orasoft/oracle/9.2.0)
      (SID_NAME = EU2)
    )
  )


# TNSNAMES.ORA Network Configuration File: /orasoft/oracle/network/admin/tnsnames.ora
# Generated by Oracle configuration tools.

EUROPE =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = TCP)(HOST = sercluster1)(PORT = 2002))
      (ADDRESS = (PROTOCOL = TCP)(HOST = sercluster1)(PORT = 2002))
      (LOAD_BALANCE = yes)
    )
    (CONNECT_DATA =
      (SERVICE_NAME = EUROPE)
    )
  )


No comments:

Post a Comment