An Introduction to Rman With Examples

March 15, 2017 | Author: Shahid Mahmud | Category: N/A
Share Embed Donate


Short Description

An Introduction to Rman With Examples...

Description

An introduction to rman with examples. For beginners and non, rman is one of the most obscure Oracle tools, but in reality it is based on quite simple principles. It is true, though, that a restoration can be distructive for a database.

Why is rman dangerous? rman is dangerous because it works at the database physical level; to understand why, let's say that rman is not "too" different from WinZip. Since everybody knows WinZip, it will probably be easier to understand rman.

Is WinZip dangerous? The creation of a zip archive is not dangerous, unless you choose the same name for the output file, or the archive is so big that you fill a device or the temporary directory etc. The same thing can be said about rman: taking an rman backup cannot be dangerous (unless you stop the database, or close it intentionally) and you don't overwrite previous backups (you can prevent it). rman creates a compressed backup of the physical database files, including controlfiles, datafiles, archived logs and stores them somewhere. This somewhere can be a disk (like WinZip) or a tape. Therefore, the first important point to remember is: rman creates compressed backups of the physical database; if you prefer, rman zips the database, entirely or not.

Is Unzip dangerous? Unzip, on the other hand, can be very dangerous. The risk is overwriting files that should not be overwritten and, therefore, losing important content. The very same applies to rman restores: since they overwrite database files, you should know whether this is the right decision because, otherwise, you may lose your data. This is the main difference between an import and an rman restoration: an rman restoration (restore) overwrites one or more datafiles; this means WATCH OUT.

Our first rman backup The only requirements for taking simple, useful rman backups are two:

• •

You database is in archivelog mode You have enough disk space somewhere

On our Windows 2000 workstation we open a CMD window and type the following few lines:

E:\>rman target /

Recovery Manager: Release 9.2.0.1.0 - Production Copyright (c) 1995, 2002, Oracle Corporation. All rights reserved. connected to target database: ARK9201 (DBID=3564144589)

run { allocate channel t1 type disk; backup database format 'E:\RMAN_BACKUPS\ARK9201\%d_%u_%s'; release channel t1; } using target database controlfile instead of recovery catalog allocated channel: t1 channel t1: sid=9 devtype=DISK Starting backup at 10-DEC-04 channel t1: starting full datafile backupset channel t1: specifying datafile(s) in backupset including current SPFILE in backupset including current controlfile in backupset released channel: t1 RMAN-00571: =========================================================== RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS =============== RMAN-00571: =========================================================== RMAN-03009: failure of backup command on t1 channel at 12/10/2004 14:28:50 ORA-19602: cannot backup or copy active file in NOARCHIVELOG mode As we said, the database must be in archive mode if we want to backup open databases. A separate articles explains why it must be so, because it is not so obvious.

SQL> shutdown immediate Database closed. Database dismounted. ORACLE instance shut down. SQL> startup mount ORACLE instance started. Total System Global Area 135338868 bytes Fixed Size 453492 bytes Variable Size 109051904 bytes Database Buffers 25165824 bytes Redo Buffers 667648 bytes Database mounted. SQL> alter database archivelog; Database altered. SQL> alter database open; Database altered.

Let's now repeat the operation having put the database in archive mode Choosing C:\Temp as directory for the backup is, of course, a very bad idea. Backups are gold and shoud be kept on a dedicated device; a good name could be E:\RMAN_BACKUPS\SID

E:\>rman target / Recovery Manager: Release 9.2.0.1.0 - Production Copyright (c) 1995, 2002, Oracle Corporation. All rights reserved. connected to target database: ARK9201 (DBID=3564144589) RMAN> run {

2> 3> 4> 5>

allocate channel t1 type disk; backup database format 'E:\RMAN_BACKUPS\ARK9201\%d_%u_%s'; release channel t1; }

allocated channel: t1 channel t1: sid=12 devtype=DISK Starting backup at 26-DEC-04 channel t1: starting full datafile backupset channel t1: specifying datafile(s) in backupset including current SPFILE in backupset including current controlfile in backupset input datafile fno=00001 name=E:\ORACLE\ORADATA\ARK920\SYSTEM01.DBF input datafile fno=00002 name=E:\ORACLE\ORADATA\ARK920\UNDOTBS01.DBF input datafile fno=00004 name=E:\ORACLE\ORADATA\ARK920\EXAMPLE01.DBF input datafile fno=00009 name=E:\ORACLE\ORADATA\ARK920\XDB01.DBF input datafile fno=00005 name=E:\ORACLE\ORADATA\ARK920\INDX01.DBF input datafile fno=00008 name=E:\ORACLE\ORADATA\ARK920\USERS01.DBF input datafile fno=00003 name=E:\ORACLE\ORADATA\ARK920\DRSYS01.DBF input datafile fno=00006 name=E:\ORACLE\ORADATA\ARK920\ODM01.DBF input datafile fno=00007 name=E:\ORACLE\ORADATA\ARK920\TOOLS01.DBF channel t1: starting piece 1 at 26-DEC-04 channel t1: finished piece 1 at 26-DEC-04 piece handle=E:\RMAN_BACKUPS\ARK9201\ARK920_03G8KC6G_3 comment=NONE channel t1: backup set complete, elapsed time: 00:25:58 Finished backup at 26-DEC-04 released channel: t1

The resulting backup file(s) are in our case

dir c:\rman_backups\ark9201 Volume in drive C is 80-01-14A2 Volume Serial Number is 64DA-0BF7 Directory of c:\rman_backups\ark9201 12/26/2004 02:29p . 12/26/2004 02:29p .. 12/26/2004 02:44p 418,471,936 ARK920_03G8KC6G_3

Our first archive backup Taking only the database backup is not enough, for reasons that will be clear later. We must take a backup of the archived log as well, and the command is very similar

RMAN> run { 2> allocate channel t1 type disk; 3> backup archivelog all delete input format 'C:\RMAN_BACKUPS\ARK9201\arch_%d_%u_%s'; 4> release channel t1; 5> } allocated channel: t1 channel t1: sid=12 devtype=DISK Starting backup at 26-DEC-04 current log archived

channel t1: starting archive log backupset channel t1: specifying archive log(s) in backup set input archive log thread=1 sequence=77 recid=1 stamp=545945036 input archive log thread=1 sequence=78 recid=2 stamp=545945039 input archive log thread=1 sequence=79 recid=3 stamp=545945099 channel t1: starting piece 1 at 26-DEC-04 channel t1: finished piece 1 at 26-DEC-04 piece handle=C:\RMAN_BACKUPS\ARK9201\ARCH_ARK920_04G8KTGL_4 comment=NONE channel t1: backup set complete, elapsed time: 00:00:37 channel t1: deleting archive log(s) archive log filename=E:\ORA920\RDBMS\ARC00077.001 recid=1 stamp=545945036 archive log filename=E:\ORA920\RDBMS\ARC00078.001 recid=2 stamp=545945039 archive log filename=E:\ORA920\RDBMS\ARC00079.001 recid=3 stamp=545945099 Finished backup at 26-DEC-04 released channel: t1 RMAN>

rman on UNIX rman works the same way on a UNIX server, using an appropriate directory format:

RMAN> run { allocate channel t1 type disk; 2> backup database format '/u01/oracle/archive/%d_%u_%s'; 3> backup current controlfile format '/u01/oracle/archive/%d_%u_%s'; 4> release channel t1;} using target database controlfile instead of recovery catalog allocated channel: t1 channel t1: sid=26 devtype=DISK Starting backup at 14:29:22 18-jan-2005 channel t1: starting full datafile backupset channel t1: specifying datafile(s) in backupset input datafile fno=00001 name=/u01/oradata/ORA920/system01.dbf input datafile fno=00002 name=/u02/oradata/ORA920/undo01.dbf input datafile fno=00004 name=/u01/oradata/OMF_DATA_ORA920/o1_mf_tbs_data_0ybcd28z_.dbf input datafile fno=00003 name=/u01/oradata/OMF_DATA_ORA920/o1_mf_tools_0ybbwkp7_.dbf input datafile fno=00005 name=/u01/oradata/ORA920/oem_repository.dbf channel t1: starting piece 1 at 14:29:24 18-jan-2005 channel t1: finished piece 1 at 14:31:10 18-jan-2005 piece handle=/u01/oracle/archive/ORA920_01gah0q3_1 comment=NONE channel t1: backup set complete, elapsed time: 00:01:48 Finished backup at 14:31:11 18-jan-2005 Starting backup at 14:31:11 18-jan-2005 channel t1: starting full datafile backupset channel t1: specifying datafile(s) in backupset including current controlfile in backupset channel t1: starting piece 1 at 14:31:15 18-jan-2005

channel t1: finished piece 1 at 14:31:18 18-jan-2005 piece handle=/u01/oracle/archive/ORA920_02gah0th_2 comment=NONE channel t1: backup set complete, elapsed time: 00:00:05 Finished backup at 14:31:18 18-jan-2005 Starting Control File Autobackup at 14:31:18 18-jan-2005 piece handle=/optware/oracle/9.2.0.1/dbs/c-2554142925-20050118-00 comment=NONE Finished Control File Autobackup at 14:31:22 18-jan-2005 released channel: t1 RMAN> host ; oracle@edc-se[on pts/4] total 562128 -rw-r----1 oracle ORA920_01gah0q3_1 -rw-r----1 oracle oracle@edc-se[on pts/4] host command complete

ls -ltr /u01/oracle/archive/ dba

282476544 Jan 18 14:31

dba exit

5177344 Jan 18 14:31 ORA920_02gah0th_2

Backup the controlfile RMAN> run { allocate channel t1 type disk; 2> backup current controlfile format '/u01/oracle/archive/controlf_%d_%u_%s'; 3> release channel t1;} allocated channel: t1 channel t1: sid=26 devtype=DISK Starting backup at 14:33:59 18-jan-2005 channel t1: starting full datafile backupset channel t1: specifying datafile(s) in backupset including current controlfile in backupset channel t1: starting piece 1 at 14:34:00 18-jan-2005 channel t1: finished piece 1 at 14:34:03 18-jan-2005 piece handle=/u01/oracle/archive/controlf_ORA920_04gah12n_4 comment=NONE channel t1: backup set complete, elapsed time: 00:00:04 Finished backup at 14:34:03 18-jan-2005 Starting Control File Autobackup at 14:34:03 18-jan-2005 piece handle=/optware/oracle/9.2.0.1/dbs/c-2554142925-20050118-01 comment=NONE Finished Control File Autobackup at 14:34:07 18-jan-2005 released channel: t1

Here is a backup of archived log files

RMAN> run { allocate channel t1 type disk; 2> backup archivelog all format '/u01/oracle/archive/archive_%d_%u_%s' delete input; 3> release channel t1;} allocated channel: t1

channel t1: sid=26 devtype=DISK Starting backup at 14:42:00 18-jan-2005 current log archived channel t1: starting archive log backupset channel t1: specifying archive log(s) in backup set input archive log thread=1 sequence=9 recid=1 stamp=547477014 input archive log thread=1 sequence=10 recid=2 stamp=547562627 input archive log thread=1 sequence=11 recid=3 stamp=547562680 input archive log thread=1 sequence=12 recid=4 stamp=547580133 input archive log thread=1 sequence=13 recid=5 stamp=547623518 input archive log thread=1 sequence=14 recid=6 stamp=547744760 input archive log thread=1 sequence=15 recid=7 stamp=547865707 input archive log thread=1 sequence=16 recid=8 stamp=547915324 channel t1: starting piece 1 at 14:42:08 18-jan-2005 channel t1: finished piece 1 at 14:43:03 18-jan-2005 piece handle=/u01/oracle/archive/archive_ORA920_06gah1hv_6 comment=NONE channel t1: backup set complete, elapsed time: 00:00:56 channel t1: deleting archive log(s) archive log filename=/optware/oracle/admin/ORA920/arch/ORA920_1_9.arc recid=1 stamp=547477014 archive log filename=/optware/oracle/admin/ORA920/arch/ORA920_1_10.arc recid=2 stamp=547562627 archive log filename=/optware/oracle/admin/ORA920/arch/ORA920_1_11.arc recid=3 stamp=547562680 archive log filename=/optware/oracle/admin/ORA920/arch/ORA920_1_12.arc recid=4 stamp=547580133 archive log filename=/optware/oracle/admin/ORA920/arch/ORA920_1_13.arc recid=5 stamp=547623518 archive log filename=/optware/oracle/admin/ORA920/arch/ORA920_1_14.arc recid=6 stamp=547744760 archive log filename=/optware/oracle/admin/ORA920/arch/ORA920_1_15.arc recid=7 stamp=547865707 archive log filename=/optware/oracle/admin/ORA920/arch/ORA920_1_16.arc recid=8 stamp=547915324 Finished backup at 14:43:04 18-jan-2005 Starting Control File Autobackup at 14:43:04 18-jan-2005 piece handle=/optware/oracle/9.2.0.1/dbs/c-2554142925-20050118-02 comment=NONE Finished Control File Autobackup at 14:43:09 18-jan-2005 released channel: t1

The rman clauses from logseq and until logseq Sometimes you will want to backup only certain archived logs and not all. This is the case, for example, when you ran out of space and you were forced to move the archived logs on disk onto another directory. In this case the clause "archivelog all" would give an error, because rman will not be able to find the files.

run { allocate channel dev1 type 'sbt_tape'; backup archivelog from logseq 12 until logseq 15 thread 1 delete input;

}

Keeping an eye on the list above, the command would back up the archived redo logs from ORA920_1_12.arc to ORA920_1_15.arc Remember that the name depends on the parameter ... in your init.ora In our case the values is ...

RMAN> show 2> all; RMAN configuration parameters are: CONFIGURE RETENTION POLICY TO REDUNDANCY 1; # default CONFIGURE BACKUP OPTIMIZATION OFF; # default CONFIGURE DEFAULT DEVICE TYPE TO DISK; # default CONFIGURE CONTROLFILE AUTOBACKUP ON; CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE DISK TO '%F'; # default CONFIGURE DEVICE TYPE DISK PARALLELISM 1; # default CONFIGURE DATAFILE BACKUP COPIES FOR DEVICE TYPE DISK TO 1; # default CONFIGURE ARCHIVELOG BACKUP COPIES FOR DEVICE TYPE DISK TO 1; # default CONFIGURE MAXSETSIZE TO UNLIMITED; # default CONFIGURE SNAPSHOT CONTROLFILE NAME TO '/optware/oracle/9.2.0.1/dbs/snapcf_ORA920.f'; # default RMAN> list backupset ;

List of Backup Sets =================== BS Key Type LV Size Device Type Elapsed Time Completion Time ------- ---- -- ---------- ----------- ------------ -------------------1 Full 269M DISK 00:01:41 14:31:04 18-jan-2005 BP Key: 1 Status: AVAILABLE Tag: TAG20050118T142923 Piece Name: /u01/oracle/archive/ORA920_01gah0q3_1 List of Datafiles in backup set 1 File LV Type Ckp SCN Ckp Time Name ---- -- ---- ---------- -------------------- ---1 Full 562946 14:29:24 18-jan-2005 /u01/oradata/ORA920/system01.dbf 2 Full 562946 14:29:24 18-jan-2005 /u02/oradata/ORA920/undo01.dbf 3 Full 562946 14:29:24 18-jan-2005 /u01/oradata/OMF_DATA_ORA920/o1_mf_tools_0ybbwkp7_.dbf 4 Full 562946 14:29:24 18-jan-2005 /u01/oradata/OMF_DATA_ORA920/o1_mf_tbs_data_0ybcd28z_.dbf 5 Full 562946 14:29:24 18-jan-2005 /u01/oradata/ORA920/oem_repository.dbf BS Key Type LV Size Device Type Elapsed Time Completion Time ------- ---- -- ---------- ----------- ------------ -------------------2 Full 4M DISK 00:00:03 14:31:16 18-jan-2005 BP Key: 2 Status: AVAILABLE Tag: TAG20050118T143112 Piece Name: /u01/oracle/archive/ORA920_02gah0th_2 Controlfile Included: Ckp SCN: 563054 Ckp time: 14:31:13 18jan-2005 BS Key Type LV Size Device Type Elapsed Time Completion Time ------- ---- -- ---------- ----------- ------------ -------------------3 Full 4M DISK 00:00:02 14:31:20 18-jan-2005 BP Key: 3 Status: AVAILABLE Tag: Piece Name: /optware/oracle/9.2.0.1/dbs/c-2554142925-20050118-00

Controlfile Included: Ckp SCN: 563071 jan-2005

Ckp time: 14:31:18 18-

BS Key Type LV Size Device Type Elapsed Time Completion Time ------- ---- -- ---------- ----------- ------------ -------------------4 Full 4M DISK 00:00:03 14:34:02 18-jan-2005 BP Key: 4 Status: AVAILABLE Tag: TAG20050118T143359 Piece Name: /u01/oracle/archive/controlf_ORA920_04gah12n_4 Controlfile Included: Ckp SCN: 563233 Ckp time: 14:33:59 18jan-2005 BS Key Type LV Size Device Type Elapsed Time Completion Time ------- ---- -- ---------- ----------- ------------ -------------------5 Full 4M DISK 00:00:02 14:34:06 18-jan-2005 BP Key: 5 Status: AVAILABLE Tag: Piece Name: /optware/oracle/9.2.0.1/dbs/c-2554142925-20050118-01 Controlfile Included: Ckp SCN: 563233 Ckp time: 14:33:59 18jan-2005 BS Key Size Device Type Elapsed Time Completion Time ------- ---------- ----------- ------------ -------------------6 203M DISK 00:00:56 14:43:03 18-jan-2005 BP Key: 6 Status: AVAILABLE Tag: TAG20050118T144206 Piece Name: /u01/oracle/archive/archive_ORA920_06gah1hv_6 List of Archived Logs in backup set 6 Thrd Seq Low SCN Low Time ---- ------- ---------- -------------------1 9 118838 12:18:25 13-jan-2005 jan-2005 1 10 124198 12:56:51 13-jan-2005 jan-2005 1 11 208951 12:43:38 14-jan-2005 jan-2005 1 12 209145 12:44:30 14-jan-2005 jan-2005 1 13 227914 17:35:21 14-jan-2005 jan-2005 1 14 271564 05:38:29 15-jan-2005 jan-2005 1 15 392830 15:19:13 16-jan-2005 jan-2005 1 16 514138 00:55:00 18-jan-2005 jan-2005

Next SCN Next Time ---------- --------124198 12:56:51 13208951

12:43:38 14-

209145

12:44:30 14-

227914

17:35:21 14-

271564

05:38:29 15-

392830

15:19:13 16-

514138

00:55:00 18-

563727

14:42:00 18-

BS Key Type LV Size Device Type Elapsed Time Completion Time ------- ---- -- ---------- ----------- ------------ -------------------7 Full 4M DISK 00:00:02 14:43:07 18-jan-2005 BP Key: 7 Status: AVAILABLE Tag: Piece Name: /optware/oracle/9.2.0.1/dbs/c-2554142925-20050118-02 Controlfile Included: Ckp SCN: 563788 Ckp time: 14:43:03 18jan-2005 RMAN>

Are we sure that our backups do not contain corrupted blocks There is, of course, little point in taking backups if we don't know whether they are physically usable, i.e. all their blocks are clean. rman offer the command validate

RMAN> validate backupset 6;

allocated channel: ORA_DISK_1 channel ORA_DISK_1: sid=26 devtype=DISK channel ORA_DISK_1: starting validation of archive log backupset channel ORA_DISK_1: restored backup piece 1 piece handle=/u01/oracle/archive/archive_ORA920_06gah1hv_6 tag=TAG20050118T144206 params=NULL channel ORA_DISK_1: validation complete RMAN> validate backupset 5; using channel ORA_DISK_1 channel ORA_DISK_1: starting validation of datafile backupset channel ORA_DISK_1: restored backup piece 1 piece handle=/optware/oracle/9.2.0.1/dbs/c-2554142925-20050118-01 tag=null params=NULL channel ORA_DISK_1: validation complete RMAN>

validate backupset 4;

using channel ORA_DISK_1 channel ORA_DISK_1: starting validation of datafile backupset channel ORA_DISK_1: restored backup piece 1 piece handle=/u01/oracle/archive/controlf_ORA920_04gah12n_4 tag=TAG20050118T143359 params=NULL channel ORA_DISK_1: validation complete

Clause not backed up since time There are cases when backups fail and the stand becomes unclear. To avoid any risk the clause not backed up since time can be used. This commands makes rman save only the datafiles (or the archived logs) that haven't be backed up for a certain span of time and skip the rest.

RMAN> run { allocate channel t1 type disk; 2> backup database format '/u01/oracle/archive/%d_%u_%s' not backed up since time 'sysdate -5'; 3> release channel t1;} allocated channel: t1 channel t1: sid=25 devtype=DISK Starting skipping skipping skipping skipping skipping Finished

backup at 15:41:31 21-jan-2005 datafile 1; already backed up on datafile 2; already backed up on datafile 3; already backed up on datafile 4; already backed up on datafile 5; already backed up on backup at 15:41:33 21-jan-2005

20:38:51 20:38:51 20:38:51 20:38:51 20:38:51

18-jan-2005 18-jan-2005 18-jan-2005 18-jan-2005 18-jan-2005

released channel: t1 An examples on how to restore a data file can be found on the first article about restoration

The basics of an rman restore Care is needed for rman restores because they will normally overwrite existing datafiles; before taking any action it is imperative to understand why a restore is needed and how to proceed. In our opinion it is better to explain basic rman concepts with understandable examples.

Example 1: a restoration is needed because a datafile is lost. 1. In order to restore a database or a data file some kind of backup must have been taken; in our exaple, let's suppose that our database consists of three datafiles and that a backup is taken on Mar 10th 2005 in the afternoon.

"> The stand of the datafiles can differ because in general it is not guaranteed that the very last changes in the database have actually been written to disk. This is why the timestamps 15:09 and 15:15 have been used in the picture. 2. During the week archive log backups are taken daily, for example to tape using the command:

run { allocate channel t1 type 'sbt_tape'; backup archivelog all delete input; release channel t1; }

3. A week later a disk crashes and the database file number 3 is lost:

In this example, it must be understood that only the DBFile3 has to be restored, without overwriting the datafiles that have not been affected by the crash, or a week’s work would be lost. The tablespace containing the damaged file should be put offline with the SQLPLUS command:

SQLPLUS / AS SYSDBA ALTER TABLESPACE this_tbs OFFLINE IMMEDIATE; rman target / catalog rman@MYCATALOG rman> run { allocate channel t1 type 'sbt_tape'; restore tablespace this_tbs; release channel t1; }

If we are sure that only the datafile 3 has been damaged and not the rest of the tablespace this_tbs, we could use instead:

rman> run { allocate channel t1 type 'sbt_tape'; restore datafile 3; release channel t1; }

It is now obvious that we have a mish-mash of contents, with tablespace up-to-date and a tablespace my_tbs with contents as of one week before. In this situation, Oracle will refuse to bring the tablespace online, probably with the error message 'datafile 'DBFIle3' needs to be recovered'. How is it possible to bring the tablespace my_tbs at the same stand as for the other tablespaces? It is possible because we have the archive logs backups, that contain all the changes to the database made between Mar 10th and Mar 17th. The command needed is:

run { allocate channel t1 type 'sbt_tape'; recover database; release channel t1; } During the execution of this command, rman will restore the archived redo logs and apply them to the damaged tablespace. At the end of the process the tablespace can be brought online and the database is back to normal.

Identifying Host Names and IP Addresses This article presents a mixed bag of Oracle functionality relating to the identification of host names and IP addresses for Oracle clients and servers. • • • •

UTL_INADDR SYS_CONTEXT V$INSTANCE V$SESSION

UTL_INADDR The UTL_INADDR package was introduced in Oracle 8.1.6 to provide a means of retrieving host names and IP addresses of remote hosts from PL/SQL. The GET_HOST_ADDRESS function returns the IP address of the specified host name. SQL> SELECT UTL_INADDR.get_host_address('bart') FROM dual; UTL_INADDR.GET_HOST_ADDRESS('BART') ------------------------------------------------------------------------------192.168.2.4 SQL> The IP address of the database server is returned if the specified host name is NULL or is omitted. SQL> SELECT UTL_INADDR.get_host_address from dual; GET_HOST_ADDRESS ------------------------------------------------------------------------------192.168.2.5 SQL> An error is returned if the specified host name is not recognized.

SQL> SELECT UTL_INADDR.get_host_address('banana') from dual; SELECT UTL_INADDR.get_host_address('banana') from dual * ERROR at line 1: ORA-29257: host banana unknown ORA-06512: at "SYS.UTL_INADDR", line 19 ORA-06512: at "SYS.UTL_INADDR", line 40 ORA-06512: at line 1 SQL> The GET_HOST_NAME function returns the host name of the specified IP address. SQL> SELECT UTL_INADDR.get_host_name('192.168.2.4') FROM dual; UTL_INADDR.GET_HOST_NAME('192.168.2.4') ------------------------------------------------------------------------------bart SQL> The host name of the database server is returned if the specified IP address is NULL or omitted. SQL> SELECT UTL_INADDR.get_host_name FROM dual; GET_HOST_NAME ------------------------------------------------------------------------------C4210gR2 1 row selected. SQL> An error is returned if the specified IP address is not recognized. SQL> SELECT UTL_INADDR.get_host_name('1.1.1.1') FROM dual; SELECT UTL_INADDR.get_host_name('1.1.1.1') FROM dual * ERROR at line 1: ORA-29257: host 1.1.1.1 unknown ORA-06512: at "SYS.UTL_INADDR", line 4 ORA-06512: at "SYS.UTL_INADDR", line 35 ORA-06512: at line 1 SQL>

SYS_CONTEXT The SYS_CONTEXT function is able to return the following host and IP address information for the current session: • • • •

TERMINAL - An operating system identifier for the current session. This is often the client machine name. HOST - The host name of the client machine. IP_ADDRESS - The IP address of the client machine. SERVER_HOST - The host name of the server running the database instance.

The following examples show the typical output for each variant. SQL> SELECT SYS_CONTEXT('USERENV','TERMINAL') FROM dual; SYS_CONTEXT('USERENV','TERMINAL') -------------------------------------------------------------------marge 1 row selected. SQL> SELECT SYS_CONTEXT('USERENV','HOST') FROM dual; SYS_CONTEXT('USERENV','HOST') -------------------------------------------------------------------marge 1 row selected. SQL> SELECT SYS_CONTEXT('USERENV','IP_ADDRESS') FROM dual; SYS_CONTEXT('USERENV','IP_ADDRESS') -------------------------------------------------------------------192.168.2.3 1 row selected. SQL> SELECT SYS_CONTEXT('USERENV','SERVER_HOST') FROM dual; SYS_CONTEXT('USERENV','SERVER_HOST') -------------------------------------------------------------------C4210gr2 1 row selected. SQL>

V$INSTANCE The HOST_NAME column of the V$INSTANCE view contains the host name of the server running the instance. SQL> SELECT host_name FROM v$instance; HOST_NAME -----------------------------------------------C4210gR2 1 row selected. SQL>

V$SESSION The V$SESSION view contains the following host information for all database sessions: •

TERMINAL - The operating system terminal name for the client. This is often set to the client machine name.



MACHINE - The operating system name for the client machine. This may include the domain name if present.

The following examples show the typical output for each column. SQL> SELECT terminal, machine FROM v$session WHERE username = 'TIM_HALL'; TERMINAL MACHINE --------------------------------------------------------------------------------MARGE ORACLE-BASE\MARGE 1 row selected.

Oracle Flashback Wrong DML: what to do when data is deleted by mistake. In Oracle the rolling back a transaction is possible before the final commit or up to a savepoint, but the database offers another chance, which is probably little known, and it is the Flashback. The essential requirement is that the database is using UNDO tablespaces and that the undo_retention is high enough. Generally speaking, rolling back the changes is only possible if the correction is made not later than the number of seconds specified by the parameter undo_retention. SQL> show parameter undo NAME TYPE ------------------------------------ ---------------------------------------undo_management string undo_retention integer undo_suppress_errors boolean undo_tablespace string In this example a table is created and populated SQL> CREATE TABLE t_a (c1 number); SQL> INSERT INTO t_a values (1); SQL> INSERT INTO t_a values (2); SQL> INSERT INTO t_a values (3); 1 row created. SQL> INSERT INTO t_a values (4); SQL> INSERT INTO t_a values (5); 1 row created.

VALUE AUTO 1800 FALSE UNDOTBS1

SQL> COMMIT; Commit complete. SQL> ALTER SESSION SET NLS_DATE_FORMAT ='hh24:mi:ss dd-mon-yyyy'; Session altered. SQL> SELECT SYSDATE FROM DUAL; SYSDATE -------------------12:53:43 31-dec-2004 SQL> DELETE t_a; 5 rows deleted SQL> COMMIT Commit completed SQL> SELECT COUNT(*) FROM t_a; COUNT(*) ---------0 The deletion was a blunder and the problem is realized 20 minutes later; since the undo_retention is 1800 seconds (30 minutes) it is probably still possible to rescue the data using the close "AS OF TIMESTAMP" and specifying a time interval in the past. SQL> SELECT SYSDATE FROM DUAL; SYSDATE -------------------13:11:20 31-dec-2004 SQL> SELECT * FROM t_a 2 AS OF TIMESTAMP (sysdate - interval '30' minute); C1 ---------1 2 3 4 5 By specifying SELECT AS OF TIMESTAMP we are still able to access the records that have been deleted, even if the table is now empty. SQL> SELECT COUNT(*) FROM t_a;

COUNT(*) ---------0 SQL> CREATE TABLE t_b AS (SELECT * FROM t_a AS OF TIMESTAMP (SYSDATE INTERVAL '30' MINUTE); Table created. SQL> SELECT COUNT(*) FROM t_b; COUNT(*) ---------5 SQL> SELECT * FROM t_b; C1 ---------1 2 3 4 5 It was therefore possible to restore the contents as they were 20 minutes before. If the table is dropped, unfortunately no rollback is possible. SQL> SQL> DROP TABLE t_a; Table dropped. SQL> CREATE TABLE t_c AS (SELECT * FROM t_a AS OF TIMESTAMP (SYSDATE INTERVAL '40' MINUTE)); CREATE TABLE t_c AS (SELECT * FROM t_a AS OF TIMESTAMP (SYSDATE INTERVAL '40' MINUTE)) * ERROR at line 1: ORA-00942: table or view does not exist The Oracle Flashback is therefore one of the ways of restoring deleted or wrongly updated records; another tool is the Oracle LogMiner, which offers a graphical interface.

Understanding checkpoints, system change number SCN and recovery

How does Oracle know that a database needs recovering? What checks does it carry out when a database is opened? This article explains a couple of common cases of Oracle recovery and the role of the system change number SCN in the recovery.

What is the SCN system change number? Every modification in the database (for example, an update) is given an unique system change number SCN, which is, therefore, an ever increasing integer. The latest SCN is immediately recorded in the controlfile. The modifications are logged in the online redo log by the log writer LGWR, but not necessarily in the datafiles. Infact, there are times when the modified data is still in memory in a block that has become "dirty", meaning that it no longer contains the same data as the block in the datafile. There are moments, though, when all modifications are written onto all the datafiles: this happens by a checkpoint, shutdown of the database, when a redo log is switched. At that point the header of the datafiles is updated with the SCN of the system, which will be the same as recorded in the controlfile. The Oracle process CKPT (the checkpoint process) is responsible for this update. When the checkpoint is completed the database reaches a "consistent" state, meaning that it is clean, there is no contraddiction between the SCN and timestamps of the various components.

The V$DATAFILE_HEADER dynamic view For the purpose of our discussion, it is important to familiarise ourselves with the dynamic view V$DATAFILE_HEADER, which contains the information that Oracle writes in the datafile headers. SQL> descr V$DATAFILE_HEADER Name Null? Type ----------------------------------------- ----------------------------------FILE# NUMBER STATUS VARCHAR2(7) ERROR VARCHAR2(18) FORMAT NUMBER RECOVER VARCHAR2(3) FUZZY VARCHAR2(3) CREATION_CHANGE# NUMBER CREATION_TIME DATE TABLESPACE_NAME VARCHAR2(30) TS# NUMBER RFILE# NUMBER RESETLOGS_CHANGE# NUMBER RESETLOGS_TIME DATE CHECKPOINT_CHANGE# NUMBER CHECKPOINT_TIME DATE CHECKPOINT_COUNT NUMBER BYTES NUMBER BLOCKS NUMBER NAME VARCHAR2(513) SQL> column CHECKPOINT_CHANGE# format 99999999999999 SQL> select FILE#, CHECKPOINT_TIME, CHECKPOINT_CHANGE# from V$DATAFILE_HEADER;

FILE# CHECKPOINT_TIME CHECKPOINT_CHANGE# ---------- -------------------- -----------------1 13:47:58 02-AUG-2004 6488359 2 13:47:58 02-AUG-2004 6488359 3 13:47:58 02-AUG-2004 6488359 4 13:47:58 02-AUG-2004 6488359 5 13:47:58 02-AUG-2004 6488359 6 13:47:58 02-AUG-2004 6488359 8 13:47:58 02-AUG-2004 6488359 9 13:47:58 02-AUG-2004 6488359 8 rows selected. The view v$database contains information that comes from the controlfile. SQL> descr v$database Name Null? Type ----------------------------------------- ----------------------------------DBID NUMBER NAME VARCHAR2(9) CREATED DATE RESETLOGS_CHANGE# NUMBER RESETLOGS_TIME DATE PRIOR_RESETLOGS_CHANGE# NUMBER PRIOR_RESETLOGS_TIME DATE LOG_MODE VARCHAR2(12) CHECKPOINT_CHANGE# NUMBER ARCHIVE_CHANGE# NUMBER CONTROLFILE_TYPE VARCHAR2(7) CONTROLFILE_CREATED DATE CONTROLFILE_SEQUENCE# NUMBER CONTROLFILE_CHANGE# NUMBER CONTROLFILE_TIME DATE OPEN_RESETLOGS VARCHAR2(11) VERSION_TIME DATE OPEN_MODE VARCHAR2(10) PROTECTION_MODE VARCHAR2(20) PROTECTION_LEVEL VARCHAR2(20) REMOTE_ARCHIVE VARCHAR2(8) ACTIVATION# NUMBER DATABASE_ROLE VARCHAR2(16) ARCHIVELOG_CHANGE# NUMBER SWITCHOVER_STATUS VARCHAR2(18) DATAGUARD_BROKER VARCHAR2(8) GUARD_STATUS VARCHAR2(7) SUPPLEMENTAL_LOG_DATA_MIN VARCHAR2(3) SUPPLEMENTAL_LOG_DATA_PK VARCHAR2(3) SUPPLEMENTAL_LOG_DATA_UI VARCHAR2(3) FORCE_LOGGING VARCHAR2(3) SQL> select CONTROLFILE_CHANGE# from v$database; CONTROLFILE_CHANGE# ------------------6488361 SQL> select CHECKPOINT_CHANGE# from v$database; CHECKPOINT_CHANGE#

-----------------6488359 We notice that the CHECKPOINT_CHANGE# in the datafile headers is identical to the CHECKPOINT_CHANGE# in the controlfile(s) because when the queries were executed, the database had just been opened..

1 case: datafile that needs recovery To understand how things work, let's make a couple of experiments. Warning: these are operation that should be avoided in real life. Their sole purpose here is to explain the concept, they are not recommended procedures. Being on UNIX, it is possible to make copies of datafiles and controlfiles when the database is open. Since we don't want to break anything, our database is in archive mode, we have already taken a good rman backup and we are therefore ready to restore if something goes wrong. These are the steps of our exercise to simulate the restore and recovery of a datafile:

1. 2. 3. 4.

A copy of one of the datafiles is taken. Normal activity continues. The database is closed. The copy taken at step 1 is put back onto its original place; this file is therefore "older" than the others (even if the timestamp might be more recent) 5. The database is open Let's have a look at our test database: 1* select file#, name from v$datafile FILE# NAME ---------------------------------------------------------------------------------------1 /u05/oradata/DEVDB/system01DEVDB.dbf 2 /u04/oradata/DEVDB/rbs01DEVDB.dbf 3 /u05/oradata/DEVDB/temp01DEVDB.dbf 4 /u04/oradata/DEVDB/tools01DEVDB.dbf 5 /u05/oradata/DEVDB/users01DEVDB.dbf 6 /u05/oradata/DEVDB/data01DEVDB.dbf 8 /u04/oradata/DEVDB/appidx01DEVDB.dbf 9 /u05/oradata/DEVDB/appdata01DEVDB.dbf 8 rows selected. We create a table t_test before copying the datafile and a table t_test2 after copying it: SQL> create table t_test(c1 number) tablespace tools; Table created. SQL> select FILE#, CHECKPOINT_TIME, CHECKPOINT_CHANGE# from V$DATAFILE_HEADER; FILE# ---------1 2 3 4 5 6 8 9

CHECKPOINT_TIME CHECKPOINT_CHANGE# -------------------- -----------------13:47:58 02-AUG-2004 6488359 13:47:58 02-AUG-2004 6488359 13:47:58 02-AUG-2004 6488359 13:47:58 02-AUG-2004 6488359 13:47:58 02-AUG-2004 6488359 13:47:58 02-AUG-2004 6488359 13:47:58 02-AUG-2004 6488359 13:47:58 02-AUG-2004 6488359

8 rows selected. SQL> insert into t_test values(1); 1 row created. SQL> / 1 row created. SQL> commit;

Commit complete. SQL> select FILE#, CHECKPOINT_TIME, CHECKPOINT_CHANGE# from V$DATAFILE_HEADER; FILE# ---------1 2 3 4 5 6 8 9

CHECKPOINT_TIME CHECKPOINT_CHANGE# -------------------- -----------------13:47:58 02-AUG-2004 6488359 13:47:58 02-AUG-2004 6488359 13:47:58 02-AUG-2004 6488359 13:47:58 02-AUG-2004 6488359 13:47:58 02-AUG-2004 6488359 13:47:58 02-AUG-2004 6488359 13:47:58 02-AUG-2004 6488359 13:47:58 02-AUG-2004 6488359

8 rows selected. The SCN hasn't changed yet. We now make the copy SQL> !cp /u04/oradata/DEVDB/tools01DEVDB.dbf /tmp SQL> alter system switch logfile; System altered. SQL> select FILE#, CHECKPOINT_TIME, CHECKPOINT_CHANGE# from V$DATAFILE_HEADER; FILE# ---------1 2 3 4 5 6 8 9

CHECKPOINT_TIME CHECKPOINT_CHANGE# -------------------- -----------------13:57:20 02-AUG-2004 6488566 13:57:20 02-AUG-2004 6488566 13:57:20 02-AUG-2004 6488566 13:57:20 02-AUG-2004 6488566 13:57:20 02-AUG-2004 6488566 13:57:20 02-AUG-2004 6488566 13:57:20 02-AUG-2004 6488566 13:57:20 02-AUG-2004 6488566

8 rows selected. SQL> shutdown immediate Database closed. Database dismounted. ORACLE instance shut down. SQL> exit At this point as a safety measure we take a copy /u04/oradata/DEVDB/tools01DEVDB.dbf.save of the datafile 4 as it is now and copy back from /tmp: oracle-localora@# cp /u04/oradata/DEVDB/tools01DEVDB.dbf /u04/oradata/DEVDB/tools01DEVDB.dbf.save oracle-localora@# cp /tmp/tools01DEVDB.dbf /u04/oradata/DEVDB/tools01DEVDB.dbf cp: overwrite /u04/oradata/DEVDB/tools01DEVDB.dbf (yes/no)? y

The database can now be opened: let's see what happens: SQL> startup ORACLE instance started. Total System Global Area 189409824 bytes Fixed Size 731680 bytes Variable Size 104857600 bytes Database Buffers 81920000 bytes Redo Buffers 1900544 bytes Database mounted. ORA-01113: file 4 needs media recovery ORA-01110: data file 4: '/u04/oradata/DEVDB/tools01DEVDB.dbf' So, why does Oracle (rightly) complain about file 4? Because the SCN in its header is different from the others: SQL> select FILE#, CHECKPOINT_TIME, CHECKPOINT_CHANGE# from V$DATAFILE_HEADER; FILE# CHECKPOINT_TIME CHECKPOINT_CHANGE# ---------- -------------------- -----------------1 13:57:57 02-AUG-2004 6488582 2 13:57:57 02-AUG-2004 6488582 3 13:57:57 02-AUG-2004 6488582 4 13:47:58 02-AUG-2004 6488359 5 13:57:57 02-AUG-2004 6488582 6 13:57:57 02-AUG-2004 6488582 8 13:57:57 02-AUG-2004 6488582 9 13:57:57 02-AUG-2004 6488582 The datafile 4 has a "delay", since its SCN number is smaller than the one in the other datafiles and controlfile. We can also find out which archived (or online) redo logs will be necessary for the recovery: they are the ones (or one) whose FIRST_CHANGE# is less or equal 6488359 AND next_change is greater SQL> select NAME, FIRST_CHANGE# from v$archived_log where FIRST_CHANGE# >= 6488359; NAME FIRST_CHANGE# ------------------------------------------------------------------------------/opt/oracle/admin/DEVDB/arch/1_60.dbf 6488563 SQL> RECOVER DATABASE; ORA-00279: change 6488359 generated at 08/02/2004 13:47:58 needed for thread 1 ORA-00289: suggestion : /opt/oracle/admin/DEVDB/arch/1_60.dbf ORA-00280: change 6488359 for thread 1 is in sequence #60 Specify log: {=suggested | filename | AUTO | CANCEL} return] Log applied. Media recovery complete. SQL>ALTER DATABASE OPEN The database is back to normal.

[we press

2. When is command "recover database using backup controlfile" required? Let's take a copy of the controlfile before any DDL activity, such as creating a new table and populating it. This will simulate the restore of a controlfile: SQL> !cp /u01/oradata/DEVDB/ctl1DEVDB.dbf /tmp SQL> create table t_test2(c1 number) tablespace tools; Table created. SQL> insert into t_test2 1 row created.

values(10);

SQL> insert into t_test2 1 row created.

values(11);

SQL> insert into t_test2 1 row created.

values(12);

SQL> commit; Commit complete. We now shutdown the database; this will cause a checkpoint to happen, meaning that all datafiles headers will be updated with the SCN contained in the controlfile. SQL> shutdown immediate Database closed. Database dismounted. ORACLE instance shut down. SQL> exit Before any dangerous experiment, the wise DBA will always takes a copy of the controlfile(s) oracle-localora@# cp /u01/oradata/DEVDB/ctl1DEVDB.dbf /u01/oradata/DEVDB/ctl1DEVDB.dbf.save oracle-localora@# cp /u02/oradata/DEVDB/ctl2DEVDB.dbf /u02/oradata/DEVDB/ctl2DEVDB.dbf.save oracle-localora@# cp /u03/oradata/DEVDB/ctl3DEVDB.dbf /u03/oradata/DEVDB/ctl3DEVDB.dbf.save Let's now overwrite the controlfiles with the copy we created on the /tmp directory: oracle-localora@# cp /tmp/ctl1DEVDB.dbf /u01/oradata/DEVDB/ctl1DEVDB.dbf cp: overwrite /u01/oradata/DEVDB/ctl1DEVDB.dbf (yes/no)? y oracle-localora@# cp /tmp/ctl1DEVDB.dbf /u02/oradata/DEVDB/ctl2DEVDB.dbf cp: overwrite /u02/oradata/DEVDB/ctl2DEVDB.dbf (yes/no)? y oracle-localora@# cp /tmp/ctl1DEVDB.dbf /u03/oradata/DEVDB/ctl3DEVDB.dbf cp: overwrite /u03/oradata/DEVDB/ctl3DEVDB.dbf (yes/no)? y oracle-localora@# Let's try to open the database. SQL> startup ORACLE instance started. Total System Global Area 189409824 bytes Fixed Size 731680 bytes Variable Size 104857600 bytes Database Buffers 81920000 bytes Redo Buffers 1900544 bytes Database mounted. ORA-01122: database file 1 failed verification check

ORA-01110: data file 1: '/u05/oradata/DEVDB/system01DEVDB.dbf' ORA-01207: file is more recent than controlfile - old controlfile SQL> select FILE#, CHECKPOINT_TIME, CHECKPOINT_CHANGE# from V$DATAFILE_HEADER; FILE# ---------1 2 3 4 5 6 8 9

CHECKPOINT_TIME CHECKPOINT_CHANGE# -------------------- -----------------14:20:22 02-AUG-2004 6488634 14:20:22 02-AUG-2004 6488634 14:20:22 02-AUG-2004 6488634 14:20:22 02-AUG-2004 6488634 14:20:22 02-AUG-2004 6488634 14:20:22 02-AUG-2004 6488634 14:20:22 02-AUG-2004 6488634 14:20:22 02-AUG-2004 6488634

8 rows selected. SQL> select CHECKPOINT_CHANGE# from v$database; CHECKPOINT_CHANGE# -----------------6488582 Oracle complains saying the the controlfile is an old copy judging from the SCN; during other kinds of recovery you will be prompted that the command "recover database using backup controlfile" is needed essentially for the same reason. Let's see what happens if we try a normal recovery: SQL> recover database; ORA-00283: recovery session canceled due to errors ORA-01122: database file 1 failed verification check ORA-01110: data file 1: '/u05/oradata/DEVDB/system01DEVDB.dbf' ORA-01207: file is more recent than controlfile - old controlfile Instead we use: SQL> recover database using backup controlfile; ORA-00279: change 6488582 generated at 08/02/2004 13:57:20 needed for thread 1 ORA-00289: suggestion : /opt/oracle/admin/DEVDB/arch/1_61.dbf ORA-00280: change 6488582 for thread 1 is in sequence #61 Specify log: {=suggested | filename | AUTO | CANCEL} ORA-00308: cannot open archived log '/opt/oracle/admin/DEVDB/arch/1_61.dbf' ORA-27037: unable to obtain file status SVR4 Error: 2: No such file or directory Additional information: 3 This is an interesting case because the redo log needed for the recovery hasn't been archived yet and it is one of the online redo logs. We can see it because the latest archived log is sequence 60, but Oracle is looking for the sequence 61. SQL> !ls -ltr /opt/oracle/admin/DEVDB/arch total 10298 -rw-r----1 oracle dba 18944 Aug 2 13:57 1_59.dbf

-rw-r----1 oracle dba SQL> select member from v$logfile;

1536 Aug

2 13:57 1_60.dbf

MEMBER ------------------------------------------------------------------------------/u03/oradata/DEVDB/redog2m1DEVDB.dbf /u04/oradata/DEVDB/redog2m2DEVDB.dbf /u01/oradata/DEVDB/redog1m1DEVDB.dbf /u02/oradata/DEVDB/redog1m2DEVDB.dbf SQL> recover database using backup controlfile; ORA-00279: change 6488582 generated at 08/02/2004 13:57:20 needed for thread 1 ORA-00289: suggestion : /opt/oracle/admin/DEVDB/arch/1_61.dbf ORA-00280: change 6488582 for thread 1 is in sequence #61 Specify log: {=suggested | filename | AUTO | CANCEL} [we type the name of the online redo log] /u03/oradata/DEVDB/redog2m1DEVDB.dbf Log applied. Media recovery complete. We have now almost finished; the final remark is that when recovering with a backup controlfile, the database must be opened with the clause "resetlogs" SQL> alter database open; alter database open * ERROR at line 1: ORA-01589: must use RESETLOGS or NORESETLOGS option for database open SQL> alter database open noresetlogs; alter database open noresetlogs * ERROR at line 1: ORA-01588: must use RESETLOGS option for database open SQL> alter database open resetlogs; Database altered. SQL> select * from t_test2; C1 ---------10 11 12 SQL> select * from t_test; C1 ---------1 1 The contents of the database are as expected.

Error message: port already in use Sometimes it happens that an installation fails with the error message "port already in use" (this can be the case with the Oracle snmpx daemon on UNIX systems). If you don't know which TCP port is generating the error, the following UNIX command lists those currently being used. oracle@:PROD1 >netstat -an | grep LISTEN | sort -k 1 *.111 *.* 0 0 24576 LISTEN *.13 *.* 0 0 24576 LISTEN *.161 *.* 0 0 24576 LISTEN *.19 *.* 0 0 24576 LISTEN *.2049 *.* 0 0 24576 LISTEN *.21 *.* 0 0 24576 LISTEN *.23 *.* 0 0 24576 LISTEN *.2410 *.* 0 0 24576 LISTEN ... ... *.898 *.* 0 0 24576 LISTEN *.9 *.* 0 0 24576 LISTEN *.9991 *.* 0 0 24576 LISTEN 179.146.111.76.1521 *.* 0 0 24576 LISTEN

0 0 0 0 0 0 0 0 0 0 0 0

This list shows the port 21 (ftp), 23 (telnet), 1521 (an Oracle listener) etc. If you suspect of a particular TCP port (for example 161) go for it. netstat -an | grep 161 *.161

*.*

LISTEN

0

0 24576

0

Find the process that is using the TCP port It is now necessary to find out which process is actually using the TCP port. On the Internet an open source tool called lsof is available, but if you don't feel like installing another tool, the following short script can do the work. The script goes through the process tables and finds out which one is already using a certain port. You need root privileges to run this script. for PROC in /proc/*

do

echo $PROC pfiles -F $PROC | grep port |grep 161 done A sample output is .. /proc/76 /proc/777 sockname: sockname: sockname: sockname: sockname: /proc/779 /proc/784 ...

AF_INET AF_INET AF_INET AF_INET AF_INET

0.0.0.0 port: 161 179.146.111.98 port: 179.146.111.101 port: 179.146.111.102 port: 179.146.111.100 port:

161 161 161 161

We have now got the process ID 777 and it is easy to determine what it is doing : app@serverdaddy[on pts/1] ps -ef | grep 777 root 777 717 0 Dec 05 ? 11:28 /opt/buw/bin/snmpd -f -x -s g_mon -l Our investigation is therefore completed: the TCP port 161 is being used by the process /opt/buw/bin/snmpd.

A real case: Could not start agent. Initialization failure Sometimes it happens, especially after an updrade, to come across the error: Failed while initializing Collection Service Error initializing subsystems Agent exited at Thu Jan 27 11:00:54 MET 2005 with return value 55 Could not start agent. Initialization failure oracle@zhru09 [oracle] # pwd /opt/oracle/9.2.0.5/network/log oracle@dbserver [oracle] # ls -ltr total 6 -rw-rw---1 oracle dba -rw-r----1 oracle dba -rw-rw---1 oracle dba The contents of dbsnmp.nohup are:

278 Jan 27 11:00 nmiconf.log 267 Jan 27 11:00 dbsnmp.nohup 585 Jan 27 11:00 dbsnmp.log

oracle@dbserver [oracle] # cat dbsnmp.nohup ------------------------ Thu Jan 27 11:00:49 MET 2005 -----------------------Failed while initializing Collection Service Error initializing subsystems Agent exited at Thu Jan 27 11:00:54 MET 2005 with return value 55 Could not start agent. Initialization failure The contents of dbsnmp.log are:

oracle@dbserver [oracle] # cat dbsnmp.log DBSNMP for Solaris: Version 9.2.0.5.0 - Production on 27-JAN-2005 11:00:52 Copyright (c) 2003 Oracle Corporation.

All rights reserved.

System parameter file is /var/opt/oracle/snmp_ro.ora Log messages written to /opt/oracle/9.2.0.5/network/log/dbsnmp.log Trace information written to /opt/oracle/9.2.0.5/network/trace/dbsnmp_6038.trc Trace level is currently 0 NMS-00001: Warning: dbsnmp unable to connect to SNMP master agent 11:00:54 27/01/2005 ODG-05019: Error: failed to bind socket 11:00:54 27/01/2005 ODG-05083: Error: failed to allocate listening port/socket The explanation in the Oracle Metalink for the error ODG-05083 is: Error: ODG-5083 Text: Error: failed to allocate listening port/socket -------------------------------------------------------------------------Cause: an error occurred while attempting to open the Data Gatherer socket (1808). Action: ensure that no other process is using this socket and retry the operation Therefore, withe script above, let's check whether some other process is already using the port 1808: sysuser@dbserver[on pts/5] netstat -a |egrep -i 1808 dbserver.1808 *.* 0 LISTEN for PROC in /proc/* do echo $PROC pfiles -F $PROC | grep port |grep 1808 done ... /proc/1162 /proc/1165 /proc/3687 /proc/3689 /proc/3691 /proc/4429 sockname: AF_INET 169.166.228.84 /proc/5632 ... ...

port: 1808

0 24576

0

root@dbserver[on pts/7]# ps -aef|grep -i 4429 root 7961 6922 0 11:36:15 pts/7 0:00 grep -i 4429 oracle 4429 1 0 Dec 09 ? 43:56 /u00/app/oracle/product/8.1.7/bin/vppdc root@dbserver[on pts/7]# kill -3 4429 root@dbserver[on pts/7]# kill -5 4429 root@dbserver[on pts/7]# kill -9 4429 The problem, therefore, was that the data gatherer of version 8.1.7 was still running. The reason why sometimes we forget this process is that the agent in 9i is also the data gatherer. Again, this is a real case.

What is a cluster? This article gives some information about intalling and using the Oracle Real Application Cluster RAC. A definition of a cluster is : "A group of computers linked to provide fast and reliable service. Cluster technologies have evolved over the past 15 years to provide servers that are both highly available and scalable. Clustering is one of several approaches that exploits parallel processing — the use of multiple subsystems as building blocks". Clusters have existed for quite a long time; the first solution was offered by DEC for its VMS operating system. Clusters on UNIX are more recent. Clustering is therefore a solution for having different computers share common devices and jointly perform tasks; they exist with or without databases. Several companies offer clustering software: Veritas, Sun, IBM, Microsoft, MySql etc. For example, the version for Windows NT can be downloaded from real application cluster for NT/2000 This means that the set of commands for managing the cluster itself depends on the specific product.

Shared storage A cluster will share some storage. There will be some software in place to manage this storage, which could be the Veritas Volume Manager, or Sun Soltice, or UNIX itself if you are using raw devices. Even if it is likely that the clustering software and the storage manager will be provided by the same vendor, it is not necessarily so.

What is an Oracle Real Application Cluster? The Oracle RAC is a solution offered by Oracle Corporation in order to build a shared database using the clustering technology; RAC and clustering are not the same thing: RAC comes on top of the clustering software, even if the two softwares are tightly interconnected.

Real Application Clusters requires that all each instance be able to access a set of devices on a shared disk subsystem. Veritas Storage Foundation (VSF) for RAC supports an Oracle-certified Cluster File System (CFS), allowing to store the files that Real Application Clusters requires directly on the cluster file system. A raw disk subsystem can be used instead if required. Veritas states that the I/O performance of the VSF matches that of the raw devices. The CFS allows the use of normal Oracle commands such as "alter datafile resize 500M". With raw devices this is not possible, because the system manager must create the device with a certain size using the Volume Manager. Analogously, the archived redo logs can be created on the CFS. Oracle Cluster can be installed also on a single node, but the cluster management must be installed in advance and running. The tool on WindowsNT/2000 and Linux is called oracm and is shipped with Oracle.

How many copies of Oracle binaries in a RAC? The Oracle binaries can be installed once on the shared storage or on each server (for example, on the system disk). Both ways are possible and your decision should be consider few factors: easy of use and possible downtime. • • •

One copy of the binaries introduces a single point of failure. One copy of the binaries is more manageable; for example, a patch has to be applied only once and not on each server. If you keep only one copy, than you have a certain outage if you have to apply a patch, because all the instances will be to be brought down. On the contrary, with binaries on each node only the instance on that particular node will have to be stopped.

Defining the instances on the two nodes of the Oracle cluster

If the cluster software is running, the installation with the Oracle Universal Installer is very simple. The installer will detect the two nodes and will ask on which node the software should be copied; if you are using a shared device for the $ORACLE_HOME the node doesn't matter. When the installation is completed, you can try to start the instance on one of the two nodes without mounting the database, which of course hasn't yet been created. The tool is the usual sqlplus, even if the new tool svrctl can be used.

RAC configuration files How many copies of init.ora should you keep? Where to define TNS_ADMIN? The new cluster systax introduced a "dot" notation in the init.ora file to specify parameters for different instances. This allows to keep only one copy of init.ora for all instances. To fix the ideas, let's say that the cluster nodes are called sercluster1 and sercluster2, the database name is EUROPE and that the instances names are EU1, EU2 etc. It the shared mountpoint is /orasoft, the ORACLE_BASE=/orasof/oracle, ORACLE_HOME=$ORACLE_BASE/9.2.0 One possible way to proceed is creating a tree $ORACLE_BASE/admin/EUROPE/... on the shared device. In particular, we will define an initEUROPE.ora on $ORACLE_BASE/admin/EUROPE/pfile The soft links must now be created: ln -s $ORACLE_BASE/admin/EUROPE/pfile/initEUROPE.ora $ORACLE_HOME/dbs/initEU1.ora ln -s $ORACLE_BASE/admin/EUROPE/pfile/initEUROPE.ora $ORACLE_HOME/dbs/initEU2.ora To set correctly the environment on the two nodes, a possible script would be #!ksh export ORACLE_BASE=/orasoft/oracle export ORACLE_HOME=$ORACLE_BASE/9.2.0 myHost=`hostname` if [ $myHost = "sercluster1" ]; then ORACLE_SID=EU1 else ORACLE_SID=EU2 fi export ORACLE_SID ... ... ... The immediate question is: how can you use the same init.ora if the instance names are different? The answer is that in a RAC environmnet it is now possible to specify a parameter with a "dot" notation. For example: cluster_database=true cluster_database_instances=2 # EU1.instance_name=EU1 EU1.instance_number=1 EU1.thread=1 EU1.undo_tablespace=UNDOEU1 EU2.instance_name=EU2 EU2.instance_number=2 EU2.thread=2 EU2.undo_tablespace=UNDOEU2

undo_management=auto

Creating the RAC database At this stage $ORACLE_HOME and $ORACLE_SID must be defined in the environment and the init.ora file must be found on $ORACLE_HOME/dbs. As we said before, on UNIX (Linux) the init.ora on $ORACLE_HOME/dbs is typically a soft link to $ORACLE_BASE/admin/$ORACLE_SID/pfile/init${ORACLE_SID}.ora For this operation, the parameter cluster_database should be set to FALSE; in fact, the first part is not different from creating a database with a single instance and this setting avoids some unnecessary complexity at this stage. cluster_database = FALSE The database can now be created using normal scripts. sqlplus '/ as sysdba' sql> startup nomount spool /orasoft/oracle/admin/EUROPE/create/logs/crDB_EUROPE_1.log startup nomount pfile ='/orasoft/oracle/admin/EUROPE/pfile/initEUROPE.ora' CREATE DATABASE EUROPE controlfile REUSE maxinstances 4 maxlogfiles 64 maxlogmembers 4 maxdatafiles 1000 maxloghistory 1000 character set AL32UTF8 national character set AL16UTF16 DATAFILE '/u00/oradata/EUROPE/EUROPE_system01.dbf' SIZE 1000M REUSE EXTENT MANAGEMENT LOCAL UNDO TABLESPACE UNDOEU1 DATAFILE '/u09/oradata/EUROPE/EUROPE_undo_T1_01.dbf' SIZE 5000M REUSE DEFAULT TEMPORARY TABLESPACE TEMP TEMPFILE '/u08/oradata/EUROPE/EUROPE_temp01.dbf' SIZE 5000M REUSE EXTENT MANAGEMENT LOCAL UNIFORM SIZE 50M LOGFILE GROUP 1 ('/u05/oradata/EUROPE/EUROPE_redo1_1T1.dbf', '/u06/oradata/EUROPE/EUROPE_redo1_2T1.dbf') SIZE 100M REUSE, GROUP 2 ('/u06/oradata/EUROPE/EUROPE_redo2_1T1.dbf', REUSE,

'/u07/oradata/EUROPE/EUROPE_redo2_2T1.dbf') SIZE 100M GROUP 3 ('/u07/oradata/EUROPE/EUROPE_redo3_1T1.dbf',

REUSE,

'/u05/oradata/EUROPE/EUROPE_redo3_2T1.dbf') SIZE 100M GROUP 4 ('/u05/oradata/EUROPE/EUROPE_redo4_1T1.dbf',

REUSE;

'/u06/oradata/EUROPE/EUROPE_redo4_2T1.dbf') SIZE 100M

spool off The second instance can start, but cannot mount nor open the database. Before this is possible, the redo logs and the UNDO tablespace for the second instance must be created ALTER DATABASE ADD LOGFILE THREAD 2 GROUP 5 ('/u04/oradata/EUROPE/EUROPE_redo5_1T2.dbf', '/u05/oradata/EUROPE/EUROPE_redo5_2T2.dbf') SIZE 100M; ALTER DATABASE ADD LOGFILE THREAD 2 GROUP 6 ('/u05/oradata/EUROPE/EUROPE_redo6_1T2.dbf', '/u06/oradata/EUROPE/EUROPE_redo6_2T2.dbf') SIZE 100M; ALTER DATABASE ADD LOGFILE THREAD 2 GROUP 7 ('/u06/oradata/EUROPE/EUROPE_redo7_1T2.dbf', '/u07/oradata/EUROPE/EUROPE_redo7_2T2.dbf') SIZE 100M; ALTER DATABASE ADD LOGFILE THREAD 2 GROUP 8 ('/u07/oradata/EUROPE/EUROPE_redo8_1T2.dbf', '/u08/oradata/EUROPE/EUROPE_redo8_2T2.dbf') SIZE 100M; ALTER DATABASE ENABLE THREAD 2; CREATE UNDO TABLESPACE UNDOEU2 DATAFILE '/u07/oradata/EUROPE/EUROPE_undo_T2_01.dbf' SIZE 5000M REUSE; Shut down the two instances Now set cluster_database = TRUE # LISTENER.ORA Network Configuration File: /optware/oracle/network/admin/listener.ora # Generated by Oracle configuration tools. LISTENER_EU1 = (DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)(HOST = sercluster1)(PORT = 2002)) ) SID_LIST_LISTENER_EU1 = (SID_LIST = (SID_DESC = (ORACLE_HOME = /optware/oracle/9.2.0) (SID_NAME = EU1) ) ) LISTENER_EU2 = (DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)(HOST = sercluster2)(PORT = 2002)) )

SID_LIST_LISTENER_EU2 = (SID_LIST = (SID_DESC = (ORACLE_HOME = /orasoft/oracle/9.2.0) (SID_NAME = EU2) ) ) # TNSNAMES.ORA Network Configuration File: /orasoft/oracle/network/admin/tnsnames.ora # Generated by Oracle configuration tools. EUROPE = (DESCRIPTION = (ADDRESS_LIST = (ADDRESS = (PROTOCOL = TCP)(HOST = sercluster1)(PORT = 2002)) (ADDRESS = (PROTOCOL = TCP)(HOST = sercluster1)(PORT = 2002)) (LOAD_BALANCE = yes) ) (CONNECT_DATA = (SERVICE_NAME = EUROPE) ) )

View more...

Comments

Copyright ©2017 KUPDF Inc.
SUPPORT KUPDF