Backup Admin: February 2014

Thursday, February 27, 2014

Cannot get information on file '/var/opt/omni/tmp/rcvcat.exp' Error: 2

[Major] From: OB2BAR_DMA@dbsrv01.in.com "DB001" Time: 2/23/2014 3:05:36 PM
Cannot get information on file '/var/opt/omni/tmp/rcvcat.exp' Error: 2.

[Major] From: ob2rman@dbsrv01.in.com "DB001" Time: 02/23/2014 03:05:44 PM
Backup of recovery catalog failed.

Solution:

>> This error occurs on Linux with a clustered Oracle database. Patch DPLNX_00094 fixes it.

Alternatively, try the steps below:

>> In the integration-specific options, enable 'Disable recovery catalog auto backup'. This skips the recovery catalog backup from DP. Rerun the backup. If you still see the same errors, move on to the next step.

>> The backup script in the barlist always takes precedence over the DP GUI options. Open the spec in the vi editor, move to the end of the barlist, and add the value "-skip RCVCAT" to the script.

Example:
CLIENT "DB001" dbsrv01.in.com
{
-exec ob2rman.exe
-args {
"-skip RCVCAT" //Skips the recovery catalog backup
"-pre /home/user/DB001_weekly_full.rman" //Pre exec scripts if any
"-backup"
}
-input {
"run {"
"allocate channel 'dev_0' type 'sbt_tape'" //channel allocation
" parms 'ENV=(OB2BARTYPE=Oracle8,OB2APPNAME=DB001,OB2BARLIST=dbsrv01_DB001_ONLINE)';"
"backup"
" format 'dbsrv01_DB001_ONLINE<DB001_%s:%t:%p>.dbf'"
" current controlfile;"
"}"
}
-profile
} -protect days 28 //data protection in days
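
Because the barlist wins over the GUI setting, it can help to confirm that the spec file really carries the skip option before rerunning the backup. The following is a minimal Python sketch, assuming the barlist layout shown in the example above; the function names are hypothetical and not part of Data Protector.

```python
import re


def barlist_args(barlist_text):
    """Return the quoted strings inside the first -args { ... } block."""
    block = re.search(r"-args\s*\{(.*?)\}", barlist_text, re.DOTALL)
    if block is None:
        return []
    return re.findall(r'"([^"]*)"', block.group(1))


def skips_rcvcat(barlist_text):
    """True if the -args block contains the "-skip RCVCAT" option."""
    return any(arg.split() == ["-skip", "RCVCAT"]
               for arg in barlist_args(barlist_text))
```

Read the spec file into a string and pass it to skips_rcvcat(); a False result means the recovery catalog backup will still be attempted.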

Wednesday, February 26, 2014

[12:8301] Oracle Server home directory not found.

[Normal] From: BSM@cellserver01.in.com "Oracle_oradb01_Spec01" Time: 2/13/2014 7:00:06 AM

Backup session 2014/02/13-5 started.

[Normal] From: BSM@cellserver01.in.com "Oracle_oradb01_Spec01" Time: 2/13/2014 7:00:12 AM

OB2BAR application on "oradbsrv01.in.com" successfully started.

[Major] From: ob2rman@oradbsrv01.in.com "oradb01" Time: 02/13/14 07:00:20

[12:8301] Oracle Server home directory not found.

[Normal] From: BSM@cellserver01.in.com "Oracle_oradb01_Spec01" Time: 2/13/2014 7:00:21 AM

OB2BAR application on "oradbsrv01.in.com " disconnected.

[Normal] From: BSM@cellserver01.in.com "Oracle_oradb01_Spec01" Time: 2/13/2014 7:00:21 AM

Error message Interpretation:

From the error message "Oracle Server home directory not found." we can infer that the Oracle home directory was moved to another location. This had to be confirmed with the DBAs.

>> Yes, the database had been moved to another node, and the integration was reconfigured to back up the migrated database.

>> The old backup spec was disabled and can be removed after the last version of the DB backup expires.

ORA-19511: Error received from media manager layer, error text: Vendor specific error: OB2_StartObjectBackup() failed ERR(-17)

RMAN-03090: Starting backup at 22-FEB-14

RMAN-08008: channel dev_0: starting incremental level 0 datafile backupset

RMAN-08010: channel dev_0: specifying datafile(s) in backupset

RMAN-08522: input datafile fno=00007 name=/var/opt/vgdb/u01/oracle/DB007/DB007_DB7_datas01.dbf

RMAN-08522: input datafile fno=000 name=/var/opt/vgdb/u01/oracle/DB007/DB007_DB7_indexs01.dbf

RMAN-08522: input datafile fno=00001 name=/var/opt/vgdb07/u01/oracle/DB007/DB007_system01.dbf

RMAN-08522: input datafile fno=00002 name=/var/opt/vgdb07/u01/oracle/DB007/DB007_undotbs101.dbf

RMAN-08522: input datafile fno=00003 name=/var/opt/vgdb06/u01/oracle/DB007/DB007_data01s01.dbf

RMAN-08522: input datafile fno=00004 name=/var/opt/vgdb07/u01/oracle/DB007/DB007_indx01s01.dbf

RMAN-08522: input datafile fno=00005 name=/var/opt/vgdb07/u01/oracle/DB007/DB007_tools01.dbf

RMAN-08522: input datafile fno=00006 name=/var/opt/vgdb06/u01/oracle/DB007/DB007_users01.dbf

RMAN-08038: channel dev_0: starting piece 1 at 22-FEB-14

RMAN-08031: released channel: dev_0

RMAN-00571: ===========================================================

RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============

RMAN-00571: ===========================================================

RMAN-03009: failure of backup command on dev_0 channel at 02/22/2014 21:36:50

ORA-19506: failed to create sequential file, name="Oracle8_Spec01_DB007_ONLINE< DB00701_677:840231409:1>.dbf", parms=""

ORA-27028: skgfqcre: sbtbackup returned error

ORA-19511: Error received from media manager layer, error text:

Vendor specific error: OB2_StartObjectBackup() failed ERR(-17)

Recovery Manager complete.

[Major] From: ob2rman@oraserver01 "DB00701" Time: 02/22/14 21:36:50

External utility reported error.

RMAN PID=7092

[Major] From: ob2rman@oraserver01 "DB00701" Time: 02/22/14 21:36:50

The database reported error while performing requested operation.

Solution:

>> The backup was writing to a file library that had run out of free space.

>> Freed up space and reran the backup, which completed successfully.
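
To catch this before the backup starts, the free space on the file library's filesystem can be checked up front. The sketch below uses only the Python standard library; the directory and the size threshold are placeholder assumptions to be replaced with the real file library path and the expected backup size.

```python
import shutil


def has_free_space(path, required_bytes):
    """True if the filesystem holding `path` has at least required_bytes free."""
    return shutil.disk_usage(path).free >= required_bytes


if __name__ == "__main__":
    # Hypothetical values: point library_dir at the file library directory
    # and set `needed` to the expected size of the backup.
    library_dir = "."
    needed = 50 * 1024 ** 3  # 50 GiB
    if has_free_space(library_dir, needed):
        print("enough space")
    else:
        print("free up space first")
```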

Sunday, February 23, 2014

HP DP - IDB Maintenance

DataProtector IDB Maintenance

Step by Step procedure for IDB Maintenance

White font -> Steps
Green font -> Commands used
Blue font -> Command output

Check for any running sessions and abort if required

Ensure no backup is running and take the IDB backup using DP.

Disable the backup schedules (/sbin/init.d/cron stop, omnitrig -stop)

Shut down DP with the following command:

/opt/omni/sbin/omnisv stop

Check the omni service (omnisv status)

Close the DP GUI and abort all DBSM sessions.

Purge the DB with the commands below.

a. omnidb -strip

b. omnidbutil -purge_failed_copies

c. omnidbutil -purge -filenames -force

d. omnidbutil -purge -dcbf

(If the purge has not finished but you need to run backups, stop the purge with omnidbutil -purge_stop and use DP as normal.)

Check for a mount point with sufficient space, usually /var/adm/crash.

Create two directories MMDB and CDB in /var/adm/crash

mkdir /var/adm/crash/MMDB /var/adm/crash/CDB

Start DP with the following command:

/opt/omni/sbin/omnisv start

Create a copy of the existing sessions, media, position information, etc. with the following command:

/omni/db40 # /opt/omni/sbin/omnidbutil -writedb -mmdb /var/adm/crash/MMDB -cdb /var/adm/crash/CDB -no_detail

If the above command completes without errors, the output looks like this:

/omni/db40 # /opt/omni/sbin/omnidbutil -writedb -mmdb /var/adm/crash/MMDB -cdb /var/adm/crash/CDB -no_detail

10/07/05 12:44:18 Exporting libraries ...

10/07/05 12:44:18 Exporting pools ...

10/07/05 12:44:18 Exporting devices ...

10/07/05 12:44:18 Exporting cartridges ...

10/07/05 12:44:18 Exporting compounds ...

10/07/05 12:44:18 Exporting media ...

10/07/05 12:44:18 Exporting sessions ...

10/07/05 12:44:19 Exporting objects and object versions ...

10/07/05 12:44:23 Exporting positions ...

Please make a copy of following Internal Database directories and

then press ENTER to return Internal Database to normal state:

"/var/opt/omni/db40/msg"

DONE!

After writedb completes successfully, move the /var/opt/omni/db40/dcbf* folders to /var/opt/omni/db40/dcbf*.old. Note that there can be more than one folder, e.g. dcbf, dcbf1, dcbf2, etc.

Compress the contents of the moved folders. This is done only to free up space; these folders (dcbf*.old) will be removed later, once the maintenance is successful.

Restart the DP services using the commands below:

/opt/omni/sbin/omnisv stop

/opt/omni/sbin/omnisv start

Perform the DB read (reads the exported files and writes them into the database) using omnidbutil.

/omni/db40 # /opt/omni/sbin/omnidbutil -readdb -mmdb /var/adm/crash/MMDB -cdb /var/adm/crash/CDB -no_detail

The above command outputs messages similar to the following:

/omni/db40 # /opt/omni/sbin/omnidbutil -readdb -mmdb /var/adm/crash/MMDB -cdb /var/adm/crash/CDB -no_detail

Database import will overwrite old database. All data will be lost!

Are you sure (y/n)?y

10/07/05 12:17:53 Importing libraries ...

10/07/05 12:17:53 Importing pools ...

10/07/05 12:17:53 Importing devices ...

10/07/05 12:17:53 Importing compounds ...

10/07/05 12:17:53 Importing cartridges ...

10/07/05 12:17:53 Importing media ...

10/07/05 12:17:53 Importing sessions ...

10/07/05 12:17:53 Importing objects and object versions ...

10/07/05 12:25:27 Importing positions ...

DONE!

Run the following command: omnidbutil -fixmpos

Once the above command completes successfully, connect to DP through the GUI and check the last run sessions.

Enable the backup schedule (/sbin/init.d/cron start, omnitrig -start).

If everything looks fine, you can remove the dcbf*.old directories from /var/opt/omni/db40.

Note that once the DB maintenance is complete, all directory/file details are removed from the database. From then on, you need to import the media to browse files for any restore operation.
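
The procedure has many steps, and their order matters, so it can be handy to print it as a checklist before a maintenance window. The following Python sketch only prints the commands used in this post in the order they appear; it does not run anything. The function names are hypothetical, and the export directory is a parameter because /var/adm/crash is only the usual choice.

```python
def idb_maintenance_plan(export_dir="/var/adm/crash"):
    """Return (description, command) pairs in the order used in the post.

    A command of None marks a manual step.
    """
    mmdb = f"{export_dir}/MMDB"
    cdb = f"{export_dir}/CDB"
    omnisv = "/opt/omni/sbin/omnisv"
    omnidbutil = "/opt/omni/sbin/omnidbutil"
    return [
        ("Disable the schedules", "/sbin/init.d/cron stop"),
        ("Disable the schedules", "omnitrig -stop"),
        ("Shut down DP", f"{omnisv} stop"),
        ("Check the services", "omnisv status"),
        ("Purge", "omnidb -strip"),
        ("Purge", "omnidbutil -purge_failed_copies"),
        ("Purge", "omnidbutil -purge -filenames -force"),
        ("Purge", "omnidbutil -purge -dcbf"),
        ("Create the export directories", f"mkdir {mmdb} {cdb}"),
        ("Start DP", f"{omnisv} start"),
        ("Export the IDB",
         f"{omnidbutil} -writedb -mmdb {mmdb} -cdb {cdb} -no_detail"),
        ("Move dcbf* to dcbf*.old and compress", None),
        ("Restart DP", f"{omnisv} stop"),
        ("Restart DP", f"{omnisv} start"),
        ("Import the IDB",
         f"{omnidbutil} -readdb -mmdb {mmdb} -cdb {cdb} -no_detail"),
        ("Run fixmpos", "omnidbutil -fixmpos"),
        ("Enable the schedules", "/sbin/init.d/cron start"),
        ("Enable the schedules", "omnitrig -start"),
    ]


def print_checklist(plan):
    """Print the plan as a numbered checklist."""
    for number, (description, command) in enumerate(plan, start=1):
        print(f"{number:2d}. {description}: {command or '(manual step)'}")


if __name__ == "__main__":
    print_checklist(idb_maintenance_plan())
```

The writedb and readdb steps prompt for input (ENTER and y/n), which is one reason the sketch stays a checklist rather than an unattended script.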

Saturday, February 22, 2014

[Critical] Duplicate BARCODE information from Media Agent

[Normal] From: MSM@cellservero1.in.com "MSL2024_L01" Time: 22.2.2014 10:33:56

Media session 2014/02/22-179 started.

[Normal] From: UMA@mediaserver01.in.com "MSL2024_L01" Time: 22.2.2014 10:34:07

STARTING Media Agent "MSL2024_L01"

[Critical] From: MSM@cellservero1.in.com "MSL2024_L01" Time: 22.2.2014 10:34:10

Duplicate BARCODE information from Media Agent, Not updating the Repository

[Normal] From: UMA@mediaserver01.in.com "MSL2024_L01" Time: 22.2.2014 10:34:09

COMPLETED Media Agent "MSL2024_L01"

======================================================

23 cartridges out of 23 successfully scanned.

======================================================

>> The library model is MSL2024, but only 23 slots were scanned and the barcode scan failed.

>> Check the repository. It shows only 23 slots; add the 24th slot.

>> With the 24th slot in the repository, the barcode information now has a valid slot to map to.

Guest Post done by: Anbu

HP DP Mount Requests 'Mount request for medium'

[Warning] From: BSM@cellserver01.in.com "Backup_spec_15" Time: 2/13/2014 9:00:27 PM
Medium "c02cbeed:aaaa:0001" in device "HP:Ultrium 2-SCSI_ Dev001"
in the pool " Dev001_media" cannot be appended.
Medium label is: "THU02a"

[Normal] From: BMA@Server01.in.com " HP:Ultrium 2-SCSI_ Dev001" Time: 2/13/2014 9:00:40 PM
Tape0:0:0:0
Medium header verification completed, 0 errors found.

This is a warning message. DP can continue with any other good tape it finds in the media pool. If there is none, the backup pauses with a mount request for the medium, with a message like the one below.

[Warning] From: BSM@cellserver01.in.com "HP:Ultrium 2-SCSI_Dev001" Time: 2/13/2014 9:00:40 PM

___________________________________________________________________

Mount request for medium:

MediumId : c02cbeed:aaaa:0001

Label : MON02a

Location :

Device : HP:Ultrium 2-SCSI_Dev001

Host : Server01.in.com

Physical device : Tape0:0:0:0

___________________________________________________________________

[Normal] From: BSM@cellserver01.in.com "backup_spec_15" Time: 2/13/2014 9:00:42 PM

Starting the mount request notification script "/opt/omni/lbin/Mount.sh".

Firstly, check if the respective media pool is set to ‘Appendable’.

Secondly, copy the medium label and go to the respective media pool. Find the medium and check that its status is good.

>> If the medium condition is good and DP still reports the error, check the cases below.

>> The medium contains protected objects and has some space left over. Even though the medium has free space, the space may be insufficient, or the protected objects may be scattered across the medium.

[OR]

>> The medium has expired but its status is ‘Fair’ or ‘Poor’, so DP may not use it for backup. Check whether you can reuse it by reformatting. If the medium still shows ‘Poor’ condition, do not use it: either the backup will fail or the data might not be recoverable.

To confirm the mount request (MR) from the DP cell server CLI:

# omnistat -mount

SessionID Type Status User

==============================================

2014/01/07-9 Backup Mount Request localadmin

# omnimnt -device 'HP:Ultrium 2-SCSI_Dev001' -session 2014/01/07-9

Mount request confirmation sent to the Session Manager.
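
When several sessions are waiting, picking the session IDs out of the omnistat listing by hand gets tedious. The following Python sketch extracts them, assuming the column layout shown in the sample output above (SessionID, Type, Status, User); the function name is hypothetical.

```python
import re

# A session line starts with an ID such as 2014/01/07-9, followed by the
# session type and the status "Mount Request".
SESSION_LINE = re.compile(r"^(\d{4}/\d{2}/\d{2}-\d+)\s+(\S+)\s+Mount Request\b")


def pending_mount_sessions(omnistat_output):
    """Return (session_id, session_type) pairs waiting on a mount request."""
    pending = []
    for line in omnistat_output.splitlines():
        match = SESSION_LINE.match(line.strip())
        if match:
            pending.append((match.group(1), match.group(2)))
    return pending
```

Each returned session ID can then be passed to omnimnt with the matching device name, as in the command above.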

Friday, February 21, 2014

Commands to check version of DP client

The default port used by Data Protector Inet Service is '5555'.

You can find out the DP client version from any other server on the network with the following command.

# telnet winclient01 5555
Trying...
Connected to winclient01.
Escape character is '^]'.
HP Data Protector A.06.11: INET, internal build DPWIN_00519, built on Thursday, May 26, 2011, 1:24 AM
Connection closed by foreign host.

C:\>omnicheck -version
HP Data Protector A.06.10: OMNICHECK, internal build DPWIN_00416, built on Friday, March 13, 2009, 5:37 AM
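
To check many clients at once, the same banner can be read over a plain TCP connection and parsed. The following Python sketch assumes the banner format shown in the two outputs above; the function names are hypothetical, and read_banner() assumes the client sends its banner as plain text right after the connection opens, as the telnet session suggests.

```python
import re
import socket

BANNER = re.compile(r"HP Data Protector (\S+): (\w+), internal build (\w+)")


def parse_banner(banner):
    """Return (version, component, build) from a banner line, or None."""
    match = BANNER.search(banner)
    return match.groups() if match else None


def read_banner(host, port=5555, timeout=5.0):
    """Connect to the Inet port and return the first chunk the client sends."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        data = conn.recv(4096)
    # Assumption: the banner is plain text. NUL bytes are dropped in case
    # the client sends a wide-character encoding.
    return data.replace(b"\x00", b"").decode("ascii", errors="replace")
```

For example, parse_banner(read_banner("winclient01")) would return the version, component and build for the client used in the telnet example.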

Thursday, February 20, 2014

TSM Admin Tasks

This post gives an idea of the commands and tasks that a backup administrator carries out every day from the TSM server end.

>> q event * t=a begind=-1 endd=t

>> BACKUP DB type=full dev=<device_class_name> wait=yes

>> BACKUP DEVCONFIG

>> BACKUP VOLHISTORY

>> PREPARE source=dbbackup devclass=<device_class_name> wait=yes

>> DELETE VOLHISTORY todate=today-7 type=all

>> EXPIRE INVENTORY wait=yes duration=240

>> RECLAIM STGPOOL <stgpool_name> th=75 du=120

(Repeat this command for all the storage pools)

>> QUERY STGPOOL

>> QUERY MOUNT

>> QUERY SESSION

>> select count(*) as Scratch, library_name from libvolumes where status='Scratch' group by library_name

(To check for scratch volumes count in each library)

>> select source_name as source, destination_name as target,library_name as library, device from paths where online = 'NO'

(To check any offline paths)

>> QUERY DRIVE

>> QUERY ACTLOG s=error

>> QUERY ACTLOG s=failure

>> QUERY ACTLOG s="Server out of data storage space"
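
Since the RECLAIM STGPOOL command has to be repeated for every storage pool, the command lines can be generated from a list of pool names. The following is a minimal Python sketch; the function name and the pool names in the usage note are hypothetical, and the threshold and duration defaults are the values used in the list above.

```python
def reclaim_commands(stgpools, threshold=75, duration=120):
    """Build one RECLAIM STGPOOL command line per storage pool name."""
    return [
        f"RECLAIM STGPOOL {pool} th={threshold} du={duration}"
        for pool in stgpools
    ]
```

For example, reclaim_commands(["TAPEPOOL", "COPYPOOL"]) gives two command lines that can be pasted into the admin client or saved as a macro.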

[12:1165] Internal Database network communication error

Error:

[Critical] From: BSM@cellserver01.in.com "Backup_spec_01" Time: 2/19/2014 10:14:51 AM

[61:4001] Error accessing the database, in line 1490, file bcsmutil.c.

Database subsystem reports:

"[12:1165] Internal Database network communication error."

[Major] From: BSM@cellserver01.in.com "Backup_spec_01" Time: 2/19/2014 10:14:51 AM

[61:4001] Error accessing the database, in line 3073, file brsmutil.c.

Database subsystem reports:

"Internal error: DbaXXXX functions."

>> This is related to the RDS service/process running on the cell server and needs immediate attention. Backups across the whole site will fail if it is not dealt with immediately.
>> Restarting the DP services on the cell server clears these errors.

# omnisv status

ProcName Status [PID]
===============================
rds : Down
crs : Active [20740]
mmd : Active [20738]
kms : Active [20739]
omnitrig: Active
uiproxy : Active [20749]
Sending of traps disabled.
===============================
Status: At least one of Data Protector relevant processes/services is not running.

# omnisv stop

# omnisv start

This is just a workaround to bring the services up. Proper IDB maintenance and current patch levels can make this situation less frequent. See the 'HP DP - IDB Maintenance' post for the detailed procedure.
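
Because a stopped RDS process takes the whole site's backups down with it, it is worth checking the omnisv status output regularly. The following Python sketch picks out the processes reported as Down, assuming the output layout shown above and that the only process states are Active and Down; the function name is hypothetical.

```python
import re

# Matches lines such as "rds : Down" or "crs : Active [20740]".
PROC_LINE = re.compile(r"^(\w+)\s*:\s*(Active|Down)\b")


def down_processes(omnisv_status_output):
    """Return the names of the processes reported as Down."""
    names = []
    for line in omnisv_status_output.splitlines():
        match = PROC_LINE.match(line.strip())
        if match and match.group(2) == "Down":
            names.append(match.group(1))
    return names
```

A non-empty result is the cue to alert someone or to restart the services as described above.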

Sunday, February 16, 2014

[Critical] [90:63] Cannot load exchanger medium (Target drive is busy.)

Troubleshooting a device issue can be interesting at times. Let's deal with devices in this post.

Error Message:

[Normal] From: BMA@Mediasrv01.in.com "Drive01" Time: 2/13/2014 5:41:04 PM

By: UMA@Mediasrv01.in.com@Changer0:7:0:1

Loading medium from slot 2 to device Tape0:7:0:0C

[Critical] From: BMA@Mediasrv01.in.com " Drive01" Time: 2/13/2014 5:41:24 PM

[90:63] By: UMA@Mediasrv01.in.com@Changer0:7:0:1

Cannot load exchanger medium (Target drive is busy.)

From the error message, we can see that the target drive is busy.

>> Log in to the media server to which the tape library/autoloader is connected.

>> Run devbra -dev; the output lists the drives and exchangers detected by the device manager.

>> Use uma with the SCSI path of the exchanger (here it is Changer0:7:0:1) on the command line (please see below).

>> Check whether a tape is stuck in the drive. Move the tape to a free slot from either the GUI or the CLI and run the backup.

>> The library might be fully loaded while the drive is also holding a tape. In that case, manually remove a tape from a slot or the drive.

Using UMA utility for device management:

Example of a fully loaded autoloader, where media must be unloaded from the drive manually.

C:\>uma -ioctl Changer0:7:0:1

Changer0:7:0:1> stat d

1 D1 Full "" ""

Changer0:7:0:1> stat s

1001 S1 Full "" ""

1002 S2 Full "" ""

1003 S3 Full "" ""

1004 S4 Full "" ""

1005 S5 Full "" ""

1006 S6 Full "" ""

1007 S7 Full "" ""

1008 S8 Full "" ""

(OR)

Example of unloading media from drive to empty slot.

C:\>uma -ioctl Changer0:7:0:1

Changer0:7:0:1> stat d

1 D1 Full "" ""

Changer0:7:0:1> stat s

1001 S1 Full "" ""

1002 S2 Full "" ""

1003 S3 Empty "" ""

1004 S4 Full "" ""

1005 S5 Full "" ""

1006 S6 Full "" ""

1007 S7 Full "" ""

1008 S8 Full "" ""

Changer0:7:0:1> move D1 S3

Changer0:7:0:1> stat d

1 D1 Empty "" ""

>> This is only one of many possible causes of the 'Cannot load exchanger medium' error.
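
The choice between the two cases above depends on whether "stat s" shows an empty slot. The following Python sketch finds the first empty slot and builds the matching uma move command, assuming the "stat s" output layout shown above; the function names are hypothetical.

```python
import re

# Matches lines such as: 1003 S3 Empty "" ""
SLOT_LINE = re.compile(r"^\d+\s+(S\d+)\s+(Full|Empty)\b")


def first_empty_slot(stat_s_output):
    """Return the name of the first empty slot, or None if all are full."""
    for line in stat_s_output.splitlines():
        match = SLOT_LINE.match(line.strip())
        if match and match.group(2) == "Empty":
            return match.group(1)
    return None


def unload_command(drive, stat_s_output):
    """Return the uma move command for the drive, or None if no slot is free."""
    slot = first_empty_slot(stat_s_output)
    return f"move {drive} {slot}" if slot else None
```

A result of None corresponds to the fully loaded autoloader, where a tape has to be removed by hand first.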

Thursday, February 13, 2014

Oracle Errors 'ORA-01031' during backup

[Normal] From: BSM@cellserver01.co.in "Oraclespec01" Time: 01/17/11 08:38:07

OB2BAR application on "Oraserver01.co.in" successfully started.

[Normal] From: ob2rman@Oraserver01.co.in "Oradb01" Time: 01/17/11 08:38:09

Starting backup of target database.

[Major] From: ob2rman@Oraserver01.co.in "Oradb01" Time: 01/17/11 08:38:10

The database reported error while performing requested operation.

ERROR:

ORA-01031: insufficient privileges

CONNECT: dporadb01/*****@Oradb01

[Major] From: ob2rman@Oraserver01.co.in "Oradb01" Time: 01/17/11 08:38:10

Backup of target database failed.

[Normal] From: BSM@cellserver01.co.in "Oraclespec01" Time: 01/17/11 08:38:11

OB2BAR application on "Oraserver01.co.in" disconnected.

Reason :

The error message "ORA-01031: insufficient privileges" indicates an authentication failure when accessing the database for backup.

The password file is the most likely cause, assuming the DB user has not been removed. The password file may have expired, been changed, or lack the required permissions.

Solution :

Ask your DB team to remove and recreate the password file.

The command looks like this:

ORAPWD FILE=filename [ENTRIES=numusers] [FORCE={Y|N}] [IGNORECASE={Y|N}] [NOSYSDBA={Y|N}]

Reference:

http://docs.oracle.com/cd/B28359_01/server.111/b28310/dba007.htm#ADMIN10241

Hope this helps!