Sunday, January 27, 2019

Backup and Recovery Guidelines For Online Patching (adop) Cutover Failed


If a cutover error occurs, you should first check the error message and try to determine if the problem can be fixed easily, or (as is true in many cases) cutover can be made to succeed simply by running the command again. Restoring to a point before cutover via Flashback recovery should only be done when the error cannot easily be fixed, and continues to fail on subsequent cutover attempts.

Before proceeding further with the instructions in this document:

1. Review failure messages and cutover logs, identify problems, and make corrections as applicable. Issues such as running out of disk space can be corrected easily, whilst issues such as timeouts, deadlocks, and network issues may prove to be transient.
2. Retry the cutover command.
3. If cutover still fails, follow the instructions in the rest of this document to restore system availability while you take further diagnostic and corrective actions.

If after cutover you want to revert to the state of the system before the patching cycle was started, you can use the Oracle Database Flashback feature to go back to a designated point in time (a restore point). You should create the restore point just before running the cutover phase. Depending on exactly when the failure occurred, you may also need to restore the application tier file systems.

Note: Before creating the restore point, it is advisable to issue a suitable downtime notification and shut down the web services. This will ensure you do not lose any transactional data, and in effect simply extends the cutover downtime slightly.
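For example, the Oracle HTTP Server (the web entry point) can typically be stopped with the standard control script (a sketch; the exact scripts to stop depend on your release and topology):
$ sh $ADMIN_SCRIPTS_HOME/adapcctl.sh stop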

Setting Up Flashback:

1. Set ARCHIVELOG mode
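If the database is not already in ARCHIVELOG mode, it can be enabled as follows (a sketch; this requires a short outage):
SQL>shutdown immediate
SQL>startup mount
SQL>alter database archivelog;
SQL>alter database open;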

2. Enable the Fast Recovery Area
You enable the Fast Recovery Area (FRA) by setting two database initialization parameters:
DB_RECOVERY_FILE_DEST_SIZE - Specifies the size of the Fast Recovery Area.
DB_RECOVERY_FILE_DEST - Specifies the physical location of the Fast Recovery Area files.
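For example (the size and path shown are illustrative; choose values appropriate to your system):
SQL>alter system set db_recovery_file_dest_size=50G scope=both;
SQL>alter system set db_recovery_file_dest='/u01/oradata/fra' scope=both;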

3. Specify the maximum flashback time
You set the flashback retention target (in minutes) with the following parameter:
alter system set db_flashback_retention_target=120;

Note: The amount of retention time and space needed will be governed by the amount of time required for cutover. Setting the flashback retention target too high may result in issues if DB_RECOVERY_FILE_DEST_SIZE is not set to a correspondingly large value.

4. Activate Flashback
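For example (on 11gR2 and later this can be done while the database is open; on earlier releases the database must be mounted):
SQL>alter database flashback on;
SQL>select flashback_on from v$database;
FLASHBACK_ON
------------
YES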

5. Create a restore point
Create a restore point called BEFORE_CUTOVER. As shown in the example below, it is also recommended to force a logfile switch both before and after the restore point is created.
SQL>alter system switch logfile;
System altered.
SQL>create restore point BEFORE_CUTOVER guarantee flashback database;
Restore point created.
SQL>alter system switch logfile;
System altered.

Note: As noted under the FRA description, the Online Patching cutover phase should be scheduled for a time when there are few online transactions and batch processing is minimal. You should confirm that critical concurrent requests are not executing during cutover.  You should also consider putting scheduled concurrent requests on hold prior to creating the BEFORE_CUTOVER flashback restore point.
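As a sketch, you can check for running concurrent requests before creating the restore point (phase_code 'R' indicates a running request):
SQL>select request_id, phase_code, status_code
    from fnd_concurrent_requests
    where phase_code = 'R';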

Scenario
You are running an Online Patching cycle:
$ adop phase=prepare
...
$ adop phase=apply patches=11111111,22222222
...
$ adop phase=finalize
...
$ adop phase=cutover
Cutover fails, and you need to go back to the state of the system before you ran the cutover phase.

Note: If you had not run the cutover phase, you would have been able to roll back the patch application by running adop phase=abort. However, this is not possible once cutover has been run.

There are two main parts to the restore procedure:

1.You will at least need to restore the database using the Flashback feature.
2.Depending on when cutover failed, you may also need to restore the application tier file systems.

Flashing Back the Database
----------------------------------

1. First, shut down the database, then start it up in mount state:
SQL>shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL>startup mount
ORACLE instance started.

2. Flash back the database to the restore point:
SQL>flashback database to restore point BEFORE_CUTOVER;
Flashback complete.

3. Start the database in read-only mode:
SQL>alter database open read only;
Database altered.
Check that everything looks as expected.
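As a minimal sanity check (a sketch; add whatever application-level checks you normally perform):
SQL>select name, open_mode from v$database;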

4. Shut down the database, start it up in mount state, then open it with the resetlogs option:
SQL>shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL>startup mount
ORACLE instance started.
Total System Global Area 2142679040 bytes
Fixed Size 1346140 bytes
Variable Size 520095140 bytes
Database Buffers 1593835520 bytes
Redo Buffers 27402240 bytes
Database mounted.
SQL>alter database open resetlogs;
Database altered.

5. Disable flashback:
SQL>alter database flashback off;
Database altered.

6. Drop the restore point:
SQL>drop restore point BEFORE_CUTOVER;
Restore point dropped.

7. Unset the recovery file destination:
SQL>alter system set db_recovery_file_dest='';
System altered.

8. Confirm that Flashback has been deactivated:
SQL>select FLASHBACK_ON from v$database;
FLASHBACK_ON
------------
NO

Restoring the File Systems
---------------------------------

Whether you need to perform this step depends on whether cutover failed before or after the file systems were switched. You can identify which case applies by referring to the cutover logs for your current session in $NE_BASE/EBSapps/log/adop/<current_session_id>/cutover_<timestamp>/.
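For example, a quick way to scan the cutover logs for errors (a sketch):
$ cd $NE_BASE/EBSapps/log/adop/<current_session_id>/cutover_<timestamp>
$ grep -il error *.log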

Case 1 - If the log messages indicate that cutover failed before the file systems were switched, do a clean shutdown of any services that are running. Then restart all the services using the normal startup script, and go to Section 6.

Case 2 - If the log messages indicate that cutover failed after the file systems were switched, follow Step 5.1 to shut down any services that have started from the new run file system, then follow Step 5.2 to switch the file systems back. After that, go to Section 6.

Section 5.1 Shut down services started from new run file system

* Source the environment on the new run file system.
* From $ADMIN_SCRIPTS_HOME, shut down all the services (using adstpall.sh on UNIX), as shown in the sketch below.
* In a multi-node environment, repeat the preceding two steps on all nodes, leaving the admin node until after all the slave nodes.
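A typical invocation on UNIX might look like this (a sketch; replace the password placeholder):
$ cd $ADMIN_SCRIPTS_HOME
$ sh adstpall.sh apps/<apps_password>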

Section 5.2 Switch file systems back

* On all nodes where file systems have been switched, run the following command to switch the file systems back:
$ perl $AD_TOP/patch/115/bin/txkADOPCutOverPhaseCtrlScript.pl \
-action=ctxupdate \
-contextfile=<full path to new run context file> \
-patchcontextfile=<full path to new patch file system context file> \
-outdir=<full path to out directory>
* Start up all services from the old run file system (using adstrtal.sh on UNIX; see the example after this list).
* In a multi-node environment, repeat the preceding two steps on all nodes, starting with the admin node and then proceeding to the slave nodes.
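For example, on each node (a sketch; replace the password placeholder):
$ cd $ADMIN_SCRIPTS_HOME
$ sh adstrtal.sh apps/<apps_password>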

Section 6. Options and Next Steps
After the restore is complete, you have two basic options for proceeding:

* Abort the current patching cycle, if the issue that required you to restore was caused by the patches you were attempting to apply.
* Identify and fix any other issues in the current patching cycle, and proceed with patching.

Reference metalink Doc ID 1584097.1

R12 Cloning Issues (12.1.1 & 12.1.3)


1. Log/out files are not getting generated:
* Stop all concurrent services.
* Navigate to Profile > System.
* Search for %RRA Enabled%.
* Set the value to "Yes" and save it (see the sketch after this list).
* Start the concurrent services.
* Test the issue.
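As a sketch, you can also verify the profile value from SQL*Plus (this assumes the internal name of the "RRA: Enabled" profile option is FS_ENABLED; confirm this in your instance):
SQL>select fnd_profile.value('FS_ENABLED') from dual;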

2. Concurrent Managers are not coming up:
1. Clean up the node entries in FND_NODES using EXEC FND_CONC_CLONE.SETUP_CLEAN; (see the sketch after this list).
2. Run AutoConfig on all tiers (DB first, then CM, then Web node), check the node entries again, then start the Concurrent Managers.
3. Also check the NODE_NAME and TARGET_NODE entries in FND_CONCURRENT_QUEUES.
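A minimal sketch of the cleanup step, run as the APPS user in SQL*Plus:
SQL>exec FND_CONC_CLONE.SETUP_CLEAN;
SQL>commit;
Then run AutoConfig on each tier and restart the Concurrent Managers.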

3. Apps listener is not coming up:
First check the apps listener log files to find the root cause:
1. $OAD_TOP/admin/log/$CONTEXT_NAME/adalnctl.txt
2. $ORACLE_HOME/network/admin/apps_$SID.log (on the middle tier)
Common causes:
1. Failure to create a proper apps listener file via adgentns.sh during the Rapid Clone process.
2. FNDSM (Service Manager) issues - FNDSM entries missing in tnsnames.ora.

4. RC-00110: Fatal: Error occurred while relinking of ApplyDBTechStack
Drop the soft link and recreate it pointing to the target ORACLE_HOME.


5. Oracle Apps R12 post-cloning issue - forms not launching:

1. Stop the APPS tier services.
2. Rename the "tldcache" directory under the following directories:
a. $ORA_CONFIG_HOME/10.1.3/j2ee/oafm/tldcache
b. $ORA_CONFIG_HOME/10.1.3/j2ee/oacore/tldcache
c. $ORA_CONFIG_HOME/10.1.3/j2ee/forms/tldcache
3. Create an empty directory named "tldcache" under each of the above directories.
4. Restart the APPS tier services.
5. Test the issue by starting some forms.

6. R12 login issue on target after cloning

1. cd $FND_TOP/patch/115/bin
2. perl ojspCompile.pl --compile --flush -p 2
3. Run autoconfig on both db and apps tiers
4. Bring up the services and test login

Oracle E-Business Suite Cloning Questions & Answers (12.1.1 & 12.1.3)


1. What is the location of adpreclone.pl on the database tier?
$ORACLE_HOME/appsutil/scripts/$CONTEXT_NAME

2. What is the location of adcfgclone.pl on the database tier?
$ORACLE_HOME/appsutil/clone/bin

3. What is the location of adpreclone.pl on the application tier?
$INST_TOP/admin/scripts

4. What is the location of adcfgclone.pl for applmgr user?
$COMMON_TOP/clone/bin

5. How often do you clone?
Cloning happens weekly or monthly, depending on the organization's requirements.

6. When do you run adpreclone on Production?
If any changes are made to the techstack or the database, or any patches are applied.

7. When we run perl adpreclone.pl dbTier, why does it require the apps password?
It requires a database connection to validate the apps schema.

8. When we run perl adpreclone.pl appTier, why does it not prompt for the apps password?
It does not require a database connection.

9. What happens after running adcfgclone.pl?

It takes your inputs and creates the XML context file.
After creating the XML file, it runs AutoConfig.
It registers the ORACLE_HOME with the global inventory.

10. What does adpreclone.pl do on the appsTier?
It collects all the information about the source system, creates a cloning stage area, and generates the templates and drivers used to reconfigure the instance on a target machine.

11. What does perl adcfgclone.pl dbTechStack do?

Create context file
Register ORACLE_HOME
Relink ORACLE_HOME
Configure ORACLE_HOME
Start SQL*NET listener

12. What does perl adcfgclone.pl dbTier do?

Create context file.
Register ORACLE_HOME
Relink ORACLE_HOME
Configure ORACLE_HOME
Recreate controlfile
Configure database
Start SQL*NET listener


13. What does perl adcfgclone.pl appsTier do?

Create context file
Register ORACLE_HOME
Relink ORACLE_HOME
Configure ORACLE_HOME
Create INST_TOP
Configure APPL_TOP
Start Apps Processes.

14. When we run adcfgclone.pl, which script does it call?
It calls adclone.pl, which is located at $AD_TOP/bin.

15. Which parameters are mandatory in RMAN DUPLICATE in 11g?

db_file_name_convert
log_file_name_convert
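A minimal sketch of how these parameters might appear in the auxiliary instance's parameter file (the paths are illustrative):
db_file_name_convert=('/u01/oradata/PROD/','/u02/oradata/CLONE/')
log_file_name_convert=('/u01/oradata/PROD/','/u02/oradata/CLONE/')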

What is the difference between RPO and RTO in a DR Drill?


Recovery Point Objective (RPO) is the amount of data you can afford to lose if a server fails. It depends on the backup strategies in your organization.

Recovery Time Objective (RTO) is the time it takes to get your systems back up and running after a failure.

Thursday, January 24, 2019

How to Resolve ORA-00257: Archiver Error?


Solution:

1) Check if there are any errors for the archive destination(s). Make sure that the number of VALID archive destinations is greater than or equal to the value specified by LOG_ARCHIVE_MIN_SUCCEED_DEST initialization parameter.

SELECT dest_id, dest_name, binding, status, destination, error FROM v$archive_dest;

SHOW PARAMETER log_archive_min_succeed_dest;

2) If space is full in one or more of the archive destinations (or if destination is not available), take any of the following steps:

2.a) Manually move the archive logs to another location and then delete them from the archive destination, for example:
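A sketch (the paths are illustrative; if you use RMAN, let it resynchronise its view of the archivelogs afterwards):
$ mv /u01/arch/PROD/*.arc /backup/arch_overflow/
RMAN> crosscheck archivelog all;
RMAN> delete expired archivelog all;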

(OR)

2.b) Change the archive destination to an alternate archive destination which has space available.
SQL> alter system set log_archive_dest_1='LOCATION=<alternate location path>';
 
(OR)

2.c) If there are VALID archive destinations but fewer than the value specified by LOG_ARCHIVE_MIN_SUCCEED_DEST, set this parameter to a lower value:
SQL> alter system set log_archive_min_succeed_dest=<>;
 
(OR)

2.d) Back up the archive logs and delete them using the RMAN command BACKUP ARCHIVELOG with DELETE INPUT (Doc ID 388422.1), for example:
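RMAN> backup archivelog all delete input;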

3) If archiving does not resume even after freeing up space in the archive destination, the archiver is probably stuck. In that case, issue the following command (as per Doc ID 121927.1) for each archive destination to resume automatic archiving:

alter system set LOG_ARCHIVE_DEST_.. = 'location=/<archive log path> reopen';

When the Flash Recovery Area is in use

i) Check if the flash recovery area is in use:

archive log list;
Database log mode Archive Mode
Automatic archival Enabled
Archive Destination USE_DB_RECOVERY_FILE_DEST
Oldest online log sequence 384
Next log sequence to archive 386
Current log sequence 386
ii) To resolve this issue immediately, allow more space in the DB_RECOVERY_FILE_DEST by increasing the DB_RECOVERY_FILE_DEST_SIZE parameter:

SQL> alter system set db_recovery_file_dest_size=<>G ;
OR

iii) To avoid the situation once DB_RECOVERY_FILE_DEST_SIZE is reached, specify an alternate location (archiving is automatically performed to the alternate dest2):

log_archive_dest_1='LOCATION=use_db_recovery_file_dest NOREOPEN ALTERNATE=LOG_ARCHIVE_DEST_2'
log_archive_dest_2='LOCATION=/other_dest_for_archiving'
log_archive_dest_state_1='enable'
log_archive_dest_state_2='alternate'
db_recovery_file_dest='/u01/app/oracle/product/10.1.0/db_1/flash_recovery_area'
db_recovery_file_dest_size=2G

Reference metalink Doc ID 2014425.1

Top 10 Backup and Recovery Best Practices


This document assumes that you are already doing the Backup and Recovery basics:
* Running in Archivelog mode
* Multiplexing the controlfile
* Taking regular backups
* Periodically doing a complete restore to test your procedures.
* Restore and recovery validate will not uncover nologging issues. Consider turning on force logging if you need all transactions to be recoverable and want to avoid nologging problems (ALTER DATABASE FORCE LOGGING;).

1. Turn on block checking.

The aim is to detect the presence of corrupt blocks in the database as early as possible. This has a slight performance overhead, but allows Oracle to detect corruption caused by underlying disk, storage system, or I/O system problems at an early stage.
SQL> alter system set db_block_checking = true scope=both;

2. Turn on Block Change Tracking when using RMAN incremental backups (10g and higher)

The Change Tracking File contains information that allows the RMAN incremental backup process to avoid reading data that has not been modified since the last backup. When Block Change Tracking is not used, all blocks must be read to determine if they have been modified since the last backup.
SQL> alter database enable block change tracking using file '/u01/oradata/ora1/change_tracking.f';

3. Duplex redo log groups and members and have more than one archive log destination.

If an archivelog is corrupted or lost, by having multiple copies in multiple locations, the other logs will still be available and could be used.
If an online log is deleted or becomes corrupt, you will have another member that can be used to recover if required.
SQL> alter system set log_archive_dest_2='location=/new/location/archive2' scope=both;
SQL> alter database add logfile member '/new/location/redo21.log' to group 1;

4. When backing up the database with RMAN use the CHECK LOGICAL option.

This will cause RMAN to check for logical corruption within a block, in addition to the normal checksum verification. This is the best way to ensure that you will get a good backup.
RMAN> backup check logical database plus archivelog delete input;

5. Test your backups.

This will do everything except actually restore the database. This is the best method to determine if your backup is good and usable before being in a situation where it is critical and issues exist.
If using RMAN this can be done with:
RMAN> restore validate database;

6. When using RMAN have each datafile in a single backup piece

When doing a partial restore RMAN must read through the entire backup piece to get the datafile/archivelog requested. The smaller the backup piece, the quicker the restore can complete. This is especially relevant with tape backups of large databases, or where only a few individual files are being restored.

However, very small values for filesperset will also cause larger numbers of backup pieces to be created, which can reduce backup performance and increase processing time for maintenance operations. So those factors must be weighed against the desired restore performance.
RMAN> backup database filesperset 1 plus archivelog delete input;

7. Maintain your RMAN catalog/controlfile

Choose your retention policy carefully. Make sure that it complements your tape subsystem retention policy and your backup and recovery strategy requirements. If not using a catalog, ensure that your CONTROL_FILE_RECORD_KEEP_TIME parameter matches your retention policy.
SQL> alter system set control_file_record_keep_time=21 scope=both;
This will keep 21 days of backup records in the control file.
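If you use an RMAN-managed retention policy, it can be configured with, for example:
RMAN> configure retention policy to recovery window of 21 days;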

Run regular catalog maintenance.
REASON: Delete obsolete will remove backups that are outside your retention policy. If obsolete backups are not deleted, the catalog will continue to grow until performance becomes an issue.

RMAN> delete obsolete;

REASON: Crosschecking will check that the catalog/controlfile matches the physical backups. If a backup piece is missing, it will be marked 'EXPIRED' so that when a restore is started it will not be eligible, and an earlier backup will be used instead. To remove the expired backups from the catalog/controlfile, use the delete expired command.
RMAN> crosscheck backup;
RMAN> delete expired backup;

8. Prepare for loss of controlfiles.

This will ensure that you always have an up-to-date controlfile available that was taken at the end of the current backup, rather than during the backup itself.
RMAN> configure controlfile autobackup on;
Keep your backup logs.
REASON: The backup log contains parameters for your tape access and the locations of controlfile backups that can be utilised if a complete loss occurs.

9. Test your recovery
REASON: During a recovery situation this will let you know how the recovery will go without actually doing it, and can avoid having to restore source datafiles again.
SQL> recover database test;

10. In RMAN backups do not specify 'delete all input' when backing up archivelogs

REASON: 'Delete all input' will back up the archivelogs from one destination and then delete both copies, whereas 'delete input' will back up from one location and then delete only what has been backed up. The next backup will then back up the logs from location 2 as well as the new logs from location 1, and delete all that have been backed up. This means that you will have the archivelogs generated since the last backup available on disk in location 2 (as well as backed up once), and the older logs backed up twice.
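For example, with two archivelog destinations, prefer:
RMAN> backup archivelog all delete input;
rather than:
RMAN> backup archivelog all delete all input;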


Reference Metalink Doc ID 388422.1