Sunday, January 27, 2019

Backup and Recovery Guidelines For Online Patching (adop) Cutover Failed

If a cutover error occurs, first check the error message and try to determine whether the problem can be fixed easily, or whether (as is true in many cases) cutover can be made to succeed simply by running the command again. Restore to a point before cutover with Flashback only when the error cannot easily be fixed and cutover continues to fail on subsequent attempts.

Before proceeding further with the instructions in this document:

1. Review failure messages and cutover logs, identify problems, and make corrections as applicable. Issues such as running out of disk space can be corrected easily, whilst issues such as timeouts, deadlocks, and network issues may prove to be transient.
2. Retry the cutover command.
3. If cutover still fails, follow the instructions in the rest of this document to restore system availability while you take further diagnostic and corrective actions.

If after cutover you want to revert to the state of the system before the patching cycle was started, you can use the Oracle Database Flashback feature to go back to a designated point in time (a restore point). You should create the restore point just before running the cutover phase. Depending on exactly when the failure occurred, you may also need to restore the application tier file systems.

Note: Before creating the restore point, it is advisable to issue a suitable downtime notification and shut down the web services. This ensures that you do not lose any transactional data, and in effect only slightly extends the cutover downtime.

Setting Up Flashback:

1. Set ARCHIVELOG mode

2. Enable Fast/Flash Recovery Area
You enable the Fast Recovery Area (FRA) by setting two database initialization parameters:
DB_RECOVERY_FILE_DEST_SIZE - Specifies the size of the Fast Recovery Area.
DB_RECOVERY_FILE_DEST - Specifies the physical location of the Flashback recovery files.

3. Specify maximum flashback time
Set the flashback retention target (in minutes) with the following parameter:
alter system set db_flashback_retention_target=120;

Note: The amount of retention time and space needed will be governed by the amount of time required for cutover. Setting the flashback retention target too high may result in issues if DB_RECOVERY_FILE_DEST_SIZE is set to a large value.

4. Activate Flashback

5. Create restore point
Create a restore point called BEFORE_CUTOVER. As shown in the example below, it is also recommended to force a logfile switch both before and after the restore point is created.
SQL>alter system switch logfile;
System altered.
SQL>create restore point BEFORE_CUTOVER guarantee flashback database;
Restore point created.
SQL>alter system switch logfile;
System altered.
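
Steps 1, 2, and 4 above are listed without commands. The following is a minimal sketch of how they might be run in SQL*Plus as SYSDBA, not a definitive procedure. The FRA size (50G) and path (/u01/app/oracle/fra) are placeholder values assumed for illustration and do not come from the referenced note. Flashback is enabled here while the database is mounted, which is one way to do it; adjust for your release and for RAC as needed.

```sql
-- Step 1: enable ARCHIVELOG mode (needs a clean shutdown and a mounted database)
SHUTDOWN IMMEDIATE
STARTUP MOUNT
ALTER DATABASE ARCHIVELOG;

-- Step 2: enable the Fast Recovery Area.
-- The size must be set before the destination.
-- Both values below are placeholders (assumptions), not recommendations.
ALTER SYSTEM SET db_recovery_file_dest_size = 50G;
ALTER SYSTEM SET db_recovery_file_dest = '/u01/app/oracle/fra';

-- Step 4: activate Flashback Database, then open the database
ALTER DATABASE FLASHBACK ON;
ALTER DATABASE OPEN;

-- Confirm both settings took effect
SELECT log_mode, flashback_on FROM v$database;

-- After step 5, confirm the guaranteed restore point exists
SELECT name, scn, time, guarantee_flashback_database, storage_size
FROM   v$restore_point;
```

The last query is useful again during cutover: STORAGE_SIZE shows how much flashback log space the guaranteed restore point is currently holding in the FRA.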

Note: As noted under the FRA description, the Online Patching cutover phase should be scheduled for a time when there are few online transactions and batch processing is minimal. You should confirm that critical concurrent requests are not executing during cutover. You should also consider putting scheduled concurrent requests on hold prior to creating the BEFORE_CUTOVER flashback restore point.
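
One possible way to check for executing concurrent requests before creating the restore point is to query the concurrent request table directly. This is a sketch that assumes the standard APPS schema and the usual meaning of PHASE_CODE = 'R' (running); it is not taken from the referenced note.

```sql
-- List concurrent requests that are currently in the Running phase.
-- Assumes the standard EBS table apps.fnd_concurrent_requests.
SELECT request_id,
       concurrent_program_id,
       requested_by,
       actual_start_date
FROM   apps.fnd_concurrent_requests
WHERE  phase_code = 'R'
ORDER  BY actual_start_date;
```

If this returns rows for critical programs, it may be worth waiting for them to complete before creating BEFORE_CUTOVER.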

Scenario
You are running an Online Patching cycle:
$ adop phase=prepare
...
$ adop phase=apply patches=11111111,22222222
...
$ adop phase=finalize
...
$ adop phase=cutover
Cutover fails, and you need to go back to the state of the system before you ran the cutover phase.

Note: If you had not run the cutover phase, you would have been able to roll back the patch application process by running the adop abort phase. However, this is not possible once cutover has been run.

There are two main parts to the restore procedure:

1. You will at least need to restore the database using the Flashback feature.
2. Depending on when cutover failed, you may also need to restore the application tier file systems.

Flashing Back the Database
----------------------------------

1. First, shut down the database, then start it up in mount state:
SQL>shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL>startup mount
ORACLE instance started.

2. Flash back the database to the specified restore point:
SQL>flashback database to restore point BEFORE_CUTOVER;
Flashback complete.

3. Open the database in read-only mode:
SQL>alter database open read only;
Database altered.
Check that everything looks as expected.

4. Shut down the database, start it up in mount state, then open it with the resetlogs option:
SQL>shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL>startup mount
ORACLE instance started.
Total System Global Area 2142679040 bytes
Fixed Size 1346140 bytes
Variable Size 520095140 bytes
Database Buffers 1593835520 bytes
Redo Buffers 27402240 bytes
Database mounted.
SQL>alter database open resetlogs;
Database altered.

5. Disable flashback:
SQL>alter database flashback off;
Database altered.

6. Drop the restore point:
SQL>drop restore point BEFORE_CUTOVER;
Restore point dropped.

7. Clear the recovery file destination (this disables the Fast Recovery Area):
SQL>alter system set db_recovery_file_dest='';
System altered.

8. Confirm that Flashback has been deactivated:
SQL>select FLASHBACK_ON from v$database;
FLASHBACK_ON
------------
NO
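
For the check in step 3 (while the database is open read-only and before the restore point is dropped in step 6), a few queries along the following lines may help. This is only a sketch of possible sanity checks. In particular, it assumes that cutover changes the database default edition, so that after a successful flashback the default edition should again be the pre-cutover run edition.

```sql
-- Confirm the database is open read-only
SELECT open_mode FROM v$database;

-- Confirm the restore point that was flashed back to, with its SCN and time
SELECT name, scn, time
FROM   v$restore_point
WHERE  name = 'BEFORE_CUTOVER';

-- Show the current default edition; compare it with the run edition
-- that was in effect before cutover was started
SELECT property_value AS default_edition
FROM   database_properties
WHERE  property_name = 'DEFAULT_EDITION';
```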

Restoring the File Systems
---------------------------------

Whether you need to perform this step is conditional, depending on whether cutover failed before the file systems were switched. You can identify which of these cases applies by referring to the cutover logs in $NE_BASE/EBSapps/log/adop/<current_session_id>/cutover_<timestamp>/ for your current session id.

Case 1 - If the log messages indicate that cutover failed before the file systems were switched, do a clean shutdown of any services that are running. Then restart all the services using the normal startup script, and go to Section 6.

Case 2 - If the log messages indicate that cutover failed after the file systems were switched, follow Section 5.1 to shut down any services that have started from the new run file system, then follow Section 5.2 to switch the file systems back. After that, go to Section 6.

Section 5.1 Shut down services started from new run file system

* Source the environment on the new run file system.
* From $ADMIN_SCRIPTS_HOME, shut down all the services (using adstpall.sh on UNIX).
* In a multi-node environment, repeat the preceding two steps on all nodes, leaving the admin node until after all the slave nodes.

Section 5.2 Switch file systems back

* On all nodes where file systems have been switched, run the following command to switch the file systems back:
$ perl $AD_TOP/patch/115/bin/txkADOPCutOverPhaseCtrlScript.pl \
-action=ctxupdate \
-contextfile=<full path to new run context file> \
-patchcontextfile=<full path to new patch file system context file> \
-outdir=<full path to out directory>
* Start up all services from the old run file system (using adstrtal.sh on UNIX).
* In a multi-node environment, repeat the preceding two steps on all nodes, starting with the admin node and then proceeding to the slave nodes.

Section 6. Options and Next Steps
After the restore is complete, you have two basic options for proceeding:

* Abort the current patching cycle, if the issue that required you to restore was caused by the patches you were attempting to apply.
* Identify and fix any other issues in the current patching cycle, and proceed with patching.

Reference: My Oracle Support (formerly Metalink) Doc ID 1584097.1