Friday, October 26, 2007

Redo log gaps with RAW disk: There has got to be an easier way...

Of course, I admit right off the bat that we are still new to RAC, ASM and raw disks. I have been asking around trying to find out what the "common practice" is for resolving redo log gaps between a primary and standby database, specifically when using raw disk. RAC just makes this scenario a little more complicated, since you have two redo streams to keep an eye on.

So, here is the beef of my complaint. Let's say your archive log destinations are explicitly set for the standby SERVICE, letting the other destination "default" to the db_create_file_dest, which just happens to be the +DATA diskgroup under ASM (a RAW disk). As an aside, I am curious why it does not default to the db_recovery_file_dest, but that is a post for another day. Let's say your standby is down for whatever reason, and eventually you want to bring it back up. Oh yeah, there is no Data Guard, so forget about "automatic" gap resolution. How are you going to copy the archive logs from the primary to the standby?
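For reference, the setup described above might look roughly like this. This is just a sketch; the service name standby_db and the choice of destination number 2 are hypothetical stand-ins, not our actual configuration:

```sql
-- Hypothetical settings for the scenario above: dest_2 ships redo to the
-- standby service, while local archiving is left unset so it "defaults"
-- to db_create_file_dest (the +DATA diskgroup).
ALTER SYSTEM SET log_archive_dest_2 = 'SERVICE=standby_db' SCOPE=BOTH SID='*';
ALTER SYSTEM SET log_archive_dest_state_2 = 'ENABLE' SCOPE=BOTH SID='*';
```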

There are no copy commands in ASM. At least, not for copying from a diskgroup to a device managed by a different file system, say, a cooked filesystem. One could use dd, but I would have to do a lot of reading to figure out the offsets, not to mention that the archive logs will most likely not be laid out in contiguous blocks that you can gobble up with one dd command. That leaves RMAN. RMAN stands for Recovery Manager. You are going to use the Recovery Manager to copy files. Oh joy.

I am still searching for the best way to do this, but this is what I have now. Mostly for my own reference. =)

First, find the last log sequence number that was applied to the standby. Add 1. Call this minseq. If you have multiple instances, call this minseqN, where N = the instance number.
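One hedged way to find that number, assuming the standby can be queried while mounted (v$log_history reflects what has been applied there):

```sql
-- On the standby: last applied sequence per thread, plus one.
-- With multiple instances, each row is the minseqN for that thread.
SELECT thread#, MAX(sequence#) + 1 AS minseq
  FROM v$log_history
 GROUP BY thread#;
```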

On the primary site:
rman target /
RMAN> backup archivelog from logseq :minseq;

Again, if using multiple instances:
rman target /
RMAN> backup archivelog from logseq :minseq1 thread 1;
...
RMAN> backup archivelog from logseq :minseqN thread N;

[ from now on, I will assume one instance because it is easier to type ]

Next, restore the archive logs to a cooked filesystem. In this case, I am going to use a standard ufs3 mount called /u02/backup.
RMAN> run {
2> set archivelog destination to '/u02/backup/';
3> restore archivelog from logseq :minseq;
4> }

Your archive logs are now "normal" files that can be accessed with normal OS commands. Copy them over to the standby site (we use scp). Wherever they end up on the standby site, I find it helpful to make a listing (including full path), one file per line. You will see why in a moment.
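The listing step might look something like the following sketch. Every name here (directory, file names) is a stand-in, and the touch commands just fake the transferred logs for illustration; the point is the ls format, one absolute path per line:

```shell
# Fake a couple of transferred archive logs, purely for illustration;
# in reality these would have arrived via scp from the primary.
mkdir -p /tmp/standby_logs
touch /tmp/standby_logs/arch_1_100.arc /tmp/standby_logs/arch_1_101.arc

# Build the listing: one absolute path per line, so each line can be
# copied and pasted whole at the recover prompt.
ls -1 /tmp/standby_logs/*.arc > /tmp/loglist.txt
cat /tmp/loglist.txt
```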

On the standby site, recover the standby database:
SQL> recover automatic standby database;

When you are prompted for an archive log, select the appropriate member from the list you made; you can copy the whole line and paste it at the prompt. Repeat until you cannot find any more entries on your list.

And voilà! You have resolved the archive log gap. Finally.


Some notes
If you know of a better way to do this, please let me know. =) I would love to do the whole thing from RMAN, but I have not figured out how to recover from specific archive logs; I keep getting the unhelpful "RMAN-06054: media recovery requesting unknown log:", even if I specify an archivelog destination. Obviously, I need to learn more about this aspect of RMAN.

I left out a small fact; we are not using a recovery catalog for this operation. Why not? In retrospect, it probably would have made things easier. But since we are migrating away from a recovery catalog, we started without one. If it turns out that using a recovery catalog would help us significantly, I would make a case to implement it. I believe it would have resolved the RMAN-06054 above, but I am not yet positive about that.

10 comments:

jason said...

hello,

I wonder why you went for storing the archive redo logs in ASM?

I'm running RAC and sending each node's archived redo thread to the other node (cross-instance archiving) and locally as well. That way each node has the archived redo from both nodes on its local cooked filesystem.

If I ever need to resolve gaps on my standby, I have the files for both nodes available. And if one node fails, I still have the archived redo for both nodes.

Raj said...

You mentioned that there is no Data Guard, so, no automatic gap resolution. However, from what I could read/see, 10g onwards, it is Data Guard only.

If that's incorrect, please let me know.

Charles Schultz said...

Jason,

We are scratching our heads about that as well. I would prefer to have a shared OCFS2 mount for archived logs. Still thinking about how we want to move forward with that.

Raj,

Whoops. I meant data guard broker. Yes, of course, data guard by itself is alive and well.

Thanks to Oracle for making this so confusing. *grin*

Anonymous said...

LOG_ARCHIVE_DEST_10 is implicitly set to USE_DB_RECOVERY_FILE_DEST (meaning that archived redo log files will be sent to the flash recovery area) if you create a recovery area and do not set any other local archiving destinations.

---
DG primary will use LOG_ARCHIVE_DEST_10 for local archiving.

Anonymous said...

can you use this to transfer files?

http://download.oracle.com/docs/cd/B28359_01/appdev.111/b28419/d_ftran.htm#ARPLS095

This is available in 10g also.

Charles Schultz said...

anon, in your last comment, the url is truncated. 'Tis a blogspot bug, I believe.

Charles Schultz said...

One of our DBAs just emailed a workaround using DBMS_FILE_TRANSFER.COPY_FILE. It works quite well.
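For the record, a minimal sketch of that workaround, assuming a directory object on each side of the copy. The directory paths and file names here are hypothetical; note that COPY_FILE requires the file size to be a multiple of 512 bytes, which archived logs satisfy:

```sql
-- Directory objects pointing at the ASM archive location and a cooked
-- filesystem; both paths below are made-up examples.
CREATE DIRECTORY asm_arch    AS '+DATA/mydb/archivelog/2007_10_26';
CREATE DIRECTORY cooked_arch AS '/u02/backup';

BEGIN
  DBMS_FILE_TRANSFER.COPY_FILE(
    source_directory_object      => 'ASM_ARCH',
    source_file_name             => 'thread_1_seq_100.263.637000000',
    destination_directory_object => 'COOKED_ARCH',
    destination_file_name        => 'arch_1_100.arc');
END;
/
```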

Anonymous said...

Check Oracle documentation and search for FAL.

You can also enable ftp access to ASM - look for "XML DB Virtual Folder"

Charles Schultz said...

Anon, FAL works sometimes but with no Broker, it ain't exactly foolproof. Besides, it does not resolve all the issues in the post.

The XML Virtual Folder seems rather interesting, but also rather limited; it has been a while since we used FTP, but that seems like a good place to start if we were to go that route. Have you seen any other more modern APIs for Virtual Folder, like basic shell commands via ssh/scp?

Thanks again. You have given me something to research a little.

Anonymous said...

Hi there,

Thanks for the note. I had another try for copying ASM archive log to disk.. Here's the log,

RMAN> copy archivelog '+ORARECV1_DG/inpub/archivelog/2008_07_18/thread_1_seq_2829.2686.660426029' to '/staging/oracle/peterh/tt.arc';

Starting backup at 18-JUL-08
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=123 instance=INPUB1 devtype=DISK
channel ORA_DISK_1: starting archive copy
input archive log thread=1 sequence=2829 recid=5317 stamp=660426030
output filename=/staging/oracle/peterh/tt.arc recid=5389 stamp=660435035
channel ORA_DISK_1: archivelog copy complete, elapsed time: 00:00:02
Finished backup at 18-JUL-08

Cheers,
Peter