Friday, October 26, 2007

I hate comment spam

I finally made it to the state where I qualify for comment spam. GRRRR. Where is the "do not call" list?

My apologies for having to turn on captcha.

Redo log gaps with RAW disk: There has got to be an easier way...

Of course, I admit right off the bat that we are still new to RAC, ASM and RAW disks. I have been asking around trying to find out what the "common practice" is for resolving redo log gaps between a primary and a standby database, specifically when using RAW disk. I think RAC just makes this scenario a little more complicated, since you have two redo streams to keep an eye on.

So, here is the beef of my complaint. Let's say your archive log destinations are explicitly set for the standby SERVICE, letting the other destination "default" to the db_create_file_dest, which just happens to be the +DATA diskgroup under ASM (a RAW disk). As an aside, I am curious why it does not default to the db_recovery_file_dest, but that is a post for another day. Let's say your standby is down for whatever reason, and eventually you want to bring it back up. Oh yeah, there is no Data Guard, so forget about "automatic" gap resolution. How are you going to copy the archive logs from the primary to the standby?
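
(For the sake of concreteness, the setup I am describing looks roughly like this; the standby service name stby is made up, and your destination number will probably differ. The local destination is simply never set, and db_create_file_dest points at +DATA.)

SQL> alter system set log_archive_dest_2='SERVICE=stby' scope=both sid='*';
SQL> show parameter db_create_file_dest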

There are no copy commands in ASM. At least, not for copying from a diskgroup to a device managed by a different file system, say, a cooked filesystem. One could use dd. But I would have to do a lot of reading to figure out what the offsets are, not to mention that the archive logs will most likely not be nicely laid out in contiguous blocks that you can gobble up with one dd command. The only option is to use RMAN. RMAN stands for Recovery Manager. You are going to use the Recovery Manager to copy files. Oh joy.

I am still searching for the best way to do this, but this is what I have now. Mostly for my own reference. =)

First, find the last log sequence number that was applied to the standby. Add 1. Call this minseq. If you have multiple instances, call this minseqN, where N is the thread (instance) number.
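
One way to find the last applied sequence (the way I have been doing it) is to ask the standby itself; v$log_history in the standby's controlfile reflects what has actually been applied, so the highest sequence per thread, plus 1, is your minseq:

SQL> select thread#, max(sequence#) from v$log_history group by thread#;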

On the primary site:
rman target /
RMAN> backup archivelog from logseq :minseq;

Again, if using multiple instances:
rman target /
RMAN> backup archivelog from logseq :minseq1 thread 1;
...
RMAN> backup archivelog from logseq :minseqN thread N;

[ from now on, I will assume one instance because it is easier to type ]

Next, restore the archive logs to a cooked filesystem. In this case, I am going to use an ordinary filesystem mount called /u02/backup.
RMAN> run {
2> set archivelog destination to '/u02/backup/';
3> restore archivelog from logseq :minseq;
4> }

Your archive logs are now "normal" files that can be accessed with normal OS commands. Copy them over to the standby site (we use scp). Wherever they end up on the standby site, I find it helpful to make a listing (including full path), one file per line. You will see why in a moment.
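
Something like this does it (the host name standby-host is made up, and I am assuming /u02/backup holds nothing but the freshly restored logs):

scp /u02/backup/* oracle@standby-host:/u02/backup/

Then, on the standby host, the listing is a one-liner:

ls -1 /u02/backup/* > /tmp/gap_logs.lst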

On the standby site, recover the standby database:
SQL> recover automatic standby database;

When you are prompted for an archive log, select the appropriate member from the list you made; you can copy the whole line and paste it into the prompt. Repeat until you cannot find any more entries on your list.
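
Before calling it done, I re-run the v$log_history query from the first step on the standby to make sure the high-water mark now covers the sequences I just shipped over:

SQL> select thread#, max(sequence#) from v$log_history group by thread#;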

And voilà! You have resolved the archive log gap. Finally.


Some notes
If you know of a better way to do this, please let me know. =) I would love to do the whole thing from RMAN, but I have not figured out how to recover from specific archive logs; I keep getting the unhelpful "RMAN-06054: media recovery requesting unknown log:", even if I specify an archivelog destination. Obviously, I need to learn more about this aspect of RMAN.

I left out a small fact: we are not using a recovery catalog for this operation. Why not? In retrospect, it probably would have made things easier. But we are migrating away from a recovery catalog, so we started without one. If it turns out that using a recovery catalog would help us significantly, I will make a case to implement it. I believe it would have resolved the RMAN-06054 above, but I am not yet positive about that.