Tuesday, June 12, 2007

RAC class, day 2

We had a long day today, so this will be significantly shorter than yesterday's update. The good news is that we had a ton of hands-on experience today; almost 80% pure labs and very little lecture. Just the way I like it. =) We installed and configured Clusterware, ASM, the Agent and the database software (but not the database, yet). Amazing stuff.

Yesterday's news about Grid Control R3 was extremely exciting, but when I got home I was disappointed. I specifically asked if he meant 10.2.0.3, and Andy specifically said "No, Release 3". Well, he referred us to oracle.com/enterprise_manager, and what do you see? 10.2.0.3. I installed 10.2.0.3 into our Production Grid Control 2 weeks ago. Oh well. At least it is encouraging to hear that the version we have installed is receiving so much good press. In fact, Andy is of the opinion that Release 3 is the only version worth getting. Not quite sure what that means.

Since we were heavy into labs today, I did not take many notes. Also, we are spending an extra hour at class these next couple days, not to mention that I am going in early tomorrow. Bottom line, short post so I can get to bed. *grin*

But here are some tidbits. This will probably be old hat for those who already have experience with ASM. Keep in mind that we are working with Grid Control Release 2 (some of the things we did today are packaged much better in Release 3, or so I am told).

Order of installation:
- Clusterware
- ASM
- Oracle Software
- Management Agent
- Database

It is possible to use sqlplus to shut down all instances in a cluster, but unwise (use srvctl or GC instead). Shutting down each node individually is like an instance failure; read yesterday's post about remastering locks.
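
To make the srvctl route a little more concrete, here is a minimal Python sketch that only builds the command line and does not run anything. The database name "orcl" and the helper name are made up for illustration; `srvctl stop database -d <name> -o <option>` is the form I understand 10g to accept, so check it against your own release before relying on it.

```python
def srvctl_stop_database(db_name, stop_option="immediate"):
    """Build the argument list for stopping all instances of one clustered database.

    The helper is hypothetical; it only assembles the command, it does not execute it.
    """
    if not db_name:
        # srvctl has to be told which database to act on
        raise ValueError("srvctl needs the database name")
    return ["srvctl", "stop", "database", "-d", db_name, "-o", stop_option]


# "orcl" is a placeholder database name, not one from the class.
cmd = srvctl_stop_database("orcl")
print(" ".join(cmd))  # prints: srvctl stop database -d orcl -o immediate
```

Keeping the command as a list means it could later be handed to `subprocess.run` on a cluster node without any shell quoting worries.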

Since ASM only allows one datafile per tablespace, might as well use BIGFILEs.

As we were setting up user equivalence (a fancy way of saying ssh key-pairs), I got to thinking "Why is this so complicated, so prone to error?" There is so much more that Oracle could do to automate the preparatory steps, even if only to provide a script to do it. Given all the automation that Oracle Grid Control does (heck, you can provision a Gold Image onto bare metal!!), I am surprised these little things slip through the cracks sometimes.

RAC redo logs experience significantly more i/o than those of non-clustered databases. Andy recommends that the redo logs be placed on physically isolated diskgroups. While you are at it, why not make them raw (since you are never going to back them up or recover them anyway, right?).

Back to Undo segments. Still on the thought that Undos are nothing more than Public Redo in a LMTS, Andy went to great lengths to demonstrate how to eliminate "Snapshot too old" errors. To boil it down and steal his thunder, the simple solution is to double the size of your UNDO tablespace. Or quadruple it. However, the longer answer involved more of Andy's famous pictures, and I learned a lot about Undo segments and how transactions use them. More later (hopefully).
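
For a rough feel of why "just make it bigger" works, here is a small Python sketch of the commonly cited undo sizing estimate: retention time multiplied by undo blocks generated per second multiplied by the block size. This is not something Andy showed in class. The numbers below (900 seconds of retention, 100 undo blocks per second, 8 KB blocks) are invented for illustration; real values would come from UNDO_RETENTION and the UNDOBLKS column of V$UNDOSTAT.

```python
def undo_bytes_needed(undo_retention_s, undo_blocks_per_s, block_size_bytes):
    """Estimate undo space as retention * undo blocks per second * block size."""
    return undo_retention_s * undo_blocks_per_s * block_size_bytes


# Hypothetical workload: 900 s retention, 100 undo blocks/s, 8 KB blocks.
needed = undo_bytes_needed(900, 100, 8192)
print(needed / 1024**2)  # prints: 703.125 (MiB)

# Doubling the retention you want to survive doubles the space required.
print(undo_bytes_needed(1800, 100, 8192) / needed)  # prints: 2.0
```

The estimate ignores overhead, so treat the result as a floor rather than a target.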

As we were wrapping up, Andy talked about why srvctl needs the name of the database as a parameter. This is because your OCR (Oracle Cluster Registry) is a solo act, there is only one; and the OCR has information for all the clustered databases on the node. However, I am thinking to myself, "Hmm... our development hosts have 40 or 50 databases, but sqlplus does not need the database name". Srvctl really came back with a vengeance; it is going to take us a while to get used to it. I think it is conceptually cool, but it adds an unwelcome layer of complexity.

Ciao

2 comments:

The Human Fly said...

It is easy to set up RAC by following the documentation when you have a server ready with all the patches and the network configuration in place.
If I am not wrong, ASM can be installed during database creation.

Jaffar

Charles Schultz said...

Yes, most of it is "easy", if you do things in exactly the right order. As K Gopal has said many times, if you follow the instructions religiously, it is hard to go wrong.

We ran into some trickiness when we tried to remove the ASM from OCR. That turned out to be really difficult, especially since we have not covered ASM or the CRS management tools (that will come on Friday). Even looking ahead in our books, we needed "external" help, which came in the form of a Metalink note.