Wednesday, June 20, 2007

ceterum censeo

No, I am not advocating that Carthage must be destroyed. I think someone already beat me to that anyway.

But I did want to encapsulate what I can "bring home" from the RAC experience I had in Chicago. And I am obligated to include a sumptuous feast at Fogo de Chão. Vegetarians may wish to avert their eyes for the rest of this paragraph. I had heard that the salad bar was really good, stocked with fresh veggies, side dishes galore, deli meats, seafood, etc. I glanced at it once, from my chair on the other side of the room, and I saw a lot of colorful food. The rest of my attention was systematically focused on obtaining delicious cuts of meat from the passing "Gaúcho chefs". The, umm.... "chefs" were not really "Brazilian cowboys" at all (obviously), but it certainly sounds more impressive than "waiter". And since that is the worst of the cheese, I could live with that. But the meat! Wow! Slabs of hot, aromatic sirloin, lamb, pork, chicken or sausage passed by our table at least once a minute. They ask how well done you want it, and they carve it off. And this is not your typical beef buffet; these are high quality cuts that are grilled in ways that Outback can only dream of. After a while, you can narrow down the choices to what you like and wave off the rest. The gaucho guys just keep coming back, again and again. I treated my dad for Father's Day; I decided that is the way to do it. =)

So, is that a good segue to bringing home the "meat" of the RAC class?

First and foremost, RAC is not a panacea. Yes, it scores a lot of cool points, but do not let that fool you into thinking it will solve your problems. The Oracle salespeople are pushing it hard because it is expensive, not because of how well it helps you attain your goals. If anything, RAC is probably best suited to a niche market: customers who have applications that are well designed for parallelism, or at the very least, DML segregation.

After you swallow that pill, most everything else is rosy. One can work around performance issues (i.e., a customer wannabe who thinks he is in the niche market) by decreasing or eliminating bottlenecks. Where will our bottlenecks be? That is probably one of the hardest questions to answer at this stage of the game, because a portion of the application is still being developed. Keep in mind that our initial foray into this field will be via a highly visible, yet low-load, online program. So, here are some items I think we should start with, so that we do not have to worry about them in the future.

  1. As much as practical, maximize disk I/O for redo logs and controlfiles. Put them on independent, fast, highly available disks (perhaps DMX in RAID10).
  2. Provide expansion capabilities for the interconnect, either by allowing more (multiplexing) or swapping in a bigger pipe.
  3. With regard to the portions of the overall application that we have direct responsibility for, work hard to focus on making the DML either segregated or parallel. Do not merely copy Banner coding methods, which would have horrifying results in the long run.
  4. Be generous with the buffer cache.
  5. We need to decide how we want to move forward with application failover. Is it good enough to implement TAF? Or do we go with the Cadillac of FAN (using ONS)? Personally, I think FAN is like asking an electromechanical physicist to invent an automated device to rapidly accelerate a 2mm thick, 4" x 4" sheet of nylon towards a living Musca domestica. Some people refer to that as a "fly swatter".

In the context of administration, I think our group has to prepare for the coming paradigm shift. We use Grid Control a little, mostly for monitoring a couple of databases here and there. That is going to change with RAC, where all the "easy" tools are distributed in a not-so-user-friendly fashion throughout Enterprise Manager. Not only that, but we are going to have to get used to the concept of connecting to an instance vs administering the database. We will have to learn srvctl, crsctl, ocrconfig, crs_stat, .... you get the picture. RAC is not merely a database with two different memory regions running two different sets of background processes; we have Clusterware, OCFS2 and ASM to monkey with. RAC does not increase the number of headaches by a factor of 2. No, it is more ambitious than that. Try a factor of 10.

I wrote in an earlier entry that we should probably take advantage of Resource Manager. Emphasis on probably. As with any and all new technologies, we have to seriously consider our business needs, and determine if the business needs drive the requirements for said technology. I am of the opinion that any "business need" that is fashioned into a generic statement like "we need 5 9's" should be fed into a shredder, burned, dissolved in acid, boiled in plasma and sent on its merry way to Sol. Ergo, ceterum censeo.
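To put a number on why a bare "5 9's" is not a requirement by itself, here is a rough back-of-the-envelope sketch in Python. It assumes a 365-day year and counts every minute of outage, planned or not, against the budget; the function name is just something I made up for illustration.

```python
# Yearly downtime budget implied by "N nines" of availability.
# Assumption: a 365-day year, and all outages (planned or not) count.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes


def allowed_downtime_minutes(nines: int) -> float:
    """Return the yearly downtime budget, in minutes, for N nines."""
    unavailability = 10 ** -nines  # e.g. 5 nines -> 0.00001
    return MINUTES_PER_YEAR * unavailability


if __name__ == "__main__":
    for n in range(2, 6):
        print(f"{n} nines: {allowed_downtime_minutes(n):8.2f} minutes/year")
```

Five nines works out to a little over five minutes per year, which is less time than a single instance restart usually gets. Whether the business actually needs that, and is willing to pay for it, is the conversation to have before anyone writes the number down.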

With all the documentation we have (whitepapers from Oracle and Dell, reference manuals, purchased books, recommendations from consultants and special interest groups), I am confident that we will be able to deploy RAC using Best Practices. Deploying our application in such a manner is going to be a different story.


Noons said...

great post, Charles!
well worth reading, IMHO

Charles Schultz said...

Thanks for dropping by! Please let me know if you see any fallacies in what I write; helps me learn from my mistakes. *grin*

Anonymous said...

"we have Clusterware, OCFS2 and ASM to monkey with."

Ugh...raw disk NFS? :-) I know, I know...that was SPAM. But honestly, I still don't understand why folks subject themselves to so much pain!