Monday, November 27, 2006

Wow

What else can I say? =) It would be a miracle if I passed. Wow. In my defense, our test started about 5 hours later than the published time. And for those who have been through that grueling experience, you know that such a schedule amplifies the difficulty. If I do pass, I will have enormous bragging rights! But, believe you me, I did not pass.

Friday, November 17, 2006

Getting ready for the plunge

This will be my last post before I sit for the OCM. I have mixed feelings about it - the 9i OEM tools are littered with bugs and holes, and while the tool (if you can call it that) may be good for some specific scenarios, I have come to detest it. Especially the Java interface, which is cheapened even more on the Red Hat platform (some buttons are truncated or altogether missing). On the good side, I have gotten to know RMAN a whole lot better. Reading Robert Freeman's book (and running into him on oracle-l) has been good, and Oracle has improved RMAN so much since 8i that I am actually impressed. It took them a while, but they have finally come out with something that is more helpful than painful.

Another thing that concerns me is that I am not a good test-taker. It helps tremendously that this particular Practicum will be scenario-based - it will almost be like "normal work". Except no socializing, no music, no Google, and the snacks are .... well, they will sustain me at least. So, not quite like work, but the actual "doing it" will be.

I also look forward to networking. There will be 6 others taking the Practicum with me, and it will be interesting to hear about their backgrounds. And assuming I pass, I am excited about meeting the community of Masters out there. If I do not pass, I will just have to wait a bit longer. *grin*

Based on the public information available about the Practicum, and anticipating an NDA/gag-order, here is what I believe the test will be like (all of this is from memory):

  • Primary focus on "normal" activities
    • install software (not sure how much of this we will have to do, since it takes time)
    • set up environment
      • set up OEM
      • set up RMAN
      • set up/configure database (different flavors, different purposes)
      • set up network files
    • backup/recovery
      • "recover from any failure scenario" is rather intimidating
      • Time is an important resource, so a cold backup every hour is not going to work
    • tune performance
    • manage database
      • add/drop tablespaces, tables, objects
      • adjust storage specs as needed
  • Secondary focus on specific features
    • shared/dedicated servers
    • standby database
    • partitioned tables
    • data access
    • security, including vpd/fgac
    • auditing
  • Tertiary focus on more advanced things that do not really work well (the GUI sucks!)
    • Advanced Replication
    • Resource Management
    • Possibly Data Guard

It is my theory that if you can do the Primary things perfectly and the Secondary things fairly well, then the last group is completely optional. I could be wrong, but this is what my intuition tells me. I guess we will see how good that semi-conscious knowledge is.

It sounds like backup/recovery might be a big issue. I have been practicing on a really small database, which means that a full backup and restore does not take that long. In the real world, we do incremental backups, but a recovery from an incremental may still have to restore the full backup first. Logical corruption will be a bit more interesting - I plan to use db_block_checking and RMAN's "check logical" to help in that regard, but there still may be things like dropped data (flashback!) or dropped tables, and there may be a requirement to recover just that object. On a larger database, that will be interesting.

I am thinking that the storage is moderate; it will be a medium-sized database (perhaps 20-50 GB), with enough room for a couple of full backups, but not much more. If the database is smaller, that would be awesome - smaller means faster backups and easier to manipulate. Perhaps the database will start small but grow as the day goes on.
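
As a rough illustration of the logical-corruption and dropped-data ideas above, here is a minimal SQL sketch. The table emp, the deptno filter, and the 30-minute window are made up for the example, and it assumes a 9i Release 2 instance where db_block_checking can be changed on the fly. RMAN's "check logical" is an RMAN command rather than SQL, so it is not shown.

```sql
-- Sketch only: emp, deptno = 10 and the 30-minute window are hypothetical.

-- Ask the instance to verify block contents as blocks are changed.
-- Assumes the parameter is dynamically modifiable on this release;
-- expect some extra overhead on DML.
ALTER SYSTEM SET db_block_checking = TRUE;

-- Flashback query: read rows as they looked 30 minutes ago.
-- This only works while the needed undo is still available.
SELECT *
  FROM emp AS OF TIMESTAMP (SYSTIMESTAMP - INTERVAL '30' MINUTE)
 WHERE deptno = 10;
```

A flashback query like this can help with deleted or changed rows; a dropped table is a different problem and would most likely need a restore of some kind.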

There is a blurb in the Oracle documentation about the Practicum that says one should use so-called "best practices" and Oracle tools to speed things up, because you are on the clock. I feel pretty confident that I can do most things reasonably fast, but I do have doubts in the back of my head. I can set up a standby database using RMAN (and using the same technology, create a clone for replication or whatever else). I can use OEM for most of the basic tasks, but I worry about how fast that will actually be, as opposed to just using the command line, which I am more comfortable with. For instance, one of the bullet points for the exam prep is "Use OEM to modify a database configuration". You mean like "alter system set optimizer_index_caching = 80"? Do you realize how many button clicks that is? 8, with a wall clock time of 83 seconds (which obviously depends on how fast your computer is).
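
For comparison with the 8 clicks, a command-line version of that parameter change might look something like the sketch below. It assumes a SQL*Plus session with the right privileges and an instance started from an spfile; the value 80 is just the one quoted above.

```sql
-- Sketch only: assumes a privileged SQL*Plus session and an spfile.

-- Change the setting for the current session only.
ALTER SESSION SET optimizer_index_caching = 80;

-- Record the value in the server parameter file so it applies
-- after the next instance startup.
ALTER SYSTEM SET optimizer_index_caching = 80 SCOPE = SPFILE;

-- Confirm what the instance is using right now.
SELECT name, value
  FROM v$parameter
 WHERE name = 'optimizer_index_caching';
```

Whether the parameter can also be changed instance-wide without a restart depends on the release, so the sketch sticks to the two forms that do not rely on that.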

In closing, I also wonder how different the 10g OCM is. Could I take that now? I have been working with 10g for a little while now, and know my way around EM/Grid Control. In some respects, it is much better. One of the downsides is that the new EM is so comprehensive that you can easily get lost if you are in unfamiliar territory. See my previous posts about trying to manipulate the Maintenance Window. *grin* I am not going to switch my course of action now - sticking with 9i. But I do wonder... perhaps I can get a steep discount and take the 10g OCM in the near future.

And finally, I remind myself that there are many, many smart people out there who have not even taken the OCM just because they do not see a good reason for it. If/when I am a fully certified Master, I will still know my low spot on the totem pole. The folks on oracle-l, askTom, the Oracle forums (and even the dwellers in the MetaLink forums) and a slew of other places are extremely smart, and I learn from them on a daily basis. I hope to be able to contribute something, but having some pretentious title does not mean that I know everything.

Friday, November 10, 2006

Buggy bugs

This past Sunday (05-NOV-2006) we installed patch 4752541 (Intermittent PLS-306 / ORA-1722 / ORA-1858 under load) to fix a problem our webapp was having on overloaded procedure calls. We were looking golden up until Wednesday, when our Production system started spiking on library cache latch waits (and log file sync waits were in there as well). Being good little Oracle DBAs, we filed a case with Oracle Support, and learned a bit, but nothing really concrete. That evening we hit another major slowdown. It was decided to yank out the Sunday patch because it did seem related to the slowdowns and we thought it best to be safe. So here we are, Friday, with no more critical slowdowns but still hitting the original webapp errors. We now have 3 SRs open with Oracle, 4 engineers working on them, two "team leads"/duty managers keeping tabs, and a cell-phone number for the Director of Oracle Support.

All this to say that Oracle is really very complex. We still do not even know for sure if the patch caused the slowdown; all we have is the circumstantial evidence. This also showcases why you get a 10gR2 patchset that is rife with bug fixes and has a footprint that is as large as a baseline install. 10.2.0.3 promises to be more of the same, meaning that 10.2.0.2 had little to no impact on the number of bugs. Granted, 10.2.0.2 did fix a large number of problems, but it looks like the bugs that slipped through the cracks plus the newly introduced ones (à la "Buggy bugs") add up to about as many as were fixed.

Just to set the record straight, I only complain and gripe about software I love.