Friday, December 15, 2006

Morale at the workplace

It never ceases to amaze me how "official" our upper-admins try to be. First, we caught the wave of hiring and listening to consultants (Gartner Group comes to mind); I have never seen any benefit attributed to those exercises, except several departmental reorganizations, musical chairs, and the top folks getting richer. Thankfully, some small bit of sanity has returned in the last year or two, but there is still a mentality of "running a good business" that does not translate well when trickled down through the ranks. The latest "trend" is getting certified in ethics, sexual harassment awareness and security protocols. The same thing, year after year, and at a time when our elected state officials are the worst role models in the world.

But I digress...

Morale. Like a perfectionistic parent, it often seems that the message to us in the trenches leans towards "don't do this" or "do this". There is little positive reinforcement for jobs well done, and even fewer visible efforts to draw the organization together: skimpy community-building events, a lackluster performance evaluation system that leads to general confusion about raises, and even greater confusion about priorities. Too much negative emphasis, too much micromanagement.

The people who stand out are those who lean towards workaholism, whether because they feel compelled to work harder or for some other reason. The average Joe is essentially ignored, and the "operating procedures" from on high can be seen as oppressive measures.

I love my job. I love figuring out problems, exploring new features of the RDBMS, and working with others on a project. Unfortunately, those projects all too often seem like black holes to me - hopefully, surely, somebody has a clear idea why we are investing our time in them. Not just "because someone wants it", but how each one advances our company's goals.

I love the people I work with; a diverse group in terms of culture, talent, personality and ambition. All sorts of strange conversations come up. =) But better yet, it helps to develop me as a person and DBA.

I guess it would be nice if things were more fun. We work hard, and often the rules we have to follow feel like punishment. I wonder if this is to keep the rebellious few in line, or because some talking head suggested that it was better this way. Whatever it is, it certainly makes me glad I am not running the show.

Monday, November 27, 2006

Wow

What else can I say? =) It would be a miracle if I passed. Wow. In my defense, our test started about 5 hours later than the stated, published time. And for those who have been through that grueling experience, you know that such a schedule would amplify the difficulty. If I do pass, I will have enormous bragging rights! But, believe you me, I did not pass.

Friday, November 17, 2006

Getting ready for the plunge

This will be my last post before I sit for the OCM. I have mixed feelings about it - the 9i OEM tools are littered with bugs and holes, and while the tool (if you can call it that) may be good for some specific scenarios, I have come to detest it. Especially the java interface, which is cheapened even more on the RedHat platform (some buttons are truncated or altogether missing). On the good side, I have gotten to know RMAN a whole lot better. Reading Robert Freeman's book (and running into him on oracle-l) has been good, and Oracle has improved RMAN so much since 8i that I am actually impressed. It took them a while, but they finally came out with something that is more helpful than painful.

Another thing that concerns me is that I am not a good test-taker. It helps tremendously that this particular Practicum will be scenario based - it will almost be like "normal work". Except no socializing, no music, no Google, and the snacks are .... well, they will sustain me at least. So, not quite like work, but the actual "doing it" will be.

I also look forward to networking. There will be 6 others taking the Practicum with me, and it will be interesting to hear about their backgrounds. And assuming I pass, I am excited about meeting the community of Masters out there. If I do not pass, I will just have to wait a bit longer. *grin*

Based on the public information available about the Practicum, and anticipating an NDA/gag-order, here is what I believe the test will be like (all of this is from memory):

  • Primary focus on "normal" activities
    • install software (not sure how much of this we will have to do, since it takes time)
    • setup environment
      • setup OEM
      • setup RMAN
      • setup/configure database (different flavors, different purposes)
      • setup network files
    • backup/recovery
      • "recover from any failure scenario" is rather intimidating
      • Time is an important resource, so a cold backup every hour is not going to work
    • tune performance
    • manage database
      • add/drop tablespaces, tables, objects
      • adjust storage specs as needed
  • Secondary focus on specific features
    • shared/dedicated servers
    • standby database
    • partitioned tables
    • data access
    • security, including vpd/fgac
    • auditing
  • Tertiary focus on more advanced things that do not really work well (the GUI sucks!)
    • Advanced Replication
    • Resource Management
    • Possibly Dataguard
It is my theory that if you can do the Primary things perfectly and the Secondary things fairly, then the last group is completely optional. I could be wrong, but this is what my intuition tells me. I guess we will see how good that semi-conscious knowledge is.

It sounds like backup/recovery might be a big issue. I have been practicing on a really small database, which means that a full backup and restore does not take that long. In the real world we do incremental backups, but a recovery from an incremental may still have to restore the full backups. Logical corruption will be a bit more interesting - I plan to use db_block_checking and rman's "check logical" to help in that regard, but there may still be things like dropped data (flashback!) or dropped tables, and there may be a requirement to recover just that one object. On a larger database, that will be interesting.

I am thinking that the storage is moderate; it will be a medium-sized database (perhaps 20-50 GB), with enough room for a couple of full backups, but not much more. If the database is smaller, that would be awesome - smaller means faster backups and easier manipulation. Perhaps the database will start small but grow as the days go on.
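
For what it is worth, here is a minimal sketch of the kind of thing I have been practicing (the scott.emp table and the 15-minute window are made up for illustration, and the alter system assumes 9i, where db_block_checking is simply TRUE/FALSE):

RMAN> CONFIGURE CONTROLFILE AUTOBACKUP ON;
RMAN> BACKUP INCREMENTAL LEVEL 0 CHECK LOGICAL DATABASE PLUS ARCHIVELOG;

-- and for the logical-corruption and dropped-data angles, from sqlplus:
SQL> alter system set db_block_checking = true;
SQL> select * from scott.emp as of timestamp (systimestamp - interval '15' minute);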

There is a blurb in the Oracle documentation about the Practicum that says one should use so-called "best practices" and Oracle tools to speed things up, because you are on the clock. I feel pretty confident that I can do most things reasonably fast, but I do have doubts in the back of my head. I can set up a standby database using rman (and, using the same technology, create a clone for replication or whatever else). I can use OEM for most of the basic tasks, but I worry about how fast that will actually be, as opposed to just using the command line, which I am more comfortable with. For instance, one of the bullet points for the exam prep is "Use OEM to modify a database configuration". You mean like "alter system set optimizer_index_caching = 80"? Do you realize how many button clicks that is? 8, with a wall-clock time of 83 seconds (which obviously depends on how fast your computer is).
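
To illustrate the rman-for-standby point, here is a rough sketch from my practice runs (the prod/stby service names and password are made up, and it assumes the standby's init.ora, listener plumbing and a nomount auxiliary instance are already in place):

RMAN> CONNECT TARGET sys/secret@prod
RMAN> CONNECT AUXILIARY sys/secret@stby
RMAN> DUPLICATE TARGET DATABASE FOR STANDBY DORECOVER;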

In closing, I also wonder how different the 10g OCM is. Could I take that now? I have been working with 10g for a little while now, and know my way around EM/Grid Control. In some respects, it is much better. One of the downsides is that the new EM is so comprehensive that one can easily get lost in unfamiliar territory. See my previous posts about trying to manipulate the Maintenance Window. *grin* I am not going to switch my course of action now - I am sticking with 9i. But I do wonder... perhaps I can get a steep discount and take the 10g OCM in the near future.

And finally, I remind myself that there are many, many smart people out there who have not even taken the OCM simply because they do not see a good reason for it. If/when I am a fully certified Master, I will still be aware of my low spot on the totem pole. The folks on oracle-l, askTom, the Oracle forums (and even the dwellers in the Metalink Forums), and a slew of other places are extremely smart, and I learn from them on a daily basis. I hope to be able to contribute something, but holding some pretentious title does not mean that I know everything.

Friday, November 10, 2006

Buggy bugs

This past Sunday (05-NOV-2006) we installed patch 4752541 (Intermittent PLS-306 / ORA-1722 / ORA-1858 under load) to fix a problem our webapp was having with overloaded procedure calls. We were looking golden up until Wednesday, when our Production system started spiking on library cache latch waits (log file sync waits were in there as well). Being good little Oracle DBAs, we filed a case with Oracle Support, and learned a bit, but nothing really concrete. That evening we hit another major slowdown. It was decided to yank out the Sunday patch because it appeared related, and we thought it best to be safe. So here we are, Friday, with no more critical slowdowns but still hitting the original webapp errors. We now have 3 SRs open with Oracle, 4 engineers working on them, two "team leads"/duty managers keeping tabs, and a cell-phone number for the Director of Oracle Support.
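
For the curious, the spikes were plain to see in the wait interface; something along these lines is how we watched them (a simplified sketch - on 9i the library cache latch hides under the generic 'latch free' event):

SQL> select event, total_waits, time_waited
from v$system_event
where event in ('latch free', 'log file sync')
order by time_waited desc;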

All this to say that Oracle is really, very complex. We still do not even know for sure whether the patch caused the slowdown; all we have is circumstantial evidence. This also showcases why you get a 10gR2 patchset that is rife with bug fixes and has a footprint as large as a baseline install. 10.2.0.3 promises to be more of the same, meaning that 10.2.0.2 had little to no impact on the total number of bugs. Granted, 10.2.0.2 did fix a large number of problems, but it looks like the sum of the ones that slipped through the cracks and the newly introduced ones (a la "Buggy bugs") roughly equals the number fixed.

Just to set the record straight, I only complain and gripe about software I love.

Monday, October 30, 2006

Oracle has not yet learned the definition of "unbreakable"

Given:

oracle.com on October 30, 2006 [screenshot]

Given:

unbreakable


un·break·a·ble (ŭn-brā′kə-bəl)
adj.

1. Impossible to break; able to withstand rough usage: unbreakable plates.
2. Able to withstand an attempt to break. Used of a horse.


Given:

Metalink SR tool on Monday, October 30, 2006 [screenshot]

Thursday, October 26, 2006

Oracle announces Enterprise Support for Linux, for half the price

This is sad. This exemplifies the statement "You get what you pay for."

Much has been said, and much is still being said, about this poor decision, but my favorite so far is Dave Dargo's Bull*%$#!

Wednesday, October 25, 2006

Support Analyst caught asking Tom

I found this very interesting.
http://asktom.oracle.com/pls/ask/f?p=4950:8:::::F4950_P8_DISPLAYID:707586567563
Search for "We have noticed several system tables where"

Keeping that in mind, here is my question in SR# 5684375.992 on 11-AUG-06:
This SR is for information only. We have noticed several system tables where dba_tab_col_statistics reports HEIGHT BALANCED histograms where bucket_cnt > distcnt (count of distinct values). It was my understanding that a similar problem was fixed in 10.2.0.1 (dbms_stats method_opt size n generates a FREQUENCY histogram where n >= NDV). However, even more perplexing is why do we have many height balanced histogram buckets on a column with only 1 distinct value?

SQL> select ENDPOINT_NUMBER, ENDPOINT_VALUE
from dba_histograms
where table_name = 'INDPART$'
and column_name = 'TS#';

ENDPOINT_NUMBER ENDPOINT_VALUE
--------------- --------------
217 79


So, some questions:
1. in sys.hist_head$, how is bucket_cnt calculated?
2. what determines how many buckets a column gets?

Is it just me, or did my support analyst ask my question on AskTom, verbatim?

While I like Tom's cavalier attitude in general (a la "and you don't need/want to be querying hist_head$, just use the normal views like the rest of us :)"), I sure wish he had answered the question.

Another interesting point: this question was posed on August 14th, the same day I got fed up with the Support Analyst and asked to switch to someone else. I liked the 2nd guy much better! You could give him awards. But that first guy… wow, he needs help.

As I said, just very interesting. I wonder how many other Support Analysts I can find on AskTom.

Tuesday, October 24, 2006

OCM, histograms and more Oracle

I debated about how publicly I wanted to announce that I am taking the OCM. Seeing as how nobody reads this blog, if anyone ever finds out, it will be after the fact. My impression of the OCM is that it is rather tough to take, but is mostly a PR thing - Oracle gets an easy $2000, and in return, I get bragging rights and a denim shirt. Looks good for our HR department, though. The thing is, there are a lot of really smart DBAs out there who could easily pass the OCM and simply do not have the time or inclination to do so. I think there is also the stigma that a certified Master should never make mistakes and should know everything. *laugh* Don't expect that from me.

So, on to histograms. I am still beleaguered by how best to describe the data. I have tossed the idea of unlimited histogram buckets to Oracle, but the rub I get back is that the overhead is not worth it. Overhead = physical storage + cost of getting bucket information. I will be the first to admit that my number theory is not the strongest - in fact, I am one of the worst kinds of methodical scientists because I am not, well, methodical. I just want to try out a 10-million bucket histogram and see for myself if it is actually worth it. In talking to Wolfgang Breitling, I see there are other ramifications to consider as well, like join cardinalities. Call me naive, but it seems that if you have 100% accurate histograms (no matter how large or how many), your join cardinalities will be 100% accurate as well. That is the problem with a little bit of knowledge - there is so much more you do not know, and you do not necessarily know what you are missing. *grin*

More Oracle: Actually, I should say, more SCT. We do a lot of work with SCT (now called SunGard Higher Education, or SHE) BANNER, an ERP from California. As with any monstrous ERP, there are a number of issues, but probably the worst is that they were the lowest bidder, and we are their largest customer. Do the math. Good for them, bad for us. And those of us who do the performance diagnostics and troubleshooting affirm that every day. Again, going back to Wolfgang, I wish there were an outside expert/consultant in the realm of Oracle/BANNER, a real Application DBA. At this point in my career, I am saying to myself "That's what I want to be when I grow up." Maybe not necessarily for SCT/SHE (and they sorely need one!), but .... maybe.

Another dream I have is to get inside the guts of the Oracle kernel. This is hard to do when
  1. you don't work for Oracle
  2. you despise the scientific method
  3. you play a lot of video games
But, I find that I have a number of questions. Questions I would love to document here, when I get a chance to kick myself in the butt and update this blog more regularly. Questions of how things work, and why they were designed that way. I am fully confident that a majority of the design decisions were made by very smart people who think, breathe and live databases. But sometimes you just wonder.

So if you know of any rogue Oracle developers that teach Internal classes, let me know. *grin*

Thursday, September 21, 2006

A Study in Irony

Metalink note #388422.1

Oracle says one of the best things you can do in regards to early detection of corrupted blocks is turn on db_block_checking. But wait, what is this I see?
Bug 4684074 OERI:510 / block corruption (ORA-1578) with DB_BLOCK_CHECKING

Just to make sure that db_block_checking does actually find corrupted blocks, it will do the dirty work for you.
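
For reference, turning it on is a one-liner (on 9i it is just TRUE/FALSE; 10.2 splits it into OFF/LOW/MEDIUM/FULL):

SQL> alter system set db_block_checking = true;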

Monday, September 11, 2006

Oracle bug status codes

10: need more information from customer (dev)
11: under development review / research
16: under BDE / platform review & research
30: need more information (BDE)
35: engineer needs to review
36: duplicate issue
52: pending approval of platform line (PL)

The lure of histograms

I had been doing a lot of research into histograms. I scooped up a lot of references (which I will list later), and filled my head with no shortage of verbiage. Our current dilemma is, "Does one introduce more problems than one solves by proactively creating targeted histograms (frequency histograms on non-unique columns)?" I am of the opinion that the benefits outweigh the risks. But I am being cautious, because people smarter than I (e.g., Jonathan Lewis, Wolfgang Breitling, Dave Ensor) all exercise extreme prejudice and use histograms as sparingly as possible. Or at least, that is my understanding. I firmly believe that if you tell the CBO absolutely all the facts about your data and resources, you will get the optimal plan 100% of the time. Which implies that if you fail to tell the truth 100% of the time, you run the chance of getting a "bad" plan, or even the "worst" plan.
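
To be concrete about what I mean by a targeted frequency histogram, here is a sketch with a made-up table and column (ORDERS and STATUS are hypothetical); asking for at least as many buckets as there are distinct values is what tips dbms_stats into building a FREQUENCY histogram:

SQL> begin
  dbms_stats.gather_table_stats(
    ownname    => user,
    tabname    => 'ORDERS',                      -- made-up table
    method_opt => 'FOR COLUMNS STATUS SIZE 254'  -- non-unique, low-NDV column
  );
end;
/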

And then there is the whole topic of equi-depth, or "height balanced", histograms. Granted, there will always be a limit to the number of buckets you can use - currently 254 in Oracle 10.2.0.2. If you have more distinct values than max buckets, you will have to resort to summarizing your data as best you can. This is a vast, grey area. On the one hand, you can shoot in the dark, sample your data and hope for the best (which works most of the time). A slightly better method would be to describe your most popular values. However, the whole exercise is moot if your business needs require specific information about the least-popular values. This is the part that makes equi-depth histograms so messy. There is no silver bullet.
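
A quick way to see which way dbms_stats went for a given column - frequency when the buckets cover every distinct value, height balanced otherwise (the made-up ORDERS table again; note the HISTOGRAM column is new in 10g, so on 9i you have to compare the counts yourself):

SQL> select column_name, num_distinct, num_buckets, histogram
from dba_tab_col_statistics
where table_name = 'ORDERS';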

For this reason alone, I wish Oracle would vastly increase the number of histogram buckets one could collect, so as to avoid the whole guessing game. You have to admit, it is much easier to calculate frequency histograms than to dive into your data and business logic to determine which data most needs to be described. Yes, it would also be more resource intensive, but my point is that it would be easier on the DBA.

Tuesday, August 15, 2006

First note

I wanted to test out the new beta features of Blogger, and I realized that my old Oracle-related blog was suffering from some withdrawal, so I am hoping to merge the old with the new at some point in the near future.