Friday, December 21, 2007

Data Pump Rocks!

For the past week, we dove into Data Pump with a vengeance. I had never played with it before; I think I may have tried a table export or two, but nothing really serious. So until last week I had no clue what a huge improvement Data Pump is over the "traditional" exp/imp. Of course, it goes without saying that there will be bugs and gotchas, but that pretty much comes with the territory. Sad but true.

The reason we engaged in this activity in the first place is that our ERP vendor ostentatiously decided that they want to support international characters, which means that all of us state-side have to upgrade regardless of any character-set needs of our own. They didn't ask me! *grin* But the silver lining is that we have learned so much about Data Pump.

Just for the sake of numbers, we were able to pump the data from a 478 GB database to a set of dumpfiles totaling 175 GB in about 3 hours, and then turn around and pump that back into an empty database in about 4 hours or so. Yes, there are some hidden truths in there. For instance, we completely ignored indexes; they will explode your timeframe significantly.

EXPDP
We hit some issues early on with the export. At first, we thought we were hitting the LOB issue. After filing an SR, we learned of Metalink note 286496.1, which covers tracing and an "undocumented" METRICS parameter. METRICS seems like a very handy little piece of information, so I am quite flabbergasted that it is "undocumented". Following the note, we set TRACE=480300, which turns on trace information for the Master and Worker processes. In addition, I set event 10046 to capture the waits. That was eye-opening. The database was spending an enormous amount of time waiting on "library cache pin" while creating Global Temporary Tables. Very odd. After playing ping-pong with the Support Analyst and thinking about it for a while, I realized that all of the objects experiencing the wait had FGAC enabled for VPD. Ok, one strike against me for not choosing a VPD-exempt login (i.e., one that has been granted EXEMPT ACCESS POLICY), but one strike against Data Pump for handling this rather poorly. I am hoping to hear more about this particular behavior and how it will be resolved.
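For what it is worth, a minimal sketch of the login-side workaround, assuming a dedicated export account (the dp_admin name is made up):

-- Hypothetical dedicated export account; EXEMPT ACCESS POLICY lets it bypass
-- VPD/FGAC policies instead of tripping over them during the export.
GRANT EXEMPT ACCESS POLICY TO dp_admin;

-- Quick sanity check of which objects actually carry policies:
SELECT object_owner, object_name, policy_name
  FROM dba_policies
 WHERE enable = 'YES';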

We also set our TEMP space to extend without limit and set the pools (shared pool and buffer cache) a bit higher than normal.
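In SQL terms that amounted to something like the following (the tempfile name and the pool values are placeholders, and the pools obviously have to fit under sga_max_size):

-- Let TEMP grow as needed for the duration of the export
ALTER DATABASE TEMPFILE '/u01/oradata/PROD/temp01.dbf' AUTOEXTEND ON NEXT 128M MAXSIZE UNLIMITED;

-- Temporary bumps for the pools; values are illustrative only
ALTER SYSTEM SET shared_pool_size = 1000M;
ALTER SYSTEM SET db_cache_size = 800M;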

The parameter file we used:
directory=DPUMP
dumpfile=${ORACLE_SID}_full%U.dmp
logfile=${ORACLE_SID}_full_debug.log
full=y
parallel=16
metrics=y
userid="xxx/yyy"
TRACE=480300


IMPDP
The very first thing I tried was NETWORK_LINK. For "traditional" exp/imp, we use a pipe as an intermediate file, with exp writing into the pipe while imp reads from it, instead of exp to a dump file and imp from the dump file. Since Data Pump writes its dumpfiles asynchronously, a pipe is not possible, but the alternative is to communicate via a Database Link. Unfortunately, LONG columns are not yet supported via this method, which rules this option out for us.
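For reference, the plumbing for that route is nothing more than a link on the target pointing back at the source (names and password are placeholders):

-- Created in the target database; impdp would then be run with
-- NETWORK_LINK=source_link instead of a dumpfile set.
CREATE DATABASE LINK source_link
  CONNECT TO system IDENTIFIED BY xxx
  USING 'SOURCE_TNS_ALIAS';
-- Tables with LONG columns would still have to be moved some other way.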

The next problem we encountered was that the metadata contained a tablespace specification for a non-existent tablespace on two of our partitioned tables. This turned out to be extremely counterintuitive. If you precreate the table (on the proper, existing tablespaces), IMPDP still fails trying to create the table (on the wrong, non-existent tablespace). Even if you specify TABLE_EXISTS_ACTION=TRUNCATE!! Our Support Analyst tells me that this is the expected behavior. It was not my expectation at all. To fix it, we created the tablespace and voila, we had a working import process.
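The fix boiled down to something like this (the tablespace and datafile names are invented; it only needs to exist long enough for the tables to be created):

-- Create the phantom tablespace the metadata insists on; the partitions
-- can be moved and the tablespace dropped once the import is done.
CREATE TABLESPACE old_data_ts
  DATAFILE '/u01/oradata/PROD/old_data_ts01.dbf' SIZE 100M
  AUTOEXTEND ON NEXT 100M MAXSIZE 2G;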

Lastly, we struggled for a long time with the arduous process of creating indexes. Data Pump says it is creating indexes in parallel. In reality, it is using the parallel degree clause of the CREATE INDEX statement, leaning on the RDBMS parallel server processes. This seems rather antithetical to the rest of Data Pump, especially when you consider that no matter how much you parallelize, you bottleneck at the query coordinator. I would much prefer that Data Pump use a parallel degree of 1 for the index creations, but launch multiple creations at the same time. In fact, I downloaded a script called paresh to do just that. I had to modify it a bit to use the DDL generated by the IMPDP metadata for indexes, but it seems to work quite well. I need to modify it more for error checking, but at least it creates indexes truly in parallel.
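The idea is simple; a sketch with invented index names (in our case the DDL itself comes out of the IMPDP metadata, e.g. a SQLFILE run), each statement at degree 1 but several sessions running at once:

-- Session 1:
CREATE INDEX emp_name_idx ON emp (last_name) NOLOGGING PARALLEL 1;
-- Session 2, at the same time:
CREATE INDEX emp_dept_idx ON emp (dept_id) NOLOGGING PARALLEL 1;
-- ...and so on, N sessions building N different indexes instead of
-- one index at a time fanned out across parallel slaves.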

So, with that out of the way, we are now working on a "clean" import given these exceptions. For the init.ora, we use:
nls_length_semantics = CHAR

## Import Debugging
max_dump_file_size = unlimited
# event="10046 trace name context forever, level 12"

## Import Speed-up parameters
shared_pool_size = 1000M
sga_max_size = 2000M
parallel_max_servers = 24
_disable_logging = TRUE
DB_BLOCK_CHECKSUM=FALSE ## DEFAULT = TRUE
## DISK_ASYNCH_IO=TRUE ## DEFAULT
## DB_BLOCK_CHECKING=FALSE ## DEFAULT


A note of WARNING about _disable_logging. It is an underscore parameter, so all the usual warnings apply. I found out the hard way what happens if you shutdown abort while the database is attempting to roll back SQL statements:
SMON: following errors trapped and ignored:
ORA-01595: error freeing extent (3) of rollback segment (1))
ORA-00607: Internal error occurred while making a change to a data block
ORA-00600: internal error code, arguments: [4193], [114], [232], [], [], [], [], []

More information can be found in Metalink Note 39282.1.
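Needless to say, these import-only settings have to come back out before the database sees real use. With a plain init.ora that just means editing the file; if you run off an spfile, a rough sketch would be:

ALTER SYSTEM SET db_block_checksum = TRUE SCOPE=SPFILE;
ALTER SYSTEM RESET "_disable_logging" SCOPE=SPFILE SID='*';
-- then bounce the instance cleanly (definitely not a shutdown abort)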

Our IMPDP parameter file is similar to the EXPDP one, but excludes some objects:
directory=DPUMP
dumpfile=${ORACLE_SID}_full%U.dmp
logfile=${ORACLE_SID}_full_no_indexes.log
full=y
parallel=16
metrics=y
userid="xxx/yyy"
TRACE=480300
EXCLUDE=index,constraint


We are still playing with the fastest way to migrate constraints. One thought is to do two passes, along these lines (a rough sketch of the disable/enable step follows the list):
  • import metadata
  • disable constraints
  • import data
  • enable constraints in parallel
  • build indexes in parallel
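A rough sketch of generating the disable step from the dictionary (the spool file name is arbitrary; the enable counterpart is the same query with ENABLE):

-- One ALTER TABLE per foreign key, so the data pass is free of ordering issues
SET pagesize 0 linesize 200 feedback off
SPOOL disable_fks.sql
SELECT 'ALTER TABLE ' || owner || '.' || table_name ||
       ' DISABLE CONSTRAINT ' || constraint_name || ';'
  FROM dba_constraints
 WHERE constraint_type = 'R'
   AND owner NOT IN ('SYS', 'SYSTEM');
SPOOL OFF
-- Run disable_fks.sql before the data import and the ENABLE version afterwards.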

Another thought is to disable the constraints on the source, but that may not be practical for our situation. We will see.

In the End
I am optimistic that our latest changes are going to produce some fast numbers. Of course, there are other areas that could be tweaked (place dumpfiles on independent mount points, for example), but we went with the low-hanging fruit, and I think we scored. Another outcome is that we have made contact with the Data Pump Product Manager and her supervisor, which is priceless! They are excellent people and very patient and willing to listen. Amazing folks over there!

5 comments:

Anonymous said...

I found that running impdp with exclude=statistics and then collecting statistics on the imported objects is significantly faster than waiting for impdp to import the statistics (at least on 10.2).
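For example, something along these lines (the schema name and degree are placeholders):

-- Gather fresh statistics after the import instead of carrying them over
BEGIN
  DBMS_STATS.GATHER_SCHEMA_STATS(
    ownname          => 'ERP_OWNER',   -- placeholder schema
    estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE,
    degree           => 8,
    cascade          => TRUE);         -- include the index statistics
END;
/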

Charles Schultz said...

Since we are excluding the indexes, the stats for those indexes are excluded as well. The metadata file has all those stats, which get created when I run paresh.

But thanks for the note. If you spot anything else that can help us out, be sure to let me know. =)

Roy Swonger says that some good things are coming up with 11g. And I hope we can get some of these Enhancement Requests in soon.

Unknown said...

Charles, nice posting.
I would just suggest enabling constraints with the 'novalidate' clause, assuming that they were valid in your source database. That's how I saved some (significant) time during a similar 'db refresh' process in my environment.
I would also try to create indexes by running several scripts in parallel (each of them creating its indexes with the parallel option as well). This will require some testing in order to find the most efficient number of scripts and value for the 'PARALLEL' parameter (depending on the number of CPUs and the IO layout in your environment).
Regards ...

Charles Schultz said...

Thanks Milan. We are trying your 2nd suggestion right now. I am curious, how does invalidating the constraints help in this situation? And I assume you mean in the source database, since I do not see how that would help in the target.

Thanks again,

Unknown said...

Charles,

I wasn't 'clear' enough in regards to constraints. I wouldn't invalidate any constraints, especially not in the source, production db.
This is what I'd do:
- you don't import constraints using Data Pump
- after your data are imported into the target db, you run script(s) to create/enable the constraints
- if you run 'regular' sql like 'alter table X add constraint ... enable', Oracle will validate those constraints (going through all the data, checking that they satisfy the constraint conditions) and that can take a while, especially for huge tables (this is from my experience ...)
- what I suggest is to run something like 'alter table X add constraint ... enable NOVALIDATE'. This way you tell Oracle 'my existing data are fine since they are from production, don't check them, enforce the constraints from now on'
- you can run 'validate' scripts later, just in case, if you want to make sure that everything is fine
- I would do the above for all kinds of constraints, just take care of the order in which you create them: PKs first, then unique, check, FKs, etc. ... I mean, FKs will fail if there is no PK they refer to ...

One more thing: for primary keys on huge tables, I would create the (PK) indexes during the 'create indexes' phase. If you don't, Oracle will create these indexes when it creates the primary keys, and that won't run in parallel. If you create an index on the PK columns before creating the PK, Oracle will recognize that index when it tries to create the PK and will use it ...
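To make that concrete, a rough sketch with invented table and index names (the PK is enabled against the pre-built index, while the FK goes on with NOVALIDATE):

-- Built during the parallel "create indexes" phase
CREATE INDEX emp_pk_idx ON emp (emp_id) NOLOGGING PARALLEL 1;

-- The primary key picks up the existing index instead of building its own, serial one
ALTER TABLE emp ADD CONSTRAINT emp_pk PRIMARY KEY (emp_id) USING INDEX emp_pk_idx;

-- Foreign keys go on without re-checking the freshly imported rows
ALTER TABLE emp ADD CONSTRAINT emp_dept_fk
  FOREIGN KEY (dept_id) REFERENCES dept (dept_id) ENABLE NOVALIDATE;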

Also, all of the above depends on what you want to achieve ... If you can 'survive' the refresh process taking +-30 minutes (or so), then check whether this is worth your testing time at all. But if every minute counts - I would go for it.
Hope this helped a little more than my last comment...