Data Pump scenarios
http://myorastuff.blogspot.in/2008/08/expdp-impdp.html

Data Pump is a new feature in Oracle 10g that provides fast, parallel data load and unload. With direct path and parallel execution, Data Pump is several times faster than traditional exp/imp. Traditional exp/imp runs on the client side, whereas expdp/impdp runs on the server side, so we have more control over expdp/impdp than over traditional exp/imp.

Data Pump takes longer to start up than exp/imp because it has to set up the jobs, queues, and master table. In addition, at the end of an export the master table data is written to the dump file set, and at the beginning of an import the master table is located and loaded into the schema of the user.

The following processes are involved in a Data Pump operation:
Client Process: This process is initiated by the client utility and makes calls to the Data Pump API. Once the Data Pump job is initiated, this process is not necessary for the progress of the job.

Shadow Process: When a client logs in to the database, a foreground process is created. It services the client's Data Pump API requests. This process creates the master table and the Advanced Queuing queues used for communication. Once the client process ends, the shadow process also goes away.

Master Control Process: The MCP controls the execution of the Data Pump job. There is one MCP per job. The MCP divides the job into various metadata and data load or unload tasks and hands them over to the worker processes.

Worker Process: The MCP creates worker processes based on the value of the PARALLEL parameter. Each worker process performs the tasks requested by the MCP.

Advantages of Data Pump

1. Export can run in parallel and can write to multiple files on different disks. (Specify PARALLEL=2 and the two directory names in the file specification, for example DUMPFILE=ddir1:file1.dmp,ddir2:file2.dmp.)
2. A client can attach to and detach from a job and monitor its progress remotely.
3. There are more options for filtering metadata objects, for example EXCLUDE and INCLUDE.
4. The ESTIMATE_ONLY option can be used to estimate disk space requirements before the job is performed.
5. Data can be exported from a remote database by using a database link.
6. An explicit database version can be specified, so that only supported object types are exported.
7. During impdp we can change the target file names, schema, and tablespace, for example with REMAP_DATAFILE, REMAP_SCHEMA, and REMAP_TABLESPACE.
8. Data rows can be filtered during impdp. With traditional exp/imp the row filter exists only in exp; with Data Pump it is available in both expdp and impdp.
9. Data can be imported from one database to another without writing a dump file, using the NETWORK_LINK parameter.
10. Data access methods are decided automatically. In traditional exp/imp we specify a value for the DIRECT parameter. Data Pump decides by itself: where direct path cannot be used, it falls back to the external tables access method.
11. Job status can be queried directly from the data dictionary (for example, dba_datapump_jobs, dba_datapump_sessions, etc.).
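The parallel export mentioned in advantage 1 is normally written as a parameter file. As a minimal sketch, the Python helper below turns a dictionary of parameters into parfile text. render_parfile is a hypothetical helper (it is not part of any Oracle tooling), and the directory objects ddir1 and ddir2 and the log file name are assumptions made for the example.

```python
def render_parfile(params):
    """Render a dict of Data Pump parameters as parfile text, one NAME=value per line.

    Hypothetical helper for illustration; list and tuple values are joined with commas.
    """
    lines = []
    for name, value in params.items():
        if isinstance(value, (list, tuple)):
            value = ",".join(str(v) for v in value)
        lines.append(f"{name}={value}")
    return "\n".join(lines) + "\n"


# Parallel export that writes to two directory objects (assumed names ddir1 and ddir2).
parallel_export = {
    "schemas": "scott",
    "directory": "ddir1",
    "parallel": 2,
    "dumpfile": ["ddir1:file1.dmp", "ddir2:file2.dmp"],
    "logfile": "expparallel.log",
}

print(render_parfile(parallel_export), end="")
```

The printed text can be saved to a file and passed to expdp with parfile=<file name>, in the same way as the parfiles in the scenarios that follow.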
Exp & Expdp common parameters: The parameters below exist in both the traditional exp utility and expdp.

FILESIZE
FLASHBACK_SCN
FLASHBACK_TIME
FULL
HELP
PARFILE
QUERY
TABLES
TABLESPACES
TRANSPORT_TABLESPACES (in exp the parameter is TRANSPORT_TABLESPACE and its value is Y/N; in expdp the value is the name of the tablespace)

Comparing exp & expdp parameters: The parameters below are equivalent between exp and expdp.

Exp parameter => corresponding Expdp parameter
FEEDBACK => STATUS
FILE => DUMPFILE
LOG => LOGFILE
OWNER => SCHEMAS
TTS_FULL_CHECK => TRANSPORT_FULL_CHECK
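When converting an old exp parfile by hand, the table above can serve as a lookup. The following Python sketch only renames parameters; rename_exp_parameters is a hypothetical helper, and the values usually need attention as well (for example, FILE takes a client-side path, whereas DUMPFILE is resolved against a directory object), so treat the result as a checklist rather than a finished expdp parfile.

```python
# Equivalent parameter names, taken from the exp => expdp table above.
EXP_TO_EXPDP = {
    "FEEDBACK": "STATUS",
    "FILE": "DUMPFILE",
    "LOG": "LOGFILE",
    "OWNER": "SCHEMAS",
    "TTS_FULL_CHECK": "TRANSPORT_FULL_CHECK",
}


def rename_exp_parameters(exp_params):
    """Return a new dict in which legacy exp names are replaced by expdp names.

    Names are upper-cased; parameters without an entry in the table keep
    their name. Values are copied unchanged and may still need editing.
    """
    return {
        EXP_TO_EXPDP.get(name.upper(), name.upper()): value
        for name, value in exp_params.items()
    }


old = {"owner": "scott", "file": "scott.dmp", "log": "scott.log", "full": "n"}
print(rename_exp_parameters(old))
```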
New parameters in expdp Utility

ATTACH: Attaches the client session to an existing Data Pump job.
CONTENT: Specifies what to export (ALL, DATA_ONLY, METADATA_ONLY).
DIRECTORY: Location to write the dump file and log file.
ESTIMATE: Shows how much disk space each table in the export job consumes.
ESTIMATE_ONLY: Estimates the space but does not perform the export.
EXCLUDE: List of objects to be excluded.
INCLUDE: List of objects to be included.
JOB_NAME: Name of the export job.
KEEP_MASTER: Specify Y to keep the master table after the export.
NETWORK_LINK: Specifies the database link used to export from a remote database.
NOLOGFILE: Specify Y if you do not want a log file to be created.
PARALLEL: Specifies the maximum number of threads for the export job.
VERSION: Database objects that are incompatible with the specified version are not exported.
ENCRYPTION_PASSWORD: If a table has encrypted columns and no password is specified, the column data is written as clear text in the dump file set. Any string can be used as the password for this parameter.
COMPRESSION: Specifies whether to compress metadata before writing it to the dump file set. There are two values, METADATA_ONLY and NONE; the default is METADATA_ONLY. Use NONE to disable compression during expdp.
SAMPLE: Allows you to specify a percentage of data to be sampled and unloaded from the source database. The sample_percent indicates the probability that a block of rows will be selected as part of the sample.

Imp & Impdp common parameters: The parameters below exist in both the traditional imp utility and impdp.

FULL
HELP
PARFILE
QUERY
SKIP_UNUSABLE_INDEXES
TABLES
TABLESPACES

Comparing imp & impdp parameters: The parameters below are equivalent between imp and impdp.

imp parameter => corresponding impdp parameter
DATAFILES => TRANSPORT_DATAFILES
DESTROY => REUSE_DATAFILES
FEEDBACK => STATUS
FILE => DUMPFILE
FROMUSER => SCHEMAS, REMAP_SCHEMA
IGNORE => TABLE_EXISTS_ACTION (SKIP, APPEND, TRUNCATE, REPLACE)
INDEXFILE, SHOW => SQLFILE
LOG => LOGFILE
TOUSER => REMAP_SCHEMA

New parameters in impdp Utility

FLASHBACK_SCN: Performs an import that is consistent with the specified SCN of the source database. Valid only when the NETWORK_LINK parameter is used.
FLASHBACK_TIME: Similar to FLASHBACK_SCN, but Oracle finds the SCN closest to the specified time.
NETWORK_LINK: Performs the import directly from a source database, using the database link named in the parameter. No dump file is created on the server when this parameter is used. To get a consistent export from the source database, we can use the FLASHBACK_SCN or FLASHBACK_TIME parameters. These two parameters are valid only together with NETWORK_LINK.
REMAP_DATAFILE: Changes the name of a source database data file to a different name in the target.
REMAP_SCHEMA: Loads objects into a different target schema.
REMAP_TABLESPACE: Changes the name of a source tablespace to a different name in the target.
TRANSFORM: Specifies that the storage clause should not be generated in the DDL for import. This is useful if the storage characteristics of the source and target databases differ. The valid names are SEGMENT_ATTRIBUTES and STORAGE. With a value of N, STORAGE omits the storage clause from the CREATE statement DDL, whereas SEGMENT_ATTRIBUTES omits the physical attributes, tablespace, logging, and storage attributes. The syntax is TRANSFORM=name:boolean_value[:object_type], where boolean_value is Y or N. For instance, TRANSFORM=storage:N:table.
ENCRYPTION_PASSWORD: Required on an import if an encryption password was specified on the export.

CONTENT, INCLUDE, and EXCLUDE are the same as in expdp.
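The TRANSFORM syntax described above, name:boolean_value[:object_type], can be illustrated with a small parser. This is a sketch in Python, not Oracle code: parse_transform is a hypothetical helper, and it only accepts the two transform names listed in this article.

```python
VALID_TRANSFORMS = {"SEGMENT_ATTRIBUTES", "STORAGE"}  # the two names listed in this article


def parse_transform(spec):
    """Split a TRANSFORM value such as 'storage:N:table' into its parts.

    Returns (name, enabled, object_type); object_type is None when omitted.
    """
    parts = spec.split(":")
    if len(parts) not in (2, 3):
        raise ValueError(f"expected name:boolean_value[:object_type], got {spec!r}")
    name, flag = parts[0].upper(), parts[1].upper()
    if name not in VALID_TRANSFORMS:
        raise ValueError(f"unknown transform name: {parts[0]!r}")
    if flag not in ("Y", "N"):
        raise ValueError(f"boolean_value must be Y or N, got {parts[1]!r}")
    object_type = parts[2].upper() if len(parts) == 3 else None
    return name, flag == "Y", object_type


print(parse_transform("storage:N:table"))
```

Read this way, TRANSFORM=storage:N:table means: for objects of type TABLE, do not generate the storage clause.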
Prerequisite for expdp/impdp: Set up the dump location in the database.

    system@orcl> create directory dumplocation
      2  as 'c:/dumplocation';

    Directory created.

    system@orcl> grant read,write on directory dumplocation to scott;

    Grant succeeded.

    system@orcl>

Let us try the expdp and impdp utilities in different scenarios. We have two databases, orcl and ordb. All the scenarios below were tested on Oracle 10g R2.

Scenario 1
Export the whole orcl database.

Export parfile content:
    userid=system/password@orcl
    dumpfile=expfulldp.dmp
    logfile=expfulldp.log
    full=y
    directory=dumplocation

Scenario 2
Export the scott schema from orcl and import it into the ordb database. During the import, exclude some objects (sequence, view, package, cluster, table). Load the objects that came from the RES tablespace into the USERS tablespace in the target database.

Export parfile content:
    userid=system/password@orcl
    dumpfile=schemaexpdb.dmp
    logfile=schemaexpdb.log
    directory=dumplocation
    schemas=scott

Import parfile content:
    userid=system/password@ordb
    dumpfile=schemaexpdb.dmp
    logfile=schemaimpdb.log
    directory=dumplocation
    table_exists_action=replace
    remap_tablespace=res:users
    exclude=sequence,view,package,cluster,table:"in('LOAD_EXT')"

Scenario 3
Export the part_emp table from the scott schema in the orcl instance and import it into the ordb instance.

Expdp parfile content:
    userid=system/password@orcl
    logfile=tableexpdb.log
    directory=dumplocation
    tables=scott.part_emp
    dumpfile=tableexpdb.dmp

Impdp parfile content:
    userid=system/password@ordb
    dumpfile=tableexpdb.dmp
    logfile=tabimpdb.log
    directory=dumplocation
    table_exists_action=REPLACE

Scenario 4
Export only specific partitions of the part_emp table from the scott schema in orcl and import them into the ordb database.

Expdp parfile content:
    userid=system/password@orcl
    dumpfile=partexpdb.dmp
    logfile=partexpdb.log
    directory=dumplocation
    tables=scott.part_emp:part10,scott.part_emp:part20

If we want to overwrite the exported data in the target database, we first need to delete the rows with deptno in (10,20) from the part_emp table.

    scott@ordb> delete part_emp where deptno=10;

    786432 rows deleted.

    scott@ordb> delete part_emp where deptno=20;

    1310720 rows deleted.

    scott@ordb> commit;

    Commit complete.

Impdp parfile content:
    userid=system/password@ordb
    dumpfile=partexpdb.dmp
    logfile=tabimpdb.log
    directory=dumplocation
    table_exists_action=append

Scenario 5
Export only the tables in the scott schema from orcl and import them into the ordb database.

Expdp parfile content:
    userid=system/password@orcl
    dumpfile=schemaexpdb.dmp
    logfile=schemaexpdb.log
    directory=dumplocation
    include=table
    schemas=scott

Impdp parfile content:
    userid=system/password@ordb
    dumpfile=schemaexpdb.dmp
    logfile=schemaimpdb.log
    directory=dumplocation
    table_exists_action=replace
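The name filter used with EXCLUDE in Scenario 2, table:"in('LOAD_EXT')", is easy to mistype because of the nested quotes. As a minimal sketch, the hypothetical Python helper object_filter below builds such a clause from an object type and a list of names. Upper-casing the names is an assumption that fits objects created without quoted identifiers.

```python
def object_filter(object_type, names=()):
    """Build an INCLUDE/EXCLUDE clause such as table:"in('EMP','DEPT')".

    Hypothetical helper. Names are upper-cased on the assumption that the
    objects were created with unquoted identifiers; with no names, only the
    object type is returned.
    """
    if not names:
        return object_type
    quoted = ",".join("'" + name.upper().replace("'", "''") + "'" for name in names)
    return f'{object_type}:"in({quoted})"'


# Rebuild the exclude line of the Scenario 2 import parfile.
print("exclude=" + ",".join([
    "sequence", "view", "package", "cluster",
    object_filter("table", ["load_ext"]),
]))
```

Keeping such clauses in a parfile, as the scenarios do, avoids having to escape the double quotes on the operating system command line.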
Scenario 6
Export only the rows belonging to departments 10 and 20 in the emp and dept tables from the orcl database. Import the dump file into the ordb database. While importing, load only deptno 10 into the target database.

Expdp parfile content:
    userid=system/password@orcl
    dumpfile=data_filter_expdb.dmp
    logfile=data_filter_expdb.log
    directory=dumplocation
    content=data_only
    schemas=scott
    include=table:"in('EMP','DEPT')"
    query="where deptno in(10,20)"

Impdp parfile content:
    userid=system/password@ordb
    dumpfile=data_filter_expdb.dmp
    logfile=data_filter_impdb.log
    directory=dumplocation
    schemas=scott
    query="where deptno = 10"
    table_exists_action=APPEND

Scenario 7
Export the scott schema from the orcl database and split the dump file into 50MB pieces. Import the dump files into the ordb database.

Expdp parfile content:
    userid=system/password@orcl
    logfile=schemaexp_split.log
    directory=dumplocation
    dumpfile=schemaexp_split_%U.dmp
    filesize=50M
    schemas=scott
    include=table

With the above expdp parfile, schemaexp_split_01.dmp is created first. Once that file reaches 50MB, the next file, schemaexp_split_02.dmp, is created. If the total dump size is 500MB, ten dump files of 50MB each are created.

Impdp parfile content:
    userid=system/password@ordb
    logfile=schemaimp_split.log
    directory=dumplocation
    dumpfile=schemaexp_split_%U.dmp
    table_exists_action=replace
    remap_tablespace=res:users
    exclude=grant

Scenario 8
Export the scott schema from the orcl database and split the dump file into four files. Import the dump files into the ordb database.

Expdp parfile content:
    userid=system/password@orcl
    logfile=schemaexp_split.log
    directory=dumplocation
    dumpfile=schemaexp_split_%U.dmp
    parallel=4
    schemas=scott
    include=table

With the above parfile, four files are created initially: schemaexp_split_01.dmp, schemaexp_split_02.dmp, schemaexp_split_03.dmp, and schemaexp_split_04.dmp. Notice that the substitution variable %U is incremented for each new file. Since there is no FILESIZE parameter, no more files are created.

Impdp parfile content:
    userid=system/password@ordb
    logfile=schemaimp_split.log
    directory=dumplocation
    dumpfile=schemaexp_split_%U.dmp
    table_exists_action=replace
    remap_tablespace=res:users
    exclude=grant

Scenario 9
Export the scott schema from the orcl database and split the dump file into three files. The dump files are stored in three different locations. This method is especially useful if you do not have enough space in one file system to perform the complete expdp job. After the export succeeds, import the dump files into the ordb database.

Expdp parfile content:
    userid=system/password@orcl
    logfile=schemaexp_split.log
    directory=dumplocation
    dumpfile=dump1:schemaexp_%U.dmp,dump2:schemaexp_%U.dmp,dump3:schemaexp_%U.dmp
    filesize=50M
    schemas=scott
    include=table

With the above expdp parfile, the dump files are placed in three different locations (dump1, dump2, and dump3 are directory objects that must already exist). If the entire expdp dump is 1500MB, it creates 30 dump files of 50MB each and places 10 files in each file system.

Impdp parfile content:
    userid=system/password@ordb
    logfile=schemaimp_split.log
    directory=dumplocation
    dumpfile=dump1:schemaexp_%U.dmp,dump2:schemaexp_%U.dmp,dump3:schemaexp_%U.dmp
    table_exists_action=replace

Scenario 10
We are on the orcl database server. Export the ordb data and place the dump file on the orcl database server. After expdp succeeds, import the dump file into the orcl database. When we use network_link, the expdp user and the source database schema user should have identical privileges. If the privileges are not identical, we get the error below.

    C:\impexpdp>expdp parfile=networkexp1.par

    Export: Release 10.2.0.1.0 - Production on Sunday, 17 May, 2009 12:06:40

    Copyright (c) 2003, 2005, Oracle. All rights reserved.

    Connected to: Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Production
    With the Partitioning, OLAP and Data Mining options
    ORA-31631: privileges are required
    ORA-39149: cannot link privileged user to non-privileged user

Expdp parfile content:
    userid=scott/tiger@orcl
    logfile=netwrokexp1.log
    directory=dumplocation
    dumpfile=networkexp1.dmp
    schemas=scott
    include=table
    network_link=ordb

With the above parfile, expdp exports the ordb database data and places the dump file on the orcl server, because we are running expdp on the orcl server. This is essentially an export of data from a remote database.

Impdp parfile content:
    userid=system/password@orcl
    logfile=networkimp1.log
    directory=dumplocation
    dumpfile=networkexp1.dmp
    table_exists_action=replace

Scenario 11
Export the scott schema from orcl and import it into ordb, but do not write a dump file on the server. The expdp and impdp work should be completed without writing a dump file.

Here we do not need a separate export step. We can import the data without creating a dump file: we run impdp on the ordb server, and it contacts the orcl database through the database link, extracts the data, and imports it into the ordb database. If there is not enough space in the file system to hold the dump file, we can use this option to load the data.

Impdp parfile content:
    userid=scott/tiger@ordb
    network_link=orcl
    logfile=networkimp2.log
    directory=dumplocation
    table_exists_action=replace

Scenario 12
Export the scott schema from orcl and import the dump file into the training schema in the ordb database.

Expdp parfile content:
    userid=scott/tiger@orcl
    logfile=netwrokexp1.log
    directory=dumplocation
    dumpfile=networkexp1.dmp
    schemas=scott
    include=table

Impdp parfile content:
    userid=system/password@ordb
    logfile=networkimp1.log
    directory=dumplocation
    dumpfile=networkexp1.dmp
    table_exists_action=replace
    remap_schema=scott:training

Scenario 13
Export a table from the orcl database and import it into ordb. When exporting, export only 20 percent of the table data. We use the SAMPLE parameter to accomplish this.

The SAMPLE parameter allows you to export subsets of data by specifying the percentage of data to be sampled and exported. The sample_percent indicates the probability that a block of rows will be selected as part of the sample. It does not mean that the database will retrieve exactly that amount of rows from the table. The value you supply for sample_percent can be anywhere from .000001 up to, but not including, 100. If no table is specified, the sample_percent value applies to the entire export job. The SAMPLE parameter is not valid for network exports.
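To make the point about probability versus exact amounts concrete, here is a small simulation in Python. It does not call Oracle. It assumes a simplified model in which every block is kept independently with probability sample_percent/100; the function name sample_blocks and the block count of 10,000 are made up for illustration.

```python
import random


def sample_blocks(block_count, sample_percent, seed=None):
    """Return the indexes of the blocks kept by an independent per-block draw.

    Simplified model of block sampling (an assumption, not Oracle's
    implementation): each block is kept with probability sample_percent / 100.
    """
    if not (0.000001 <= sample_percent < 100):
        raise ValueError("sample_percent must be at least .000001 and below 100")
    rng = random.Random(seed)
    return [b for b in range(block_count) if rng.random() < sample_percent / 100]


for seed in (1, 2, 3):
    kept = sample_blocks(10_000, 20, seed=seed)
    print(f"seed {seed}: kept {len(kept)} of 10000 blocks")
```

Under this model each run keeps roughly, but rarely exactly, 2000 blocks, which is why SAMPLE=20 should not be expected to return exactly 20 percent of the rows.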
Expdp parfile content:
    userid=system/password@orcl
    dumpfile=schemaexpdb.dmp
    logfile=schemaexpdb.log
    directory=dumplocation
    tables=scott.part_emp
    SAMPLE=20

With the above expdp parfile, only 20 percent of the data in the part_emp table is exported.

Impdp parfile content:
    userid=system/password@ordb
    dumpfile=schemaexpdb.dmp
    logfile=schemaimpdb.log
    directory=dumplocation
    table_exists_action=replace
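Returning to the dump file arithmetic in Scenarios 7 and 9, the file counts quoted there (10 files for 500MB, 30 files for 1500MB) follow from dividing the total size by FILESIZE. The Python sketch below reproduces that calculation and the resulting file names. It rests on two assumptions that are labelled in the code: the dumpfile templates are used in rotation, and each template keeps its own two-digit %U counter starting at 01. Real dump sets also carry some overhead, so actual counts can differ slightly. dump_file_names is a hypothetical helper.

```python
import math


def dump_file_names(total_mb, filesize_mb, templates):
    """List the dump files needed to hold total_mb of data in filesize_mb pieces.

    Assumptions for this sketch: templates are used in rotation, and each
    template keeps its own two-digit %U counter starting at 01.
    """
    file_count = math.ceil(total_mb / filesize_mb)
    counters = [0] * len(templates)
    names = []
    for i in range(file_count):
        slot = i % len(templates)
        counters[slot] += 1
        names.append(templates[slot].replace("%U", f"{counters[slot]:02d}"))
    return names


# Scenario 7: 500MB in 50MB pieces, one template.
scenario7 = dump_file_names(500, 50, ["schemaexp_split_%U.dmp"])
print(len(scenario7), scenario7[0], scenario7[-1])

# Scenario 9: 1500MB in 50MB pieces, three directory objects.
scenario9 = dump_file_names(
    1500, 50,
    ["dump1:schemaexp_%U.dmp", "dump2:schemaexp_%U.dmp", "dump3:schemaexp_%U.dmp"],
)
print(len(scenario9), sum(name.startswith("dump1:") for name in scenario9))
```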
Managing Data Pump jobs

The Data Pump clients expdp and impdp provide an interactive command interface. Since each expdp and impdp operation has a job name, you can attach to that job from any computer and monitor or adjust the job. Here are the Data Pump interactive commands.

ADD_FILE: Adds another file or a file set to the DUMPFILE set.
CONTINUE_CLIENT: Changes mode from interactive client to logging mode.
EXIT_CLIENT: Leaves the client session and discontinues logging, but leaves the current job running.
KILL_JOB: Detaches all currently attached client sessions and terminates the job.
PARALLEL: Increases or decreases the number of threads.
START_JOB: Starts (or resumes) a job that is not currently running. The SKIP_CURRENT option can skip the most recent failed DDL statement that caused the job to stop.
STOP_JOB: Stops the current job; the job can be restarted later.
STATUS: Displays the detailed status of the job; the refresh interval can be specified in seconds. The detailed status is displayed on the output screen but not written to the log file.

Scenario 14
Let us start a job, stop it in the middle, and resume it. After some time, let us kill the job, checking the job status after every activity.

We can find out which jobs are currently running in the database by using the query below.

    SQL> select state,job_name from dba_datapump_jobs;

    STATE                          JOB_NAME
    ------------------------------ ------------------------------
    EXECUTING                      SYS_IMPORT_FULL_01

    SQL>

    C:\impexpdp>impdp parfile=schemaimp1.par

    Import: Release 10.2.0.1.0 - Production on Sunday, 17 May, 2009 14:06:51

    Copyright (c) 2003, 2005, Oracle. All rights reserved.

    Connected to: Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Production
    With the Partitioning, OLAP and Data Mining options
    Master table "SYSTEM"."SYS_IMPORT_FULL_01" successfully loaded/unloaded
    Starting "SYSTEM"."SYS_IMPORT_FULL_01": parfile=schemaimp1.par
    Processing object type SCHEMA_EXPORT/TABLE/TABLE

    Import> stop_job
    Are you sure you wish to stop this job ([yes]/no): yes

    C:\impexpdp>

When we want to stop the job, we need to press Control-C to return to the Import> prompt. Once the prompt (Import>) is displayed, we can stop the job with the stop_job command as shown above.

After the job is stopped, here is the job status.

    SQL> select state,job_name from dba_datapump_jobs;

    STATE                          JOB_NAME
    ------------------------------ ------------------------------
    NOT RUNNING                    SYS_IMPORT_FULL_01
    SQL>

Now we attach to the job again. Attaching to the job does not restart it.

    C:\impexpdp>impdp system/password@ordb attach=SYS_IMPORT_FULL_01

    Import: Release 10.2.0.1.0 - Production on Sunday, 17 May, 2009 14:17:11

    Copyright (c) 2003, 2005, Oracle. All rights reserved.

    Connected to: Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Production
    With the Partitioning, OLAP and Data Mining options

    Job: SYS_IMPORT_FULL_01
    Owner: SYSTEM
    Operation: IMPORT
    Creator Privs: FALSE
    GUID: 54AD9D6CF9B54FC4823B1AF09C2DC723
    Start Time: Sunday, 17 May, 2009 14:17:12
    Mode: FULL
    Instance: ordb
    Max Parallelism: 1
    EXPORT Job Parameters:
    CLIENT_COMMAND parfile=schemaexp1.par
    IMPORT Job Parameters:
    Parameter Name Parameter Value:
    CLIENT_COMMAND parfile=schemaimp1.par
    TABLE_EXISTS_ACTION REPLACE
    State: IDLING
    Bytes Processed: 1,086,333,016
    Percent Done: 44
    Current Parallelism: 1
    Job Error Count: 0
    Dump File: c:/impexpdp\networkexp1.dmp

    Worker 1 Status:
    State: UNDEFINED

    Import>

After attaching to the job, here is the job status.

    SQL> select state,job_name from dba_datapump_jobs;

    STATE                          JOB_NAME
    ------------------------------ ------------------------------
    IDLING                         SYS_IMPORT_FULL_01

    SQL>

Attaching to the job does not resume it. Now we resume the job.

    Import> continue_client
    Job SYS_IMPORT_FULL_01 has been reopened at Sunday, 17 May, 2009 14:17
    Restarting "SYSTEM"."SYS_IMPORT_FULL_01": parfile=schemaimp1.par

    SQL> select state,job_name from dba_datapump_jobs;

    STATE                          JOB_NAME
    ------------------------------ ------------------------------
    EXECUTING                      SYS_IMPORT_FULL_01
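The state changes seen in this scenario so far, together with the kill_job step that follows, can be summarised as a small lookup table. The Python sketch below records only the transitions that appear in this transcript; it is not a complete model of Data Pump job states, and the names OBSERVED_TRANSITIONS and replay are made up for illustration.

```python
# Job states and commands as they appear in the Scenario 14 transcript.
# None stands for "no row left in dba_datapump_jobs".
OBSERVED_TRANSITIONS = {
    ("EXECUTING", "stop_job"): "NOT RUNNING",
    ("NOT RUNNING", "attach"): "IDLING",
    ("IDLING", "continue_client"): "EXECUTING",
    ("EXECUTING", "kill_job"): None,
}


def replay(state, commands):
    """Apply the commands in order and return the list of states visited."""
    visited = [state]
    for command in commands:
        key = (state, command)
        if key not in OBSERVED_TRANSITIONS:
            raise ValueError(f"{command!r} was not shown from state {state!r}")
        state = OBSERVED_TRANSITIONS[key]
        visited.append(state)
    return visited


print(replay("EXECUTING", ["stop_job", "attach", "continue_client", "kill_job"]))
```

Each entry in the returned list corresponds to one of the dba_datapump_jobs query results shown in this scenario.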
    SQL>

Now we kill the same job. Before we kill it, we need to press Control-C to return to the Import> prompt.

    Import> kill_job
    Are you sure you wish to stop this job ([yes]/no): yes

    C:\impexpdp>

Now the job has disappeared from the database.

    SQL> select state,job_name from dba_datapump_jobs;

    no rows selected

    SQL>

Posted by Govind at 12:30 PM
Labels: ATTACH, COMPRESSION, CONTINUE_CLIENT, create directory, dbms_datapump, ENCRYPTION_PASSWORD, EXCLUDE, expdp, impdp, KEEP_MASTER, KILL_JOB, NETWORK_LINK, REMAP_SCHEMA, SAMPLE, TABLE_EXISTS_ACTION

3 comments:
Pratap said...
Hi Govind,
Thanks for sharing the knowledge. However, one of the issues I found in the article is as below:
the transport_tablespace's' parameter exists only in expdp, its value being the tablespace name, whereas exp has transportable_tablespace, its value being (Y/N).
Thanks and Regards,
Pratap
July 5, 2009 at 9:23 PM

Govind said...
Pratap, thank you for visiting my blog. I corrected the article, and thank you for your input. I highly appreciate your valuable correction.
September 14, 2009 at 6:49 PM

waheed said...
excellent info on expdp and impdp scenarios...