Move or Copy DB2 Data

20 January 2009

 

Probably one of the subject areas that people ask me about most often is how to move their data from one system to another.  Some want a copy of production data on a test system.  Others need multiple copies of test databases so that different groups of developers can test without interfering with each other.  Still others need to copy a tablespace or group of tables from one database to another.  The good news is that there are several options and some will be better than others depending on what you are trying to accomplish.  This list is intended to help you narrow your choices for moving data.  Once you narrow the alternatives you will probably want to read more in depth about them in the Data Movement Utilities Guide and Reference manual or other links noted below.   Several of these methods for moving data do more than what is described here, but I mostly just describe how they are used to move data.

 

Backup/Restore

One of the easiest ways to copy a full database from one platform to another is to back it up from the source and restore it to another.  If the directory and/or drive letters are identical on the target and source servers then you just need to perform the backup and restore steps.  However, if they are different then you need to perform a redirected restore.  In a redirected restore, you create a script to tell DB2 where you want to put the files under each tablespace using the SET TABLESPACE CONTAINER command.  This needs to be done for each tablespace.  The really good news for those of you using DB2 v9.1 and above is that you can tell DB2 to generate the SET TABLESPACE CONTAINER commands from the backup file.  Once the script is generated you just need to change the directory names to correspond to the directories on the new server.  You can see a step by step example of doing a redirected restore at Cloning DB2 Databases Using Redirected Restore.

 

The process will also differ depending on whether you can do an off line backup or if you must use an online backup because the source system processing may not be interrupted.  You can restore a backup taken on a different Operating System in some cases and not in others.  Here is a link to the possible inter-OS backup/restores that are allowed:  Restoring between OS.  You can restore online and offline backups between systems with identical versions of DB2.  For offline backups only, you can restore to DB2 instances running the same or a newer version of DB2 than the one that you backed up.  For example, I can restore online and offline backups when my target and source are both v9.1, but if I backup a database on a v9.1 instance and want to restore it to a system running v9.5, then that backup must be taken offline.  You can never restore a backup taken on a newer version to a system running an older version of DB2.  When you restore an online backup you must have all of the necessary transaction log files or the restore and roll forward will not complete and you will not have a useable database.  Where you can, it is a good idea to use the “INCLUDE LOGS” parameter in the online backup.  This will ensure that you have at least all of the transaction logs that you need to recover the database to a useable state.

 

Export/Import/Load

With these utilities, you can EXPORT data from a table and then use the IMPORT or LOAD utility to insert the data into a table in the same or a different database.  The export and import utilities use SQL to extract and add data.  The load utility is often faster than the import because it bypasses the DB2 SQL engine and builds physical database pages and puts them directly into the tablespace.  These utilities can be used to move data between databases on any operating systems and versions.  You can also use them to add data in files to a database.  While the load command is often much faster than the import utility and can load very large amounts of data to a table quickly, there are some aspects of the LOAD command that you need to watch.  If a table that you load has referential constraints with other tables, then you will usually need to run the SET INTEGRITY command after loading your tables.  This command verifies that the referential and all other constraints are valid.  If there are constraints and you do not use the SET integrity command, then the loaded table and its children may not be accessible.  You should also understand the COPY YES/NO parameters of the LOAD command.  Depending on which you choose, the tablespace that contains the table that you loaded could be placed into “backup pending” status and no access to any of the tables in that tablespace will be allowed until a backup is done.  These utilities allow you to copy data between different versions of DB2 as well as databases on different Operating Systems.

 

LOB and XML Considerations

Large Objects (LOBs) and XML columns can be imported, exported and loaded without using the special LOB and XML parameters, but with significant limitations.  For LOB data, columns are limited to 32K in these utilities if you don’t use the LOB parameters.  When exporting tables with columns having the XML data types, the XML columns are always exported to separate files from the regular relational data.  Before using any of these utilities or the db2move command, it would be wise to read and understand how these utilities handle these special data types.  This information is in the Data Movement Utilities Guide and Reference manual.

 

db2move

The db2 move utility is used to copy several tables at once.  With the export, import and load utilities you must create a script for each table that you want the utility to process.  The db2move utility automates that process for moving all or a large subset or tables in your database.   You start by running the db2move command on the source system to extract the source data.  It will use the export command for each table specified.  The command will create a file or set of files for each table in a directory that you specify.  It also creates a file with a list of tables exported.  You then copy all files from this directory to the target system if the source and target databases are on different systems.  Finally you run the db2move command on the target system with a parameter specifying load or import depending on how you want the data placed into the database.  There are several options for specifying the tables that you would like exported including all tables in the database, a list of table names, a list of tablespaces, a list of schemas, and so on.  This option allows you to copy data between different versions of DB2 as well as databases on different Operating Systems.

 

Load from Cursor

The CURSOR option in the load command allows you to connect to another database and have the load utility select data from a table on the source and load it directly into the target.  One very nice attribute of this process is that you do not need to have space for an intermediate file in which to extract the table data, because the data is loaded as it is extracted.  Please note that if your source database is DB2 on i5/OS or z/OS then you must be licensed for DB2 Connect.  This option allows you to copy data between different versions of DB2 as well as databases on different Operating Systems.

 

db2relocatedb

This command allows you to move or copy a database to a new system or a different instance in the same system by copying the underlying files and directories (containers) to the new location.  You use operating systems to copy the database files that contain the database and tablespace data and then use the db2relocatedb command to tell the new instance about the database.  This does not work if you are trying to move a database to an instance that already has one or more databases.  This utility can also be used to rename a database and to rename files used by the database or tablespace.  This command only allows you to move databases between identical Operating Systems and DB2 Versions.

 

db2look

This utility allows you to create a new database with the same structure as the old one, but it does not copy data.  It generates a DDL script that in a human readable format that can be run to create an empty database that has the same structure as the source.  The file can be modified by the user as well.  It can also be used to copy production database statistics to a much smaller test database.  The reason that you may want to do this is to use the explain feature to verify how the DB2 optimizer will access your data in production when testing queries in a much smaller test database.  For those of you who do not back up your databases because you can easily recreate the data, I highly recommend running the db2look command regularly to copy your DDL to a file.  Should something bad happen to your database, you can use the db2look output to recreate your database structure that is ready to have the data loaded again.  If you do this then you don’t need to try to apply the original DDL plus all modifications made over time and hope that you get to the newest structure.

 

Replication

Homogenous SQL Replication is free with all editions of DB2.  Replication is typically used to create a copy of a database on another system and keep the target up to date with all changes occurring in the source.  Homogenous SQL Replication is useful for this function.  There are also other replication tools available that can be used to replicate data between DB2 and other databases.

 

Federation (Fee)

With IBM’s various Federation products, you can write SQL that selects data from a remote source and inserts it into your target database.  Depending on the type of federation you have, you can even pull data from non IBM databases and even non-relational sources.  With federation you can use the “Load from Cursor” option described above, or write your own inserts and updates.  For more information about how federation works, please see my article on Virtual Databases.

 

Data Archiving (Fee)

Our newly acquired Optim Data Growth Solution (“Archieving”) software allows you to extract production data with minimal impact on your production databases based on rules that you define and put it into compressed files that are on cheaper disk or even off line storage.  Further, using ODBC or JDBC access your can continue to have access to data for historical audit or reporting purposes.  You can also tell the achieving process to automatically load the achieved data into another relational database.  Typically this database will be on older and cheaper hardware, allowing your production system to operate at lightening speed because it only has data needed for operating purposes.   The data achieved will be referentially in tact.  The archive process will discover relationships defined in the databases and allow you to define additional relationships that are enforced by applications even across different databases.  For example, if you decide that you want to archive orders that were shipped and completed more that two years ago, then the archiving will also pull all associated customers and parts related to that order.  You can state that you want the orders deleted, but want the related part and customer records to remain in the database.

 

Test Database Management (Fee)

Our newly acquired Optim Test Data Management (TDM) software allows you to extract production data with minimal impact on your production databases and put it into one or several test databases.  It allows you to copy full production copies or subsets of data that are referentially in tact.  It will discover relationships defined in the databases and allow you to define additional relationships that are enforced by applications even across different databases.  For example, if you define a subset of orders by order date as your subset, then TDM will also pull all associated customers and parts related to those orders.  If you also have the masking capability, you can tell TDM to mask data like credit card numbers, names and addresses such that no one can use the test data for any mischief.  TDM allows you to ship test data to other countries and have less rigorous data security on your test data while passing audits easily.  You can tell TDM to save all of the information about how you want your various test databases created.  Once this is done, you can just recreate the test databases as needed or on a regular schedule.

 

Change Management Expert / Data Studio Administrator (Fee)

Data Studio Administrator, formerly known as Change Management Expert, is a tool that allows you to define changes to database objects and then it creates the complex scripts to implement those schema changes while preserving data.  One of the ways to define changes is to tell DSA to compare two databases and move the changed schema AND DATA from one database to the other.  I would never recommend purchasing this big hammer for moving data around, but I say, “Smoke ‘em if you got ‘em.” 

 

These are all of the utilities that I could think of that could be used to move data.  I hope that you find the list useful and if you think of others, please e-mail me.  I would like to have a complete list.  Also if you have any questions at all, feel free to write to me at dean@db2dean.com.

 

 

HOME | Search