Move or Copy DB2 Data
20 January 2009
Updated 2 January 2013
Probably one of the subject areas that people ask me about most often is how to move their data from one system to another. Some want a copy of production data on a test system. Others need multiple copies of test databases so that different groups of developers can test without interfering with each other. Still others need to copy a tablespace or group of tables from one database to another. The good news is that there are several options and some will be better than others depending on what you are trying to accomplish. This list is intended to help you narrow your choices for moving data. Once you narrow the alternatives you will probably want to read more in depth about them in the Data Movement Utilities Guide and Reference manual or other links noted below. Several of these methods for moving data do more than what is described here, but I mostly just describe how they are used to move data.
One of the easiest ways to copy a full database from one platform to another is to back it up from the source and restore it to another. If the directory and/or drive letters are identical on the target and source servers then you just need to perform the backup and restore steps. However, if they are different then you need to perform a redirected restore. In a redirected restore, you create a script to tell DB2 where you want to put the files under each tablespace using the SET TABLESPACE CONTAINER command. This needs to be done for each tablespace. The really good news for those of you using DB2 v9.1 and above is that you can tell DB2 to generate the SET TABLESPACE CONTAINER commands from the backup file. Once the script is generated you just need to change the directory names to correspond to the directories on the new server. You can see a step by step example of doing a redirected restore at Cloning DB2 Databases Using Redirected Restore.
The process will also differ depending on whether you can do an off line backup or if you must use an online backup because the source system processing may not be interrupted. You can restore a backup taken on a different Operating System in some cases and not in others. Here is a link to the possible inter-OS backup/restores that are allowed: Restoring between OS. You can restore online and offline backups between systems with identical versions of DB2. For offline backups only, you can restore to DB2 instances running the same or a newer version of DB2 than the one that you backed up. For example, I can restore online and offline backups when my target and source are both v9.1, but if I backup a database on a v9.1 instance and want to restore it to a system running v9.5, then that backup must be taken offline. You can never restore a backup taken on a newer version to a system running an older version of DB2. When you restore an online backup you must have all of the necessary transaction log files or the restore and roll forward will not complete and you will not have a useable database. Where you can, it is a good idea to use the “INCLUDE LOGS” parameter in the online backup. This will ensure that you have at least all of the transaction logs that you need to recover the database to a useable state.
With these utilities, you can EXPORT data from a table and then use the IMPORT or LOAD utility to insert the data into a table in the same or a different database. The export and import utilities use SQL to extract and add data. The load utility is often faster than the import because it bypasses the DB2 SQL engine and builds physical database pages and puts them directly into the tablespace. These utilities can be used to move data between databases on any operating systems and versions. You can also use them to add data in files to a database. While the load command is often much faster than the import utility and can load very large amounts of data to a table quickly, there are some aspects of the LOAD command that you need to watch. If a table that you load has referential constraints with other tables, then you will usually need to run the SET INTEGRITY command after loading your tables. This command verifies that the referential and all other constraints are valid. If there are constraints and you do not use the SET integrity command, then the loaded table and its children may not be accessible. You should also understand the COPY YES/NO parameters of the LOAD command. Depending on which you choose, the tablespace that contains the table that you loaded could be placed into “backup pending” status and no access to any of the tables in that tablespace will be allowed until a backup is done. These utilities allow you to copy data between different versions of DB2 as well as databases on different Operating Systems.
LOB and XML Considerations
Large Objects (LOBs) and XML columns can be imported, exported and loaded without using the special LOB and XML parameters, but with significant limitations. For LOB data, columns are limited to 32K in these utilities if you don’t use the LOB parameters. When exporting tables with columns having the XML data types, the XML columns are always exported to separate files from the regular relational data. Before using any of these utilities or the db2move command, it would be wise to read and understand how these utilities handle these special data types. This information is in the Data Movement Utilities Guide and Reference manual.
The db2 move utility is used to copy several tables at once. With the export, import and load utilities you must create a script for each table that you want the utility to process. The db2move utility automates that process for moving all or a large subset or tables in your database. You start by running the db2move command on the source system to extract the source data. It will use the export command for each table specified. The command will create a file or set of files for each table in a directory that you specify. It also creates a file with a list of tables exported. You then copy all files from this directory to the target system if the source and target databases are on different systems. Finally you run the db2move command on the target system with a parameter specifying load or import depending on how you want the data placed into the database. There are several options for specifying the tables that you would like exported including all tables in the database, a list of table names, a list of tablespaces, a list of schemas, and so on. This option allows you to copy data between different versions of DB2 as well as databases on different Operating Systems.
Load from Cursor
The CURSOR option in the load command allows you to connect to another database and have the load utility select data from a table on the source and load it directly into the target. One very nice attribute of this process is that you do not need to have space for an intermediate file in which to extract the table data, because the data is loaded as it is extracted. Please note that if your source database is DB2 on i5/OS or z/OS then you must be licensed for DB2 Connect. This option allows you to copy data between different versions of DB2 as well as databases on different Operating Systems.
This command allows you to move or copy a database to a new system or a different instance in the same system by copying the underlying files and directories (containers) to the new location. You use operating systems to copy the database files that contain the database and tablespace data and then use the db2relocatedb command to tell the new instance about the database. This does not work if you are trying to move a database to an instance that already has one or more databases. This utility can also be used to rename a database and to rename files used by the database or tablespace. This command only allows you to move databases between identical Operating Systems and DB2 Versions.
This utility allows you to create a new database with the same structure as the old one, but it does not copy data. It generates a DDL script that in a human readable format that can be run to create an empty database that has the same structure as the source. The file can be modified by the user as well. It can also be used to copy production database statistics to a much smaller test database. The reason that you may want to do this is to use the explain feature to verify how the DB2 optimizer will access your data in production when testing queries in a much smaller test database. For those of you who do not back up your databases because you can easily recreate the data, I highly recommend running the db2look command regularly to copy your DDL to a file. Should something bad happen to your database, you can use the db2look output to recreate your database structure that is ready to have the data loaded again. If you do this then you don’t need to try to apply the original DDL plus all modifications made over time and hope that you get to the newest structure.
Homogenous SQL Replication is free with all editions of DB2. Replication is typically used to create a copy of a database on another system and keep the target up to date with all changes occurring in the source. Homogenous SQL Replication is useful for this function. There are also other replication tools available that can be used to replicate data between DB2 and other databases.
With IBM’s various Federation products, you can write SQL that selects data from a remote source and inserts it into your target database. Depending on the type of federation you have, you can even pull data from non IBM databases and even non-relational sources. With federation you can use the “Load from Cursor” option described above, or write your own inserts and updates. For more information about how federation works, please see my article on Virtual Databases.
Data Archiving (Fee)
Our newly acquired Optim Data Growth Solution (“Archieving”) software allows you to extract production data with minimal impact on your production databases based on rules that you define and put it into compressed files that are on cheaper disk or even off line storage. Further, using ODBC or JDBC access your can continue to have access to data for historical audit or reporting purposes. You can also tell the achieving process to automatically load the achieved data into another relational database. Typically this database will be on older and cheaper hardware, allowing your production system to operate at lightening speed because it only has data needed for operating purposes. The data achieved will be referentially in tact. The archive process will discover relationships defined in the databases and allow you to define additional relationships that are enforced by applications even across different databases. For example, if you decide that you want to archive orders that were shipped and completed more that two years ago, then the archiving will also pull all associated customers and parts related to that order. You can state that you want the orders deleted, but want the related part and customer records to remain in the database.
Test Database Management (Fee)
Our newly acquired Optim Test Data Management (TDM) software allows you to extract production data with minimal impact on your production databases and put it into one or several test databases. It allows you to copy full production copies or subsets of data that are referentially in tact. It will discover relationships defined in the databases and allow you to define additional relationships that are enforced by applications even across different databases. For example, if you define a subset of orders by order date as your subset, then TDM will also pull all associated customers and parts related to those orders. If you also have the masking capability, you can tell TDM to mask data like credit card numbers, names and addresses such that no one can use the test data for any mischief. TDM allows you to ship test data to other countries and have less rigorous data security on your test data while passing audits easily. You can tell TDM to save all of the information about how you want your various test databases created. Once this is done, you can just recreate the test databases as needed or on a regular schedule.
Optim Datbase Administrator (FREE)
Optim Database Administrator, formerly known as Data Studio Administrator and Change Management Expert, is a tool that allows you to define changes to database objects and then it creates the complex scripts to implement those schema changes while preserving data. One of the ways to define changes is to tell DSA to compare two databases and move the changed schema AND DATA from one database to the other. The features of this tool have now been rolled into the free IBM Data Studio.
*** 16 April 2010 Additions Follow ***
IBM Data Movement Tool (FREE)
The IBM Data Movement Tool allows you to copy data from both DB2 and other relational databases into DB2. It is a great tool for copying extremely large volumes of data and the instructions for the tool contain a section for optimizing the movement of large amounts of data. It can be run from the command line or from a GUI that allows you to pick the tables that you want to copy. You can easily copy an entire database or selected tables. It connects to both the target and source databases and can create the tables in the target database for you based on what is in the source database. This is a great tool if you need to copy a lot of data and don’t have room anywhere to stage the data in a file. It connects to both the DB2 target and the source and loads the data into the target as it is being extracted from the source. Source databases include DB2 (any supported version of LUW, I and Z), Oracle, SQL Server, Sybase, MySQL, PostgreSQL, and Access. When copying data from an Oracle database, you may want to consider first upgrading the target to DB2 v9.7 because that version has the compatibility feature and the DB2 target tables will more closely resemble the Oracle tables when the copy is done. The link at the beginning of this paragraph has all of the instructions for using the tool and a link for downloading the tool. The instructions primarily talk about copying data from non-IBM databases, but it also a great tool DB2 to DB2 movement too. This tool was written by Vikram Khatri. Please see his db2ude Blog for tons of great DB2 information.
Optim Development Studio (FREE)
Optim Development Studio allows you to connect to two databases at once to copy data from one database to another with a very easy to use copy and paste GUI. You just select the tables from one database from a list and click “copy”, then select the target database and click “paste”. The tool will create the tables in the target database if they are not there already and then copy the data. You can also choose to only create the table if you don't want the data. This is a great tool for copying some test tables from one database to another, but is not ideal for transferring large volumes of data because all data must flow through your workstation. The target database must be DB2 v9.7, but sources include Oracle and any DB2 version running on any platform (LUW, I, Z). The features of this tool have now been rolled into the free IBM Data Studio.
HADR Reads on Standby (Fee)
Standard DB2 HADR is a high available feature where you can keep an up to date copy of DB2/LUW on a standby server that is only accessible after you tell the secondary to take over from the primary. However, as of DB2 v9.7 fix pack1 the DB2 HADR Reads on Standby option lets you use the standby in a read only mode. This is a great way to use your disaster recovery database for things like ad hoc queries or reporting without impacting the performance of your primary system.
*** 28 August 2010 Additions Follow ***
Move a Table On Line (FREE)
Thewas introduced in DB2/LUW v9.7 and allows you to move a table to a different tablespace while the table remains on line! You can also use it to move a table off line if you want to have less overhead and logging. This procedures moves all of your indexes and other objects as well and has features to make the move as fast as possible. You can make some alternations to the table such at the amount of storage for DMS tables. Use case include changing the amount of space needed for a table, moving an important table to its own tablespace, changing the page size for the table or moving a table from a regular to a large tablespace to enable new features. This page includes of using the procedure.
*** 2 January 2013 Additions Follow ***
Ingest Utility (FREE)
The new INGEST utility that arrived in DB2/LUW 10.1 incorporates the best features of both the LOAD and IMPORT utilities and adds some new data import features as well. Like the LOAD utility it is fast, although not as fast as that utility. Like the IMPORT utility it checks constraints and fires triggers so there is no need to use the SET INTEGRITY facility after running it, plus the table being loaded is completely available during loading. It also adds the capability to do continuous loading of data so if you constantly have data arriving you can set up the utility to continuously add data to your table. Further, you can use it to not only insert data, but to update rows, merge data (“upsert”), and even delete rows that match records in the input file. See the INGEST developerWorks article for lots of great examples. In my opinion, this utility makes the IMPORT utility obsolete and takes over most of the use cases for LOAD.
Move Tablespaces Between Storage Groups
DB2/LUW added a feature to the ALTER TABLESPACE command that allows you to change the storage group for a tablespace. By merely altering the storage group for the tablespace, DB2 will automatically move the data to the new set of directories for you without any outage. DB2 uses the same redistribute facility to move to the new storage group as it does when you add a new container to a tablespace and DB2 redistributes the extents among the new and existing tablespaces. No downtime is needed and very little additional log space is consumed. DB2/LUW v10.1 introduced an object called a Storage Group. The storage group is merely a feature of Automatic Storage that allows you to define a name for one or more file systems. You can then build one or more tablespaces on a storage group or move tablespaces among the storage group. This feature was added to support Multi Temperature Storage that allows you to age data in a partitioned tablespace through different tiers of storage as the data ages, but you can use it to move any table space. It is now quite easy to move a DMS table from regular storage to automatic storage once you have upgrade to DB2 v10.1 by following these instructions.
These are all of the utilities that I could think of that could be used to move data. I hope that you find the list useful and if you think of others, please e-mail me. I would like to have a complete list. Also if you have any questions at all, feel free to write to me at firstname.lastname@example.org.