Database Federation or Replication
17 April 2009
Federation and replication are easy to confuse, especially for those who are just starting to investigate them. Even experienced customers sometimes mix them up. It does not help that our oldest replication tool uses federation under the covers, so a lot of the replication documentation talks about configuring federation.
At a high level, federation and replication are tools IBM provides in support of our Information on Demand strategy: helping customers view data wherever it lives in the organization and, if desired, copy it to the most convenient locations.
To eliminate any confusion, I will define federation and then replication so that the choices are clear.
IBM’s federation tools essentially put "DB2 Glasses" on the world. By this I mean that to use federation, you connect to a DB2 database (possibly an empty one) and from there you can see tables, views and other objects in other databases as if they were in that one DB2 database. Those objects can be in other IBM databases, non-IBM relational databases and some non-relational data sources such as XML files.
To allow users to view the foreign objects, the administrator creates objects called "nicknames" in that DB2 database, and these are presented to the world just as tables and views in that database are. A nickname is much like a view; the difference is that a view is built on a table in the local database, while a nickname is built on a table in another database.
Even if you only buy InfoSphere Federation Server, it installs the DB2 database software as part of the installation of federation. After installation you create a DB2 database and use DB2 commands to configure federation.
You also get "Homogeneous Federation" with several DB2 database server editions and most editions of DB2 Connect. Homogeneous federation allows you to federate to any IBM relational database, including DB2 and Informix. You always need DB2 Connect on ASCII-based database servers, such as Windows, to federate to iSeries and zSeries databases. Heterogeneous Federation (called InfoSphere Federation Server) allows you to federate to non-IBM databases such as SQL Server and to some non-relational sources such as XML files and JDBC sources.
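To make the nickname idea more concrete, here is a minimal sketch of the kind of DB2 statements an administrator might run to expose one remote DB2 table as a nickname. The Python function only builds the statement text; it does not connect to anything. Every name in it (REMOTESRV, SALESDB, HR.EMPLOYEE, LOCAL.EMP, the user and password) is a hypothetical placeholder, the VERSION value is an assumption, and the exact options vary by release and data source, so please check the statements against the documentation for your version. It also assumes federation has already been enabled on the instance.

```python
def federation_ddl(server, remote_db, remote_user, remote_password,
                   remote_table, nickname, version="9.5"):
    """Build the DB2 statements that expose one remote table as a nickname.

    All arguments are illustrative placeholders; nothing is executed here.
    """
    return [
        # Register the wrapper used to reach other DB2-family data sources.
        "CREATE WRAPPER DRDA",
        # Describe the remote database to the federated DB2 database.
        f"CREATE SERVER {server} TYPE DB2/UDB VERSION '{version}' WRAPPER DRDA "
        f"AUTHORIZATION \"{remote_user}\" PASSWORD \"{remote_password}\" "
        f"OPTIONS (DBNAME '{remote_db}')",
        # Map the local user to credentials on the remote database.
        f"CREATE USER MAPPING FOR USER SERVER {server} "
        f"OPTIONS (REMOTE_AUTHID '{remote_user}', "
        f"REMOTE_PASSWORD '{remote_password}')",
        # The nickname itself: a local name for the remote table.
        f"CREATE NICKNAME {nickname} FOR {server}.{remote_table}",
    ]


statements = federation_ddl(
    server="REMOTESRV",
    remote_db="SALESDB",
    remote_user="fed_user",
    remote_password="secret",
    remote_table="HR.EMPLOYEE",
    nickname="LOCAL.EMP",
)
for statement in statements:
    print(statement + ";")
```

Once a nickname such as the hypothetical LOCAL.EMP exists, users would query it like any local table, for example SELECT * FROM LOCAL.EMP, without needing to know where the data actually lives.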
For more information about Federation and what you can do with it, please see my earlier article entitled Virtual Databases.
In its simplest form, replication comprises tools to capture changes on the source database and then transport and apply them to a target database. We have three database replication tools: SQL Replication, Q-Replication, and InfoSphere Change Data Capture (CDC). SQL Replication has been around for many years and used to be called Data Propagator. Q-Replication uses MQ to queue the changes and transport them to their destination. CDC is our newest tool and is probably the most robust. You can see descriptions of each replication tool and a table comparing their features in Sean Byrd’s article, Replication Options for DB2 and Beyond. These tools can replicate between various IBM and non-IBM databases. You do not need any DB2 or Informix databases to use our replication tools effectively, so you can use them to replicate data from one non-IBM database to another non-IBM database.
All of these tools have a component that is installed on the source database server to capture changes to the tables you choose to replicate. For most sources these tools watch the transaction logs (journals) to capture changes, which means that they typically consume few resources on the source server. For some databases the tools cannot access the logs and must instead create triggers on the source database. This usually places a larger load on the source system, because the triggers run as part of every change to a replicated table.
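To illustrate why trigger-based capture costs more on the source, here is a toy sketch using SQLite from Python's standard library. It is only an analogy, not how any of the IBM tools are implemented: the customer table, the customer_changes change table, and the trigger names are all made up for illustration. The point is that every insert, update, or delete on the source table now causes a second write to the change table inside the same transaction.

```python
import sqlite3


def make_source():
    """Create an in-memory source database with trigger-based change capture."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);

        -- Hypothetical change table that the triggers below fill in.
        CREATE TABLE customer_changes (
            seq  INTEGER PRIMARY KEY AUTOINCREMENT,
            op   TEXT,      -- 'I', 'U' or 'D'
            id   INTEGER,
            name TEXT
        );

        CREATE TRIGGER customer_ins AFTER INSERT ON customer BEGIN
            INSERT INTO customer_changes (op, id, name)
            VALUES ('I', NEW.id, NEW.name);
        END;

        CREATE TRIGGER customer_upd AFTER UPDATE ON customer BEGIN
            INSERT INTO customer_changes (op, id, name)
            VALUES ('U', NEW.id, NEW.name);
        END;

        CREATE TRIGGER customer_del AFTER DELETE ON customer BEGIN
            INSERT INTO customer_changes (op, id, name)
            VALUES ('D', OLD.id, NULL);
        END;
    """)
    return conn


source = make_source()
source.execute("INSERT INTO customer VALUES (1, 'Ann')")
source.execute("UPDATE customer SET name = 'Anne' WHERE id = 1")
source.execute("DELETE FROM customer WHERE id = 1")

# Each statement above also wrote one row here, in order.
changes = source.execute(
    "SELECT op, id, name FROM customer_changes ORDER BY seq"
).fetchall()
print(changes)
```

With log-based capture none of these extra writes happen on the source; the capture component reads what the database has already written to its log.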
All of these tools also have another component that applies the changes to the target database. This component is typically installed on a different server than the source database, but it does not necessarily need to be installed on the server with the target databases. Sometimes there will be several heterogeneous targets, so the apply component will be on some central server that can connect to all of the targets. You can think of the apply component as any other application that connects to the target database and issues SQL to select, insert, update, and delete data in the target tables. Since it is just like any other application, the server with the apply component also needs to have whatever database client software any application would need to connect to the target databases. The apply component works by receiving changes from the capture component on the source database server and using a connection to the targets to apply the changes to them. To see which databases each tool can capture and apply with, please see Sean Byrd’s article, Replication Options for DB2 and Beyond.
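The apply side can be sketched in the same spirit. The following is a toy illustration, again using SQLite from Python's standard library rather than any IBM tool: the change record layout (operation, id, name), the customer table, and the apply_changes function are all assumptions made up for the example. It shows the apply component behaving like an ordinary application that holds a connection to the target and replays captured changes as plain SQL, in the order they were captured.

```python
import sqlite3


def apply_changes(target, changes):
    """Replay captured changes, in order, over a connection to the target."""
    for op, row_id, name in changes:
        if op == "I":
            target.execute(
                "INSERT INTO customer (id, name) VALUES (?, ?)", (row_id, name)
            )
        elif op == "U":
            target.execute(
                "UPDATE customer SET name = ? WHERE id = ?", (name, row_id)
            )
        elif op == "D":
            target.execute("DELETE FROM customer WHERE id = ?", (row_id,))
        else:
            raise ValueError(f"unknown operation {op!r}")
    # Commit once so the batch of changes becomes visible together.
    target.commit()


# Stand-in for the target database; a real apply component would connect
# through the target's own client software.
target = sqlite3.connect(":memory:")
target.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")

# Stand-in for changes received from the capture component.
captured = [
    ("I", 1, "Ann"),
    ("I", 2, "Bob"),
    ("U", 1, "Anne"),
    ("D", 2, None),
]
apply_changes(target, captured)

rows = target.execute("SELECT id, name FROM customer ORDER BY id").fetchall()
print(rows)
```

Because the apply side needs nothing more than a client connection, it can sit on a central server and feed several different targets, as described above.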
Homogeneous Replication comes with most DB2 editions on Linux, UNIX and Windows and also with most DB2 Connect editions. Homogeneous Replication allows you to replicate between IBM relational databases, including DB2 and Informix, using SQL Replication. Once you install your DB2 database server software and create a database, you can begin using homogeneous replication with other Informix and DB2 databases. Your next step is to tell DB2 to connect to the source database. How does it make that connection? It uses Federation! This fact can be somewhat confusing. SQL Replication relies on nicknames to connect to the source database. You can use the “Replication Center” feature of the DB2 Control Center to set up replication. If you want to replicate from DB2 on “IBM i” or zSeries, then you will need additional products, including DB2 Connect and replication tools on those servers.