Archive Old Data from Host Systems
Dean Compher
31 October 2014
The host systems of many organizations are clogged with hordes of old data that is only occasionally used if used at all. This data uses up resources on your tier-1 host systems, but provides little value. Generally it is there just in case someone really needs it. In this case I am considering host systems to be IBM System z (zSeries) and IBM i (iSeries) systems. In this article I will describe how you can move that old data into less expensive distributed systems while still providing access to it from your favorite report writer or custom application through ODBC or JDBC SQL access.
One thing that is very important when you archive data out of your primary databases is to copy important related data with the data being archived so that when your user queries that data they can see it in context. For example, in an order entry system, you may want to archive orders over 2 years old. When someone goes to look at an old order it may also be very important to see the customer who placed to order and where it was shipped. Therefore, you would want to include that additional information in the archive so that those objects can be joined in a query even though some of the data is not deleted from the production system. That is, even though you are deleting old orders, you still want the customer records in the production system for the next time the customer orders something. Optim Archiving is great because it allows you to archive a “complete business object” in an archive and allows you to query that data using SQL through an ODBC or JDBC connection. Just the data that should be deleted is deleted from the source tables. It also allows you to archive many types of data including DB2 data from zSeries and iSeries as well as several other types of data sources on those systems including IMS, VSAM, flat files and non-IBM databases. This is really useful, because it allows you to query all of this data in the same way once it is archived.
The primary components of Optim archiving that get installed on your distributed server include the Optim Server and Optim Connect as shown in Figure 1. The Optim Server is the software component that connects to the host systems, collects the data to be archived and then deletes the data that should be deleted. The data queried is placed into a set of archive files. For DB2 and some other host sources, the Optim Server appears to be just any other application that connects to them and issues SELECT, DELETE and other SQL queries. The Optim Connect component listens on a port like any database management system like DB2/LUW or Oracle and allows queries through an SQL interface to the data in the archive files via ODBC and JDBC connections. Standard SQL SELECT queries are allowed. Data can be selected, and joined just as you would do from other database servers. It should be noted that Optim Archiving can be installed on System z, but the focus of this article is to completely relieve your host systems of the old data, so that will not be discussed here.
Figure 1. High Level Architecture
Only the primary components needed to archive and data and allow it to be queried are shown in figure 1 with other necessary components not shown. For example, InfoSphere Federation Server is needed to connect to DB2 on iSeries and a DB2 Connect license is needed to connect to System z. Federation Server is included with Optim Archiving and licensed to be used to connect to IBM i. For more information on configuring Federation Server please see my article, Using Federation Server with Optim. However, you need to provide your own DB2 Connect license to connect to DB2 on System z. For more information on configuring DB2 Connect please see my DB2 Connect Usage and Editions article. Also the Optim Directory database is not shown. This is a database where Optim keeps all of configuration information and information it knows about the external databases such as the relationships between tables. You can use many major databases, but a license for DB2/LUW is included for this purpose and it can be installed locally with the Optim Server.
DB2 on System z Considerations
The first consideration for allowing a connection to DB2 on z/OS is that you need to have DB2 client software installed. If you install the DB2/LUW server software that comes with Otpim to have a local Optim Directory then you will get the IBM client software automatically. If you are using anther database or a remote DB2 database for your Optim Directory then you will need to download and install the IBM Data Server Client to allow your DB2/zOS connection. To enable to the client to connect to DB2/zOS you will need to apply the DB2 Connect license unless you are using the host-side DB2 Connect Unlimited License. This does not come with Optim and you will need to provide your own DB2 Connect license. You may be able to get the connection working using other components that come with Optim, but if you do not own the appropriate license of DB2 Connect then you will be out of compliance with your license agreement. Many System z shops already own DB2 Connect Unlimited Edition and those that do have all the licenses they need.
As noted above, the Optim Directory is a database where the Optim Servers keeps configuration and other information. This database can be local on the server with the Optim Server or it can be a remote database to which the Optim Server connects using that database vendors’ client software. The Optim Directory can be on a number of different types of databases, but you should keep in mind that you get a DB2/LUW license at no extra change when you get Optim so that is frequently the best choice. If you use DB2/LUW for your Repository database when archiving DB2/zOS data you need to determine whether you will be archiving data from UNICODE or non-UNICODE tablespaces (i.e. EBCDIC) before creating your Optim Directory database. If you are archiving UNICODE tablespaces then you will need to create a UNICODE DB2/LUW Optim Directory database and if you are archiving EBCDIC or ASCII tablespaces then you will need to create a non-UNICODE DB2/LUW database. For non-UNICODE databases I suggest using the default operating system code page when creating the database for the Optim Directory. For example, if you are creating the DB2/LUW databases on Windows then use the 1252 code set. The Supported Code Page Knowledge center page, shows the code pages for each operating system for each territory. Figures 2 and 3 show examples of creating DB2 LUW UNICODE and non-UNICODE DB2/LUW databases for use as Optim Directories
Figure 2. Create a UNICODE DB2/LUW Database.
CREATE DB OPTDIRU USING CODESET UTF-8 TERRITORY US WITH “unicode db”;
Figure 3. Create a non-UNICODE DB2/LUW Database with the Windows Codeset.
CREATE DB OPTDIRN USING CODESET 1252 TERRITORY US WITH “non-unicode”;
If you have both UNICODE and non-UNICODE DB2 tablespaces you wish to archive and you are using a DB2 Optim Directory database, then you just need to create two Optim Directory databases. Within Optim you just pick the one you need when creating archive requests. When configuring Optim to use your non-UNICODE DB2/LUW database for an Optim Directory, you will get the warning shown in Figure 4. You just need to check the check-box and ignore it.
Figure 4. Warning about non-UNICODE Optim Directory
If you do manage to use a DB2/LUW database with the wrong encoding for the type of tablespace being archive then you will likely see an error similar to the following when you run your archive request, “Column SALESMAN_ID in table TSTDB208.OPTUSR1.OPTIM_SALES uses a data type that is not supported in this database.” If you are using a non-DB2 database for your Optim Directory, then you do not need to be concerned with the encoding of the Optim Directory database.
DB2 on IBM i Consideration
For the archiving of DB2 on IBM i databases you do not need DB2 Connect, and all components you need to connect are included with your Optim license. You do not connect the Optim Server directly to DB2 on IBM i databases. Instead, you create a federated database in DB2/LUW using Infosphere Federation Server. In this federated database you create objects called “nicknames” that are like views, but they point to tables in the DB2 on IBM I databases. Optim connects to the federated databases and sees these nicknames just as it sees tables and views in any other database. For more information on how to configure your system please see my Using Federation Server with Optim article.
Connecting to Optim Connect
Once you have begun archiving files from your host system, you will want to query them. As noted earlier in this document the Optim Connect server is the component that listens on a port (default is 2551) for ODBC and JDBC connections to allow SQL SELECT statements to the data. You can not update data in archive files through Optim Connect or any other means. Optim Connect also allows connecting other relational databases. While the installation and configuration of Optim Connect is beyond the scope of this document, I did want to mention what you need on your client to allow connections. When you install Optim on the server as shown in Figure 1 above, the drivers and instructions for the drivers will be placed into a directory on the server. I installed Optim 9.1 on Windows and I found my drivers in “C:\IBM\InfoSphere\Optim\installer\Optim Connect\Thin Clients”. I got there by searching for the directory, “Thin Clients”. Under that directory you will find two directories, “Java Thin Clients” for JDBC and “ODBC Thin Clients”. Each directory will have the install files and a PDF document with the instructions on how to install and configure them on your client. Please note that there are ODBC drivers with versions 2.5 and 3.5. User 2.5 for 32-bit databases and 3.5 for 64-bit.
I downloaded the ODBC client, installed and configured it according to the instructions. I don’t know a lot about ODBC so I didn’t create a source. It is probably a good idea to do that though. Instead, I just configured the source from within Excel. I was then easily able to connect to and Optim Connect Server running in the cloud and query my data. Since Optim extracts the metadata along with the data from the host databases like table name, column name and data type, it was easy to see the data in context. Better reporting tools like IBM Cognos can also be used to perform the queries.
For more details on configuring and using Optim Connect please see my Connecting to your Optim Archive Files article.
***
Once you have configured Optim and other needed components, you create your archive requests in Optim just as you would for any other databases. Please add any thoughts that that you on this topic to the Facebook Page or the db2Dean and Friends Community and help the extended community.