IOD 2013 Summary

db2Dean’s 2013 IOD Highlights

Dean Compher

26 November 2013

While I didn’t see anything or any celebrity interesting enough to take another selfie with this year, I was able to learn lots of great new stuff at the IOD sessions and upgrade my DB2 certification to 10.5. There was tons of information and I’ll give you a summary of what I learned along with many random tips and tricks. This year I concentrated more on Big Data, but still had time to get to some great DB2 sessions. If you attend the conference as you should have, then you can download many of the presentations from the Conference Presentation Search Page. As time has gone on many more have been added, so try again if something you wanted was not available on the site when you first got back. I look forward to seeing you at the conference next year! Also please add anything else you though was really interesting to my db2Dean Facebook Page or to the “Message Board” section of my db2Dean Community page.

Big Data

Big data is getting a lot of attention lately so I went to a number of sessions on this topic. Big data does not seem to have any one definition in the industry as far as I can tell so I’ll give my opinion on what it is. It is the capability to store a large quantity of data and the ability to analyze that data efficiently. The data can be in several forms including structured relational data, unstructured data, NoSQL data and large files. This data can be in traditional databases, Hadoop clusters and other platforms. The ability to analyze the data will depend on a set of tools since no one tool is going to be able to analyze all data formats. IBM provides a set of tools to store both your non-relational (think Hadoop clusters) and relational data (databases hold the relational stuff) and several different tools to analyze and use the data. Several of these tools are included in the Big Insights offering. Here I’ll give some of the session highlights about various tools for Big Data:

Hadoop Cluster Software. Explaining Hadoop is much too big to discuss here, but there is much written about it. It is open source software. Big Insights provides this open source software and has some add-ons that make Hadoop more enterprise-friendly and makes it easier to implement, more highly available and more secure.
Analysis Tools. Big Insights provides or allows you to add-on a number of tools to make it easy for you to analyze data in a Hadoop cluster including open source and proprietary tools to analyze different types of data including text, social media feeds, sensor data and other types of data. These tools either eliminate the need to write Map/Reduce code or make it easier to write custom code. Also there are more traditional tools like Cognos for analyzing structured and unstructured data. One of big use cases is finding meaningful patterns in your data such as when fraud might be occurring.
Streams. InfoSphere Streams analyzes data as it moves through your system. It can process data as it moves from mechanical sensors, social media feeds, incoming files and other sources as that data is received. Processing can include early recognition of patterns and alerting some one of the development of that pattern. For example, a bank might be able to see that a certain pattern of credit card use is forming across the world that indicates a certain type of fraud is beginning to occur and alert people who can do something about it.
Data Explorer. This tool can craw through an enterprise’s data stores including all relational databases, Hadoop clusters, file servers and other stores to index what is there and allow fast search across all of those sources. This is much like the way your favorite Internet search engine crawls the internet, indexing web pages and allowing fast search even though the search topics are not known in advance. It also allows the enterprise to determine where their data is located.
BigSQL. It allows developers who already know how to write SQL to query certain types of data in a Hadoop clusters using SQL. This SQL access is provided through JDBC/ODBC drivers. Behind the scenes it generates map/reduce code to read the Hadoop files.
Hadoop/DB2 interfaces - Allows you to easily move data between Hadoop and DB2.
BigSheets. Provides a spread-sheet like interface to Hadoop data and allows the user to easily view data and to export data and reports without doing any programming.
SPSS. Helps you use all of your data to predict what is likely to happen to allow you to make better decisions.
A number of sessions did not focus on particular tools, but instead described how different enterprise solved problems with big data. So if you want to get more of a big picture of how big data is being used, then I highly recommend that you attend IOD next year.
IBM Big Data tools are designed to be used on any vendor’s databases and the major vendors’ Hadoop clusters.

BLU Acceleration

BLU Acceleration is the flagship new feature in DB2 10.5 that can improve the performance of analytical queries by orders of magnitude. Since it only changes the implementation and processing of tables, no SQL or coding changes are needed in applications to use it. It is a combination of technologies that can make queries run much faster. These technologies include storing data by column instead of by row which makes for better compression and more efficient I/O, new caching algorithms to give in-memory speeds even when all data does not fit into memory, recognizes and exploits processor technologies to get more work done in each CPU tick, can even skip searching groups of rows that do not help the query. Since it does not use indexes, MDCs, MQTs or other objects it is great when you don’t really know what queries will look like. To learn more about this amazing technology, please read chapter 3 of the DB2 10.5 with BLU Acceleration book. You can even try if for free through about mid-February by registering for BLU for Cloud.

DB2 10.5

Of course some of the biggest news at IOD this year was about the newly released DB2 version 10.5. I discussed a number of facets of the new release in my DB2 10.5 article, but I learned a number of new items at the conference. Some of these features are in all editions and some are only in certain editions. To see if the feature that interests you is in your favorite edition see the Functionality by Edition page. Here are some of the more interesting ones, some of which have been in earlier versions:

The packaging/editions of DB2 have changed in DB2 10.5 and InfoSphere Warehouse (ISW). We have now introduced DB2 Advanced Work Group Edition that contains nearly all of the optional features of DB2. You can read about what you get in each of the additions in the 10.5 Editions Page.
IBM does not offer the InfoSphere Warehouse editions anymore. They have been replaced with the DB2 Advanced Editions.
After you have installed any edition of DB2 (except the expresses) you can switch to any other edition by just replacing the license key. For example if you are running DB2 Enterprise Server Edition 10.5 and you want to switch to DB2 Advanced Workgroup Server Edition 10.5, then you merely change the license key using the db2licm command and you will then be running AWSE.
As of DB2 10.1 fp2 and later, you can use the ADMIN_MOVE_TABLE procedure to move tables that have foreign keys.
Use the SYSPROC.ADMIN_GET_TAB_COMPRESS_INFO table function to estimate the space savings that you could get for a table or all tables in a schema if you were to use compression.
Use the command “db2pd -tablespaces trackmodstate” to see if any pages have changed in the tablespace since the last backup. The TRACKMOD configuration parameter must be on for this information to be gathered.
If you add new partitions to a table by loading a new table first that has compression turned on and then adding or attaching it to the partitioned table, then you can have a different static compression dictionary for each partition. So in cases where your newest rows go into the newest partitions, using table partitioning is a way to keep you static compression from degrading over time.
Don’t forget about the “RECLAIM EXTENTS” option introduced in 10.1 to the REORG command that allows you to do a lightweight reorg that just gives unused extents from tables and indexes back to their respective tablespaces.
As of 9.7 you can create Reclaimable Storage Tablespaces that can be easily reduced in size using the ALTER TABLESPACE command without worrying about the high water mark. If you have upgraded from an earlier version, you need to move tables from old tablespaces to new ones, possibly using the ADMIN_MOVE_TABLE() procedure, to be able to use this feature.
DB2 10.1 also introduced archive log compression in all editions using the LOGARCHCOMPR1 (2) DB CFG parameters. There are few cases where this should not be used.
DB2 10.1 introduced system period temporal tables so don’t write a bunch of triggers if you need to keep a history of changes to any of your tables.
Remember the db2cos and db2fodc commands when you have problems like lock timeouts, system hangs and other problems.

Upgrading to DB2 10.5

Melanie Stopfer gave a great presentation on how upgrade to DB2 10.5. For lots of other good upgrade information see the DB2 Upgrade Portal. The bullets here on this topic are just some highlights from her presentation:

If you are currently running on DB2 9.5 or earlier, you will need to upgrade to 9.7 or 10.1 before you can upgrade to 10.5.
IBM does not offer the InfoSphere Warehouse (ISW) editions anymore. They have been replaced with the DB2 Advanced editions. If you are currently running ISW then your upgrade path is to go to DB2 10.5. You can upgrade ISW v9.7 or later versions to DB2 10.5 directly. Depending on the edition of ISW you are using, you will either see DB2 Advanced Workgroup Server Edition or DB2 Advanced Enterprise Server Edition show up in your Passport Advantage account as a replacement.
Use the db2batch utility to benchmark your important queries before and after the upgrade to ensure that you have not caused any problems.
A number of operating system versions are no longer supported. Make sure to view the systems requirements page in the DB2 10.5 information center before starting your upgrade.
You can run the “db2prereqcheck –v 10.5.0.0” and the db2cupgrade commands to verify that your system and instance is ready to be upgraded. This command is run automatically when you upgrade an instance but it is nice to know before hand if you can upgrade.
After upgrading instances on Linux and UNIX systems, you can run the db2val command to validate your instances.
You will need get the 10.5 license key (Activation File) and apply it after the upgrade.
An SSH sever will be installed and a service will be created for it when you install DB2 10.5 on Windows. This allows Data Studio to perform certain commands like starting an instance from a remote client without using the DAS.
You will need additional disk space during the upgrade and the SYSCATSPACE and TEMPSPACE1 table spaces will likely need to be enlarged. See the Upgrade Portal for more information. You will also need to increase logging parameters.
It is a good idea to clean out the diaglog path before starting your upgrade so it is easy to see what diagnostic files came from the new version.
The UPGRADE DATABASE command now has the –REBINDALL parameter to rebind all of your packages during the upgrade.
Instead of binding the various client versions you may have from those clients, you can just download all of client bind files to your database server and bind them there. This way you don’t have to worry about clients at supported versions because they are not bound to your database. You can find them at the bind file site.
It is a good idea to upgrade the explain tables after upgrading the database so that people can continue to explain their queries. Use the db2exmig command to do this.
If your database started life as an older pre-9.7 version, it is a good idea to verify that all of your DMS and automatic storage tablespaces are LONG tablespaces. You can determine this by using the SYSCAT.TABLESPACES.DATATYPE catalog view and then alter the tablespace with the CONVERT TO LARGE parameter to change non-long tablespaces to long. LONG refers to the number of bits used in the RID and not the type of data stored in it.
If you upgrade from a pre-10.1 database and are using compression make sure to alter your compressed tables using the COMPRESS YES ADAPTIVE go begin using adaptive compression in addition to the static compression already being used.

IBM Data Studio

With the release of DB2 v10.1 there is no more Control Center and you should start using the IBM Data Studio. The Client Configuration Assistant is also gone. Nearly all functions of control center are now in Data Studio plus it has several others such as a procedure builder and debugger and a feature to generate scripts to change database schema while preserving data. For a summary of the tool, please see my series of articles starting with Data Studio Update Part 1 or see my Data Studio Web Console article to see how to do health monitoring and task management. A number of improvements have been made with the release of Data Studio 4.1 including being able to administer the new features of DB2 v10.5 like BLU.

***

I hope that you found at least a few pieces of information in this article to be new and useful. I hope to see you at the conference next year.

HOME | Search