db2Dean’s 2013 IOD Highlights
Dean Compher
26 November 2013
While I didn’t see anything or any celebrity interesting enough to take another selfie
with this year, I was able to learn lots of great new stuff at the IOD sessions
and upgrade my DB2 certification to 10.5. There was a ton of information,
and I’ll give you a summary of what I learned along with many random tips and
tricks. This year I concentrated more on Big Data, but still had time to get
to some great DB2 sessions. If you attended the conference, as you should
have, then you can download many of the presentations from the Conference
Presentation Search Page. As time has gone on many more have been added, so
try again if something you wanted was not available on the site when you first
got back. I look forward to seeing you at the conference next year!
Also, please add anything else you thought was really interesting to my db2Dean Facebook Page or to the
“Message Board” section of my db2Dean Community page.
Big Data
Big data is getting a lot of attention lately so I went to a
number of sessions on this topic. Big data does not seem to have any one
definition in the industry as far as I can tell so I’ll give my opinion on what
it is. It is the capability to store a large quantity of data and the ability
to analyze that data efficiently. The data can be in several forms including
structured relational data, unstructured data, NoSQL data and large files.
This data can be in traditional databases, Hadoop clusters and other
platforms. The ability to analyze the data will depend on a set of tools since
no one tool is going to be able to analyze all data formats. IBM provides a
set of tools to store both your non-relational (think Hadoop clusters) and
relational data (databases hold the relational stuff) and several different
tools to analyze and use the data. Several of these tools are included in the BigInsights
offering. Here I’ll give some of the session highlights about various tools
for Big Data:
- Hadoop Cluster Software. Hadoop is much too big a topic to
explain here, but there is much written about it. It is open
source software. BigInsights provides this open source software and has
some add-ons that make Hadoop more enterprise-friendly, easier
to implement, more highly available and more secure.
- Analysis Tools. BigInsights provides, or allows you to
add on, a number of tools that make it easy to analyze data in a
Hadoop cluster, including open source and proprietary tools for
different types of data such as text, social media feeds and sensor
data. These tools either eliminate the need to write
Map/Reduce code or make it easier to write custom code. There are also
more traditional tools like Cognos for analyzing structured and
unstructured data. One of the big use cases is finding meaningful patterns in
your data, such as signs that fraud might be occurring.
- Streams. InfoSphere Streams analyzes data as it moves
through your system. It can process data as it moves from mechanical
sensors, social media feeds, incoming files and other sources as that data
is received. Processing can include early recognition of patterns and
alerting someone that a pattern is developing. For example, a bank
might be able to see that a certain pattern of credit card use is forming
across the world that indicates a certain type of fraud is beginning to
occur and alert people who can do something about it.
- Data Explorer. This tool can crawl through an enterprise’s
data stores including all relational databases, Hadoop clusters, file
servers and other stores to index what is there and allow fast search
across all of those sources. This is much like the way your favorite
Internet search engine crawls the internet, indexing web pages and
allowing fast search even though the search topics are not known in
advance. It also allows the enterprise to determine where their data is
located.
- BigSQL. It allows developers who already know how to
write SQL to query certain types of data in a Hadoop cluster using SQL.
This SQL access is provided through JDBC/ODBC drivers. Behind the scenes
it generates map/reduce code to read the Hadoop files.
- Hadoop/DB2 interfaces. These allow you to easily move data
between Hadoop and DB2.
- BigSheets. Provides a spreadsheet-like interface to Hadoop
data and allows the user to easily view data and to export data and
reports without doing any programming.
- SPSS. Helps you use all of your data to predict what is
likely to happen to allow you to make better decisions.
- A number of sessions did not focus on particular tools,
but instead described how different enterprises solved problems with big
data. So if you want to get more of a big picture of how big data is
being used, then I highly recommend that you attend IOD next year.
- IBM Big Data tools are designed to be used on any vendor’s
databases and the major vendors’ Hadoop clusters.
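To make the BigSQL bullet above a little more concrete, here is a toy Python sketch of how a simple GROUP BY query can be broken into a map step and a reduce step. It only illustrates the general map/reduce idea. It is not the code that BigInsights generates, and the sample rows, the column layout and the function names (`map_phase`, `shuffle`, `reduce_phase`, `run_query`) are all made up for this example.

```python
from collections import defaultdict

# Toy "Hadoop file": each record is one line of comma-separated text.
# The rows and the (order_id, state, amount) layout are invented for illustration.
LINES = [
    "1001,UT,25.00",
    "1002,CO,10.50",
    "1003,UT,4.50",
    "1004,NV,99.99",
    "1005,CO,1.00",
]

# The query being imitated:
#   SELECT state, COUNT(*), SUM(amount) FROM sales GROUP BY state


def map_phase(line):
    """Parse one record and emit a (key, value) pair keyed by the GROUP BY column."""
    _order_id, state, amount = line.split(",")
    return state, float(amount)


def shuffle(pairs):
    """Group all emitted values by key, as a framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Collapse every value for one key into the aggregate columns."""
    return key, len(values), round(sum(values), 2)


def run_query(lines):
    """Run map, shuffle and reduce, returning rows sorted by state."""
    groups = shuffle(map_phase(line) for line in lines)
    return sorted(reduce_phase(key, values) for key, values in groups.items())


if __name__ == "__main__":
    for row in run_query(LINES):
        print(row)
```

The point of the sketch is that the person writing the SQL never has to think about the map, shuffle and reduce steps; a layer underneath does that translation.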
BLU Acceleration
BLU Acceleration is the flagship new feature in DB2 10.5
that can improve the performance of analytical queries by orders of magnitude.
Since it only changes the implementation and processing of tables, no SQL or
coding changes are needed in applications to use it. It is a combination of
technologies that can make queries run much faster. These technologies include
storing data by column instead of by row, which makes for better compression and
more efficient I/O; new caching algorithms that give in-memory speeds even when
all of the data does not fit into memory; recognizing and exploiting processor
technologies to get more work done in each CPU cycle; and skipping groups
of rows that do not help the query. Since it does not use indexes, MDCs, MQTs
or other such objects, it is great when you don’t really know what queries will look
like. To learn more about this amazing technology, please read chapter 3 of
the DB2 10.5 with BLU Acceleration book. You can even try it
for free through about mid-February by registering for BLU for Cloud.
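Two of the ideas above, storing data by column and skipping groups of rows that cannot help the query, are easy to picture with a toy example. The Python sketch below keeps one column in small blocks, remembers the minimum and maximum value of each block, and uses those values to skip blocks. It is only a simplified illustration and not how BLU is actually implemented; the block size, the sample data and the function names are assumptions made for the example.

```python
# Toy column store: one column is cut into fixed-size blocks, and each block
# remembers its min and max value so a query can skip blocks that cannot match.
BLOCK_SIZE = 4  # tiny on purpose; an assumption for this example


def build_column(values, block_size=BLOCK_SIZE):
    """Split one column into blocks and record (min, max, values) per block."""
    blocks = []
    for start in range(0, len(values), block_size):
        chunk = values[start:start + block_size]
        blocks.append((min(chunk), max(chunk), chunk))
    return blocks


def count_greater_than(blocks, threshold):
    """Count values above threshold, skipping blocks that cannot qualify.

    Returns (matching_count, blocks_scanned).
    """
    matches = 0
    scanned = 0
    for _low, high, chunk in blocks:
        if high <= threshold:
            continue  # no value in this block can be above the threshold
        scanned += 1
        matches += sum(1 for value in chunk if value > threshold)
    return matches, scanned


if __name__ == "__main__":
    # Sale years arrive roughly in order, so each block covers a narrow range.
    year_column = [2010, 2010, 2010, 2011,
                   2011, 2011, 2012, 2012,
                   2012, 2013, 2013, 2013]
    blocks = build_column(year_column)
    matches, scanned = count_greater_than(blocks, 2012)
    print(matches, "matching rows found by scanning", scanned, "of", len(blocks), "blocks")
```

Because only the one column named in the query is read, and most of its blocks are skipped, far less data is touched than in a row-by-row scan of the whole table.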
DB2 10.5
Of course some of the biggest news at IOD this year was
about the newly released DB2 version 10.5. I discussed a number of facets
of the new release in my DB2 10.5 article,
but I learned a number of new items at the conference. Some of these
features are in all editions and some are only in certain editions. To
see if the feature that interests you is in your favorite edition, see the Functionality
by Edition page. Here are some of the more interesting ones, some of
which have been in earlier versions:
- The packaging/editions of DB2 and InfoSphere Warehouse (ISW) have
changed in DB2 10.5. We have now introduced DB2 Advanced Workgroup
Server Edition, which contains nearly all of the optional features of DB2.
You can read about what you get in each of the editions on the 10.5
Editions Page.
- IBM does not offer the InfoSphere Warehouse editions anymore.
They have been replaced with the DB2 Advanced Editions.
- After you have installed any edition of DB2 (except the Express editions)
you can switch to any other edition by just replacing the license key.
For example if you are running DB2 Enterprise Server Edition 10.5 and you
want to switch to DB2 Advanced Workgroup Server Edition 10.5, then you
merely change the license key using the db2licm command and you will then
be running AWSE.
- As of DB2 10.1 Fix Pack 2, you can use the ADMIN_MOVE_TABLE
procedure to move tables that have foreign keys.
- Use the SYSPROC.ADMIN_GET_TAB_COMPRESS_INFO table function
to estimate the space savings that you could get for a table or all tables
in a schema if you were to use compression.
- Use the command “db2pd -tablespaces trackmodstate” to see
if any pages have changed in the tablespace since the last backup. The
TRACKMOD configuration parameter must be on for this information to be
gathered.
- If you add new partitions to a table by loading a new
table first that has compression turned on and then adding or attaching it
to the partitioned table, then you can have a different static compression
dictionary for each partition. So in cases where your newest rows go into
the newest partitions, using table partitioning is a way to keep your
static compression from degrading over time.
- Don’t forget about the “RECLAIM EXTENTS” option introduced
in 10.1 to the REORG command that allows you to do a lightweight reorg
that just gives unused extents from tables and indexes back to their
respective tablespaces.
- As of 9.7 you can create Reclaimable Storage Tablespaces
that can be easily reduced in size using the ALTER TABLESPACE command
without worrying about the high water mark. If you have upgraded from an
earlier version, you need to move tables from old tablespaces to new ones,
possibly using the ADMIN_MOVE_TABLE() procedure, to be able to use this
feature.
- DB2 10.1 also introduced archive log compression in all
editions using the LOGARCHCOMPR1 and LOGARCHCOMPR2 database configuration
parameters. There are few cases where this should not be used.
- DB2 10.1 introduced system-period temporal tables, so don’t
write a bunch of triggers if you need to keep a history of changes to any
of your tables.
- Remember the db2cos and db2fodc commands when you run into problems
like lock timeouts and system hangs.
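The bullet above about giving each partition its own static compression dictionary is easier to appreciate with a small experiment. The Python sketch below uses zlib preset dictionaries as a stand-in. This is not the algorithm DB2 uses for row compression, and the sample rows are made up, but it shows the same effect: a dictionary built from old data helps much less once the new rows stop looking like the old ones.

```python
import zlib


def compress_row(row: bytes, dictionary: bytes) -> bytes:
    """Compress one row using a preset (static) dictionary."""
    compressor = zlib.compressobj(level=9, zdict=dictionary)
    return compressor.compress(row) + compressor.flush()


def decompress_row(blob: bytes, dictionary: bytes) -> bytes:
    """Reverse compress_row; the same dictionary must be supplied."""
    decompressor = zlib.decompressobj(zdict=dictionary)
    return decompressor.decompress(blob) + decompressor.flush()


# Invented sample data. OLD_SAMPLE stands for the rows that were in the table
# when its dictionary was first built; NEW_SAMPLE stands for rows loaded into
# the newest partition.
OLD_SAMPLE = (b"2009-03-14|CONTROL CENTER|DENVER|CLOSED\n"
              b"2009-04-02|CONTROL CENTER|BOISE|CLOSED\n")
NEW_SAMPLE = (b"2013-11-04|BLU ACCELERATION|LAS VEGAS|OPEN\n"
              b"2013-11-05|DATA STUDIO|LAS VEGAS|OPEN\n")

# A row that arrives after both dictionaries exist.
NEW_ROW = b"2013-11-06|BLU ACCELERATION|LAS VEGAS|OPEN\n"

if __name__ == "__main__":
    stale = compress_row(NEW_ROW, OLD_SAMPLE)
    fresh = compress_row(NEW_ROW, NEW_SAMPLE)
    print("uncompressed row bytes:", len(NEW_ROW))
    print("bytes using the old dictionary:", len(stale))
    print("bytes using the new dictionary:", len(fresh))
```

When you run it, the row compressed with the dictionary built from recent data should come out noticeably smaller than the one compressed with the old dictionary, which is the reason a fresh dictionary per partition keeps compression from degrading.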
Upgrading to DB2 10.5
Melanie Stopfer gave a great presentation on how to upgrade to
DB2 10.5. For lots of other good upgrade information see the DB2
Upgrade Portal. The bullets here on this topic are just some highlights
from her presentation:
- If you are currently running on DB2 9.5 or earlier, you
will need to upgrade to 9.7 or 10.1 before you can upgrade to 10.5.
- IBM does not offer the InfoSphere Warehouse (ISW) editions
anymore. They have been replaced with the DB2 Advanced editions. If you
are currently running ISW then your upgrade path is to go to DB2 10.5.
You can upgrade ISW v9.7 or later versions to DB2 10.5 directly. Depending
on the edition of ISW you are using, you will either see DB2 Advanced
Workgroup Server Edition or DB2 Advanced Enterprise Server Edition show up
in your Passport Advantage account as a replacement.
- Use the db2batch utility to benchmark your important
queries before and after the upgrade to ensure that you have not caused
any problems.
- A number of operating system versions are no longer
supported. Make sure to view the system requirements page in the DB2
10.5 information center before starting your upgrade.
- You can run the “db2prereqcheck -v 10.5.0.0” and the
db2ckupgrade commands to verify that your system and instance are ready to
be upgraded. The db2ckupgrade check is run automatically when you upgrade an
instance, but it is nice to know beforehand if you can upgrade.
- After upgrading instances on Linux and UNIX systems, you
can run the db2val command to validate your instances.
- You will need to get the 10.5 license key (Activation File) and
apply it after the upgrade.
- An SSH server will be installed and a service will be
created for it when you install DB2 10.5 on Windows. This allows Data
Studio to perform certain commands like starting an instance from a remote
client without using the DAS.
- You will need additional disk space during the upgrade and
the SYSCATSPACE and TEMPSPACE1 table spaces will likely need to be
enlarged. See the Upgrade Portal for more information. You will also need
to increase logging parameters.
- It is a good idea to clean out the diagnostic log path (diagpath) before
starting your upgrade so it is easy to see what diagnostic files came from
the new version.
- The UPGRADE DATABASE command now has the REBINDALL
parameter to rebind all of your packages during the upgrade.
- Instead of binding the various client versions you may
have from those clients, you can just download all of the client bind files to
your database server and bind them there. This way you don’t have to
worry about clients at supported versions failing because their packages are not bound to
your database. You can find them at the bind file site.
- It is a good idea to upgrade the explain tables after
upgrading the database so that people can continue to explain their
queries. Use the db2exmig command to do this.
- If your database started life as an older pre-9.7 version,
it is a good idea to verify that all of your DMS and automatic storage
tablespaces are LARGE tablespaces. You can determine this by checking the
DATATYPE column of the SYSCAT.TABLESPACES catalog view and then alter the tablespace
with the CONVERT TO LARGE parameter to change regular tablespaces to
large. LARGE refers to the number of bits used in the RID and not the type
of data stored in it.
- If you upgrade from a pre-10.1 database and are using
compression, make sure to alter your compressed tables using the COMPRESS
YES ADAPTIVE clause to begin using adaptive compression in addition to the static
compression already being used.
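The first bullet in the list above describes the upgrade path rule: 9.7 and 10.1 can go straight to 10.5, while 9.5 and earlier need a stop along the way. As a small planning aid, here is a Python sketch of that rule. The function names are made up, and the choice of 9.7 as the intermediate stop is an assumption for the example (the rule above allows 10.1 as well). The sketch does not check whether a very old release can reach 9.7 in a single step, so confirm that on the Upgrade Portal.

```python
TARGET = "10.5"
# Assumption for this sketch: releases older than 9.7 stop at 9.7 first.
INTERMEDIATE = "9.7"


def parse_version(text):
    """Turn a version string such as '9.7' into a comparable tuple (9, 7)."""
    return tuple(int(part) for part in text.split("."))


def upgrade_hops(current):
    """Return the releases to upgrade to, in order, to reach DB2 10.5."""
    version = parse_version(current)
    if version >= parse_version(TARGET):
        return []  # already at or beyond the target release
    if version >= parse_version(INTERMEDIATE):
        return [TARGET]  # 9.7 and 10.1 can go straight to 10.5
    return [INTERMEDIATE, TARGET]  # 9.5 or earlier needs a stop on the way


if __name__ == "__main__":
    for release in ("9.1", "9.5", "9.7", "10.1", "10.5"):
        print(release, "->", upgrade_hops(release) or "nothing to do")
```

Comparing the versions as tuples of integers rather than as strings matters here, because as plain text "10.1" would sort before "9.7".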
IBM Data Studio
With the release of DB2 v10.1 there is no more Control Center and you should start using the IBM Data Studio.
The Client Configuration Assistant is also gone. Nearly all functions of
Control Center are now in Data Studio, plus it has several others such as a
procedure builder and debugger and a feature to generate scripts to change
database schema while preserving data. For a summary of the tool, please
see my series of articles starting with Data Studio Update Part 1
or see my Data Studio Web Console article to see how to do health monitoring and task
management. A number of improvements have been made with the release of
Data Studio 4.1 including being able to administer the new features of DB2
v10.5 like BLU.
***
I hope that you found at least a few pieces of information in this article to be new and
useful. I hope to see you at the conference next year.