Use Static SQL to Enhance Performance
25 July 2011
Are you looking for a way to easily make your transactional applications perform better or to reduce the CPU resources needed on your database server while at the same time improving security? You can do all of these things with Static SQL in DB2! Yes static SQL has been around for a very long time, but we now have ways to allow you to start using it without modifying your applications. It helps for all flavors of DB2 including LUW, Z and I. In this article I’ll describe what static SQL is, why it performs better and provide links to where you can learn how to use it.
What is static SQL?
Static SQL is SQL that has been compiled into the database using the DB2 “BIND” command or an API that calls BIND for you. Once the SQL has been bound then the application references the already compiled SQL statements in the database instead of having the database evaluate the entire query again and again, saving the database the work of compiling the query each time. You can bind a file of several SQL statements using one bind command and you will typically bind all of the SQL statements issued by one application with one bind command. With newer tools, you can use static SQL without rewriting your applications. Figure 1 below, provided to me by Mr. Vijay Bommireddipalli, shows why there is less work for the database server to use static SQL at run time:
Figure 1: Static vs. Dynamic SQL Steps at the Database Server (Vijay Bommireddipalli)
As you probably already know, this diagram shows what is done if statement is not already in the package cache. When a dynamic query is found in the package cache, it does not need to go through the entire set of steps shown in figure 1, but static is still more efficient as there are still fewer steps. For example, the only authorization that needs to be examined for static packages is whether the issuing user has execute permission, where as the authorizations for every table in the query need to be examined for dynamic SQL.
In addition to the static SQL performing better, it will normally also perform at a consistent speed every time a statement is executed because it will always use the same access path that was determined by the optimizer at bind time. That is, if the optimizer chose to use a particular index when the package was bound then it will continue to use it until the package is rebound. Since dynamic SQL can choose a different path every time it is run it can give “surprising” performance as Bryan Smith said at the May 2011 Utah DB2 User Group meeting. Bryan further noted (and I agree) that not all surprises are good surprises though!
From a performance perspective some databases will benefit from static SQL more than others. Ones that will typically benefit most are databases that have a high volume of unique SQL statements. The more statements the database must processes, the more value that you get from saving resources on each statement. Some customers have seen very noticeable improvements in speed and reduced resource consumption on their database servers. It should be noted here that although this is called “static” SQL you can still have parameter markers or host variables in the predicate (where clause) of the queries that can be replaced by various values. The application just passes the values of the parameter markers to the database along with the call to the static query.
When DB2 compiles any SQL, whether it done statically or dynamically, the output including the access path to the tables, is put into a database object called a package. The package for a dynamic statement is only placed in the package cache, and is deleted when room is needed in the cache for newer queries or when the cache is flushed. When a package is created for SQL by a bind command, then the package is stored in the database catalog and placed in the package cache when needed. The developerWorks article called “DB2 packages: Concepts, examples and common problems” gives a very good description of what packages are and describes packages created from “embedded SQL” applications and system packages. Since it was written additional ways of creating the packages have been introduced, but the concepts are still quite valid.
Traditionally, the use of static SQL has been done through the use of “embedded SQL” where the SQL is embedded in the application programming language like COBOL. This is still a viable option and is still extensively used, but we now provide tools like Optim pureQuery Runtime that allow you to use static SQL without having to design your application for it.
Embedded SQL and SQLJ
Embedded SQL has been used since the beginning of DB2 on the mainframe and has been part of DB2/LUW (Linux, UNIX and Windows) for most of its life as well. When you use embedded SQL, your program is first run through a pre-compiler that has two main output files. The first is a file of the SQL extracted from the program. This is called a bind file in the LUW world and a DBRM in the mainframe world. The other main output is a copy of the input source code with the SQL replaced with function calls to the static package that will be created. Each SQL statement is given a unique identifier. After the pre-compiler step is completed the bind file is bound to the database using the BIND command and the new copy of the source program is run through the regular language compiler, for example the COBOL compiler.
As you know JAVA is not a compiled language, but it can still use static SQL in a similar fashion to embedded programs using an ISO/ANSI standard language extension called SQLJ. SQLJ is run through a pre-processor that performs a similar function to the pre-compiler described above. SQLJ has enjoyed only limited acceptance so I’m not going to talk much more about it here, but you can read the SQLJ developerWorks article if you are interested.
Optim pureQuery Runtime
With Optim pureQuery Runtime you can make a program that uses the IBM type-4 JDBC driver or CLI driver use static SQL instead of dynamic SQL without changing the program. At a high level the steps involved in doing this are:
There are two main ways to get the SQL from a program into a file to be bound to the database. The first is for JAVA programs that can be created in the Optim Development Studio environment or imported into it. This does not really work for purchased applications or applications that use a framework like Hibernate to generate the SQL, so I will not spend much time on it. In this scenario, the Optim Development Studio can just extract the SQL directly from the program. This workstation tool is now available at no charge if you own DB2/LUW.
The other way, that is applicable to all applications that use the IBM JCC driver (JDBC type-4) or the CLI, is to have pureQuery watch the dynamic SQL that passes through the driver and put a copy of every statement into a file. This can be done in a development or production environment. The good thing about capturing the SQL in a development environment is that you can force the application to run every statement that it can ever run so that you know everything has been captured. Although this is not exactly how pureQuery works, the way that I think of it is that it plugs into your JCC driver or CLI driver. As queries go through that driver on their way to the database, pureQuery sees them and places them into a file.
Once you have the SQL in a file you can then run a prueQuery command that binds the SQL in the capture file to the database. Once that is done you can change a setting in the pureQuery properties that tell it that when it sees a dynamic statement passing through the JCC or CLI driver that is in the SQL file created in the capture process then it should forward a call the static package instead. This way the database only sees requests for the static versions of the captured statements. For queries that are not in the capture file you can optionally allow them pass to the database as a dynamic statement or return an error to the program without passing it to the database. Doing the later can help you stop SQL Injection hackers.
For detailed instructions for capturing, binding and substituting static queries, please read the pureQuery Client Optimization Tutorial.
It should be noted that the static SQL substitution or “Client Optimization” is only a small part of what Optim Development Studio and pureQuery do and they have many more features that can be of great use to you.
In addition to the performance benefits, exclusively using static SQL has a great security benefit – none of your users or the generic user ids that application servers use to connect to your database have to be granted any authority on any production tables at all! That’s right! None of your regular users can look at or change any tables directly, even if they are SQL experts. The reason is that the users ids connecting to the database for regular work only need to be granted EXECUTE permission on the packages that they need. This allows them to run the SQL in the packages, but not to issue any SQL that they might write. They do not need any direct access to the tables. Only the person who binds the package needs to have authorities to the tables. This way even if a hacker gets into your application server or somehow gets access to your database, he can not select any tables. He would only be able to execute packages and he would have to know the identifiers of the packages and SQL statements within them to even be able to do that.
It is known to be a good practice to run RUNSTATS regularly. However, if you are using static SQL the new statistics will not be used by existing packages. This begs the question about when to rebind your packages to pick up the newest statistics. In general you will only need to rebind your packages on an infrequent basis, say quarterly or annually to pick up the latest statistics. This assumes that no tables in the database are radically changing in size. However, there are a number of reasons that you may want to rebind certain packages:
You can rebind all packages in the database with one command: the db2rbind command.
Whereas static SQL has many advantages and is useful for most applications, it is not always the best. This article gives some guidelines for when to use Static SQL instead of Dynamic.
I hope that you found this discussion of DB2 Static SQL useful. Please “like” my Facebook db2Dean page at facebook.com/db2dean and add any feedback about what you liked or what I can improve. You don't have to be a member of Facebook to add comments. I can only make things better with your help!