Written by Maria Schwenger, PhD
DB2 Open Database Technologies Group
30 March 2010
IBM® DB2® pureScale™ is a new optional feature in the Enterprise Server Edition of DB2 9.8. Based on technology from IBM DB2 for z/OS®, it delivers high and continuous availability, unlimited capacity, and exceptional, almost linear scalability, while staying completely transparent to applications and users. This article gives a short overview of the solution and its components, and suggests when you should consider adopting it for your business needs. Please send us your feedback, ideas, and questions; they are always greatly appreciated!
Short introduction to DB2 pureScale
DB2 pureScale is a cluster-based, shared-disk architecture that reduces costs through efficient use of system resources. It allows you to scale out your database across a set of servers in an “active-active” configuration, delivering high levels of availability and scalability along with application transparency.
Designed mainly for OLTP workloads with many concurrent transactions, DB2 pureScale scales out across a large number of members: tests with up to 128 members show near-linear scalability. As soon as a new member is added to the cluster and becomes active, it can start processing transactions. In addition, IBM offers a flexible licensing option for adding “capacity on demand”, which allows you to extend a DB2 pureScale cluster with new members for a specific period of time and pay for them only for the time used. This is especially useful at times like end-of-year, when adding more processing power to your DB2 pureScale cluster can absorb the increased number of users. When the end-of-year campaign is over, you can remove the members that are no longer needed and save money.
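To make the “capacity on demand” argument concrete, here is a minimal arithmetic sketch in Python. Every number in it (member counts, the length of the peak, the monthly cost per member) is a hypothetical placeholder chosen for illustration, not IBM pricing, and the function name is an assumption of this example.

```python
def annual_member_cost(base_members, peak_members, peak_months, monthly_cost):
    """Compare two ways of covering a seasonal peak (hypothetical cost model).

    always_on: keep enough members for the peak running all 12 months.
    on_demand: run the base members all year and pay for the extra
               members only during the peak months.
    """
    extra_members = peak_members - base_members
    always_on = peak_members * 12 * monthly_cost
    on_demand = (base_members * 12 + extra_members * peak_months) * monthly_cost
    return always_on, on_demand


# Hypothetical figures: 4 members normally, 6 for a 2-month year-end peak.
always_on, on_demand = annual_member_cost(
    base_members=4, peak_members=6, peak_months=2, monthly_cost=1000
)
print(always_on, on_demand)  # prints: 72000 52000
```

Under these assumed figures, paying for the two extra members only during the peak costs 20000 less per year than keeping them all year; the real saving depends entirely on the actual licensing terms.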
Another big benefit of DB2 pureScale is that you do not need to change your application or redistribute data when running against a DB2 pureScale cluster. The structure of the cluster is transparent to users and applications, and they are not affected when DB2 pureScale member servers are added or removed. Thanks to automatic workload balancing, the workload within the cluster is redistributed among the members to include any newly added members. When a member is removed, its workload is automatically redirected to the remaining active members.
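The effect of workload balancing described above can be pictured with a toy model. The Python sketch below is only an illustration under simplifying assumptions: the class `ToyCluster`, its methods, and the “send work to the least-loaded member” rule are all hypothetical and do not reflect the actual DB2 pureScale balancing algorithm.

```python
class ToyCluster:
    """Toy model: each unit of work goes to the least-loaded member."""

    def __init__(self, members):
        # member name -> number of units of work currently assigned
        self.load = {name: 0 for name in members}

    def route(self):
        # Pick the member with the lowest load (name breaks ties).
        target = min(self.load, key=lambda name: (self.load[name], name))
        self.load[target] += 1
        return target

    def add_member(self, name):
        # A new member starts empty and immediately attracts new work.
        self.load[name] = 0

    def remove_member(self, name):
        # Work from the removed member is re-routed to the survivors.
        displaced = self.load.pop(name)
        for _ in range(displaced):
            self.route()


cluster = ToyCluster(["member0", "member1"])
for _ in range(10):
    cluster.route()            # 5 units land on each member

cluster.add_member("member2")
for _ in range(6):
    cluster.route()            # the new member absorbs most of the new work

cluster.remove_member("member1")
print(cluster.load)            # prints: {'member0': 8, 'member2': 8}
```

Note that the caller only ever invokes `route()`; it never needs to know how many members exist, which is the point of application transparency.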
For efficient coordination and communication within the cluster, DB2 pureScale uses Remote Direct Memory Access (RDMA) as its standard communication mechanism. Over an InfiniBand interconnect (cards, switches, and cables), the system can execute certain I/O operations without interrupting the central processors of the hosts involved and without passing through the kernel, which saves processor resources for actual database transaction processing.
By utilizing IBM PowerHA pureScale technology and a redundant architecture, DB2 pureScale provides unmatched continuous availability for both planned and unplanned events. If a component fails, failover is managed automatically and the system recovers nearly instantaneously, immediately redistributing the workload to the surviving members. The remaining members continue processing transactions, and only the small number of “in-flight” transactions from the failed member(s) are rolled back during automatic recovery.
DB2 servers can be taken offline for planned maintenance of hardware and software while the rest of the cluster keeps working, so service continues without interruption. As the example in Listing 1 demonstrates, this can be done with a few easy commands:
1. Ensure that automatic client routing (default) or transaction-level workload balancing is enabled
2. db2stop [member] 1 quiesce [timeout]
3. db2stop instance on <hostname>
4. db2cluster -cm -enter -maintenance (executed on <hostname>)
5. Perform desired maintenance (e.g. install AIX PTF)
6. db2cluster -cm -exit -maintenance (executed on <hostname>)
7. db2start instance on <hostname>
8. db2start [member] 1
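The steps above lend themselves to a scripted checklist. The following Python sketch only builds and prints the command sequence for steps 2 to 8; it does not execute anything. The function name `maintenance_plan`, the host name `host1`, and the timeout value 30 are assumptions made for this example, and the command strings are copied from the listing, written with plain ASCII hyphens as a shell expects.

```python
def maintenance_plan(hostname, member_id, timeout=None):
    """Return (run_on, command) pairs for steps 2-8 of the listing.

    run_on is the host the listing names for that step, or None where
    the listing does not say where the command is issued.
    """
    stop_member = f"db2stop member {member_id} quiesce"
    if timeout is not None:
        stop_member += f" {timeout}"   # optional quiesce timeout
    return [
        (None, stop_member),
        (None, f"db2stop instance on {hostname}"),
        (hostname, "db2cluster -cm -enter -maintenance"),
        (hostname, "# perform desired maintenance here"),
        (hostname, "db2cluster -cm -exit -maintenance"),
        (None, f"db2start instance on {hostname}"),
        (None, f"db2start member {member_id}"),
    ]


# Dry run for a hypothetical host and member; nothing is executed.
plan = maintenance_plan("host1", 1, timeout=30)
for run_on, command in plan:
    print(f"[{run_on or 'unspecified'}] {command}")
```

Keeping the plan as data makes it easy to review the sequence before running it by hand, and to confirm that entering and exiting maintenance mode are always paired.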
When to consider DB2 pureScale?
It is hard to summarize all of the capabilities DB2 pureScale offers in a few bullet points, but in general you should consider DB2 pureScale when any of the following (or a combination of them) applies:
• You need a high-availability solution for OLTP systems with strict uptime requirements
• You are looking to increase the workload capacity and scalability of your systems
• You need a scalable “active-active” solution for your OLTP applications
• You do not want to change applications in order to achieve a scalable cluster implementation
• You need to add processing power on demand (for peak times) and want to pay only for what you use
• You do not want to negotiate and schedule maintenance downtime
• You need to upgrade your OS and hardware with no downtime
• You would like a single solution that manages your entire cluster
• You would like a single, centralized installation and upgrade process
Structure of DB2 pureScale cluster
Based on the undisputed gold standard of reliability in data sharing architecture, System z, DB2 pureScale deeply integrates different hardware and software technologies such as an InfiniBand network, Tivoli System Automation, and GPFS. Although it combines all these technologies, DB2 pureScale is a single standalone solution with a single automated installation process for all software components. This matters because complexity in an HA solution is never a good sign: the more complex the system, the more potential points of failure there are and the harder it is for users to adopt.
The architecture of DB2 pureScale is actually easy to understand. As shown in Chart 1, it is a shared-data architecture that consists of several main components:
• Clients - provide connectivity and automatic workload balancing;
• Member servers - the DB2 engine runs on several hosts to process the data, providing coherent access to the database from any member;
• PowerHA pureScale server(s) - derived from System z Parallel Sysplex and Coupling Facility technology. Inside the cluster, the PowerHA pureScale server is the “traffic cop” that coordinates access to the shared data by multiple members and ensures that data is consistent across all members. It provides locking and cache coherency services to all the members. It is also known as the Coupling Facility on DB2 for z/OS, so you may see it abbreviated as CF in many charts and documents;
• Low-latency, high-speed interconnect - optimized to take advantage of RDMA-capable interconnects (e.g. InfiniBand);
• Integrated cluster services - used for failure detection, recovery automation, and so on; developed in partnership with STG and Tivoli;
• Data sharing architecture - the database resides on shared storage (GPFS), so every member in the cluster can access any page in the database at any time. The logs of each member are placed on the same shared storage, so in case of recovery they are available to the other members.
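The “traffic cop” role of the PowerHA pureScale server (CF) in the list above, that is, central locking plus cache coherency, can be pictured with a toy model. The Python sketch below is purely illustrative: the class `ToyCouplingFacility`, its method names, and the page-level exclusive locks are assumptions of this example and say nothing about the real CF protocol or its data structures.

```python
class ToyCouplingFacility:
    """Toy central coordinator: exclusive page locks + cache invalidation."""

    def __init__(self):
        self.lock_owner = {}   # page -> member holding the exclusive lock
        self.cached_by = {}    # page -> members holding a valid cached copy

    def register_read(self, member, page):
        # Remember that this member now caches the page.
        self.cached_by.setdefault(page, set()).add(member)

    def acquire(self, member, page):
        # Grant the lock only if no other member holds it.
        owner = self.lock_owner.get(page)
        if owner is not None and owner != member:
            return False
        self.lock_owner[page] = member
        return True

    def commit_write(self, member, page):
        # Only the lock holder may publish a new version of the page.
        if self.lock_owner.get(page) != member:
            raise RuntimeError(f"{member} does not hold the lock on page {page}")
        stale = self.cached_by.get(page, set()) - {member}
        self.cached_by[page] = {member}   # other cached copies are now invalid
        del self.lock_owner[page]         # release the lock
        return stale


cf = ToyCouplingFacility()
cf.register_read("member0", 7)
cf.register_read("member1", 7)

first = cf.acquire("member0", 7)          # granted
second = cf.acquire("member1", 7)         # refused: member0 holds the lock
invalidated = cf.commit_write("member0", 7)
third = cf.acquire("member1", 7)          # granted after the release
print(first, second, invalidated, third)  # prints: True False {'member1'} True
```

Because every lock request and every invalidation passes through one coordinator, no two members can update the same page at once, and a member never keeps using a cached copy that another member has changed.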
For More Information
DB2 pureScale answers one of the hottest requests in the market today: an active-active, “scale-out” high-availability architecture built with performance, continuous availability with automated recovery, and application transparency in mind. Many companies are actively researching and considering this solution for their systems with strict uptime requirements. Working with the IBM DB2 Lab gives our customers and partners numerous ways to get first-hand information and to experience DB2 pureScale in action. To learn more about how you can try IBM DB2 pureScale for free, please contact us at firstname.lastname@example.org or visit ibm.com/db2/pureScale. To find out about the latest events on IBM DB2 pureScale, follow DB2EmergingTech (http://twitter.com/DB2EmergingTech) on twitter.com (http://twitter.com).
More technical information is also available on IBM developerWorks.