isham research

The IBM mainframe starts to slide beneath the covers

January 2006 update - see DB2 offload processor for zSeries

IBM's zSeries Application Assist Processor (zAAP) announced for the z990 and z890 presents a new way of lowering mainframe software costs.

The concept of "offload engines" is old - for many years there were rumours (especially when IBM was competing with Amdahl and Hitachi) that significant DB2 functionality would be hived off into special purpose database processors. Pure FUD, as it later turned out. Various mainframes had so-called "Vector Processors" added at one time. Over the years the motivation has changed - a desire to differentiate against competition or improve performance has become a need to reduce costs. Especially software costs. So the primary advantage of offloading processor cycles from z/OS is a reduction in the perceived size of the processing image - and thus software charges. But, of course, if the cycles are taken off the mainframe then IBM sells less - and the profit margins on mainframes are good.

So the trick is to provide functionality within the mainframe environment, but hide it from standard software pricing algorithms. Leaving out trivia, the first technique was supporting Linux on the zSeries mainframe in a dedicated partition that cannot run the z/OS operating system - the Integrated Facility for Linux, or IFL. Thus zSeries can support, e.g., web-serving applications that access large databases stored in a mature z/OS environment within the same system without impacting charges for unrelated systems.

Although the world's attention is usually directed to the "blade" scene, zSeries actually gets more distinct functions out of its processors than any other platform. A zSeries processor can be defined as a "standard" instruction processor, as a System Assist Processor (SAP), as a Coupling Facility (CF), as a z/OS.e only processor, as an IFL and now as a zAAP. This flexibility within a single product is remarkable.

The zAAP is the next stage in this development. As announced, it will execute Java code within a z/OS LPAR (from z/OS 1.6 onwards, requiring the Java 1.4 SDK, which is not available until 17 December 2004) but outside the normal processor time collection schemes.

The zSeries IFL has a couple of interesting characteristics. First, IBM has committed to maintaining unit pricing - thus moving at least some mainframe functionality onto the same price/performance curve (although not at the same level) as Intel. Second, all IFLs run at the native speed of the hardware even where standard processors are "kneecapped". On the lower-performance models of the z800 and the z890, this produces a quite attractive machine - web serving at hardware speed, and back-office functions in a reduced performance environment to keep software costs down.

These characteristics are shared by the zAAP - and the performance issue becomes really quite interesting. On a z890 120 a zAAP would run at about eight times the speed of the z/OS native processor - 366 MIPS versus 45 MIPS. If IBM were to achieve its goal of offloading up to 50% of Java execution, a heavy Java-based workload could see a drop of around a third in the MSUs required to be configured for z/OS.
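The arithmetic behind these figures can be sketched in a few lines of Python. The 366 and 45 MIPS values and the 50% offload goal come from the text; the assumption that Java makes up two thirds of the workload is purely illustrative, chosen to show how a 50% offload could yield a drop of about a third, and the function names are invented for this sketch.

```python
def speed_ratio(zaap_mips: float, standard_mips: float) -> float:
    """How many times faster a full-speed zAAP is than a sub-capacity standard engine."""
    return zaap_mips / standard_mips


def capacity_reduction(java_share: float, offload_fraction: float) -> float:
    """Fraction of z/OS capacity no longer needed once Java work moves to a zAAP.

    java_share       -- fraction of the total workload that is Java (assumed)
    offload_fraction -- fraction of Java execution that is actually offloaded
    """
    return java_share * offload_fraction


if __name__ == "__main__":
    # MIPS figures from the text; the 2/3 Java share is an assumption.
    print(f"zAAP speed ratio: {speed_ratio(366, 45):.1f}x")
    print(f"z/OS capacity reduction: {capacity_reduction(2 / 3, 0.5):.0%}")
```

With a smaller Java share the reduction shrinks in proportion, which is why the benefit depends so heavily on the workload mix.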

Running an application such as WebSphere under Linux/390 and accessing a DB2 database under z/OS is one thing - the interface is relatively simple and standardised. It's different if other portions of a z/OS workload are offloaded because the interfaces are not so well defined. Java is one such candidate; its much-touted platform independence is reflected in its inefficiency - on all platforms. It's a relatively poor performer under z/OS too (a factor of two or three) and some of the code paths are long. Java was extensively sold to many major corporations at executive level, and it may well be that these applications - especially WebSphere - are now coming into production with unexpectedly painful results. On other platforms (e.g., IBM's own xSeries) regular performance enhancements can deal with the capacity requirement and software costs are not affected.

It may well be that some major users have expressed disquiet. But just passing Java over a microcode fence isn't so easy - there is no neatly encapsulated interface. The code needs to reference the z/OS address space, request z/OS system services and could cause events such as page faults that need to be handled by z/OS.

What this suggests is an engine within the z/OS LPAR - also running z/OS - that exclusively executes Java workloads. All that is necessary is for the Java workload to signal to the system that it is zAAP-eligible - after the next task switch, the workload will be dispatched on a zAAP if one is available. IBM's z/OS Performance: Capacity Planning Considerations for zAAP Processors, pp 2, 4 seems to confirm that the normal z/OS dispatcher is used as the switching mechanism, with an overhead ceiling very roughly guesstimated at 5% but likely to be much lower.
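As a toy illustration of the switching behaviour described above (it is not IBM's dispatcher logic), the decision made at each task switch might look like the Python below. The Task class, its zaap_eligible flag and the choose_engine function are hypothetical names, and the fallback to a standard engine when no zAAP is free is an assumption of this sketch.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """A unit of work; zaap_eligible is a hypothetical flag set by the Java workload."""
    name: str
    zaap_eligible: bool = False


def choose_engine(task: Task, zaaps_free: int) -> str:
    """Decide, at a task switch, which engine type the task runs on next."""
    if task.zaap_eligible and zaaps_free > 0:
        return "zAAP"
    # Assumption: non-Java work, and Java work when no zAAP is free,
    # runs on a standard engine.
    return "standard"


if __name__ == "__main__":
    cases = [
        (Task("java-servlet", zaap_eligible=True), 1),
        (Task("java-servlet", zaap_eligible=True), 0),
        (Task("cobol-batch"), 1),
    ]
    for task, free in cases:
        print(task.name, "->", choose_engine(task, free))
```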

There are other considerations. What, for example, will an ISV that writes in Java think of hiding Java execution from charging software? Such an ISV might want to force execution to remain in the z/OS environment, charged to the z/OS TCB and the customary SMF fields in, e.g., the Type 30 record - and there is a JVM startup switch to achieve this.

One long-standing principle - since OS/360 - has been that of charge predictability. In the old pre-paging days batch jobs were often charged by the wall clock, but multiprogramming - especially with paging - introduced unpredictability. A major investment produced MSUs - a metric that compensated for delays and extra I/O activity caused by "fellow travellers" in the same system. The zAAP introduces a completely new situation - CPU seconds consumed in a zAAP are normalised to the power of the standard CPUs, so a z890 6210 with two zAAPs could record many more CPU seconds than elapsed time multiplied by the total number of processors - making for an interesting capture ratio.
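A minimal sketch of the normalisation effect follows. It assumes every engine is busy for the whole interval and uses a hypothetical speed ratio of 4 between a zAAP and a standard engine; the real ratio depends on the model and is not taken from IBM documentation. The function name is invented for this example.

```python
def normalised_cpu_seconds(elapsed_s: float,
                           standard_engines: int,
                           zaap_engines: int,
                           zaap_speed_ratio: float) -> float:
    """CPU seconds recorded if every engine is busy for the whole interval.

    zAAP time is multiplied by zaap_speed_ratio so that it is expressed
    in standard-engine seconds.
    """
    standard_seconds = elapsed_s * standard_engines
    zaap_seconds = elapsed_s * zaap_engines * zaap_speed_ratio
    return standard_seconds + zaap_seconds


if __name__ == "__main__":
    elapsed = 3600.0          # one hour of wall-clock time
    standard, zaaps = 2, 2    # two standard engines plus two zAAPs
    ratio = 4.0               # hypothetical zAAP-to-standard speed ratio

    naive_ceiling = elapsed * (standard + zaaps)
    recorded = normalised_cpu_seconds(elapsed, standard, zaaps, ratio)
    print(f"elapsed x engines:      {naive_ceiling:.0f} s")
    print(f"normalised CPU seconds: {recorded:.0f} s")
```

Under these assumptions the recorded figure is well above elapsed time multiplied by the engine count, which is the capture-ratio oddity described above.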

The zAAP, like the IFL, has a nominal unit price across hardware generations of $125k, though IFLs have been heavily discounted in recent months and the purchase price is not the whole story - maintenance is often significant, as with the z800 0E1. But this brings a whole new set of variables into decision making. The ostensible purpose of the feature is to reduce the apparent capacity of a z/OS LPAR - and thus its liability for software charges - by shifting cycles off it. To make a business case, a user has to establish that the cycles offloaded will be sufficient to lower the defined capacity of the LPAR enough to justify the cost of the zAAP. Although the JDK has a data collection capability from 1.3.1 onwards, the task of estimating whether and by how much this could reduce charges is far from trivial.
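The outline of such a business case can be sketched as a simple payback calculation. Only the $125k unit price comes from the text; the monthly maintenance and software saving figures below are hypothetical placeholders, and the function name is invented for this example. A real case would also have to model tiered MSU pricing.

```python
from typing import Optional


def payback_months(purchase_price: float,
                   monthly_maintenance: float,
                   monthly_software_saving: float) -> Optional[float]:
    """Months needed for software savings to repay the zAAP purchase.

    Returns None when maintenance eats the whole saving, i.e. the
    engine never pays for itself.
    """
    net_monthly_saving = monthly_software_saving - monthly_maintenance
    if net_monthly_saving <= 0:
        return None
    return purchase_price / net_monthly_saving


if __name__ == "__main__":
    # $125k is the nominal unit price; the other two figures are hypothetical.
    months = payback_months(125_000, monthly_maintenance=1_000,
                            monthly_software_saving=6_000)
    print(f"payback period: {months:.0f} months")
```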

IBM's current software charging algorithms are based, for antiquated reasons, on a four-hour rolling average peak. A zAAP provides no cost savings at all unless it reduces this peak - but how will the user determine that the Java execution reported by RMF as eligible for offloading actually impacts CPU consumption during those critical four hours of each month? Again, IBM's own z/OS Performance: Capacity Planning Considerations for zAAP Processors, pp 10, 11 illustrates the problem - even a compelling-looking hypothetical case only reduces required z/OS MSU capacity by roughly (measured from the charts) 10% or so. This, of course, will not reduce software bills by anything like 10%, since MSUs become much cheaper at the top end.
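Why the timing of the Java work matters can be seen from a small sketch of the rolling-average peak. The hourly MSU samples are invented for illustration, hourly granularity is a simplification, and the function name is hypothetical.

```python
from typing import Sequence


def rolling_peak(msu_samples: Sequence[float], window: int = 4) -> float:
    """Highest average over any `window` consecutive samples."""
    if len(msu_samples) < window:
        raise ValueError("need at least one full window of samples")
    window_sum = sum(msu_samples[:window])
    peak = window_sum / window
    for i in range(window, len(msu_samples)):
        # Slide the window forward by one sample.
        window_sum += msu_samples[i] - msu_samples[i - window]
        peak = max(peak, window_sum / window)
    return peak


def after_offload(total: Sequence[float], java: Sequence[float]) -> list:
    """z/OS MSUs left once the Java-eligible MSUs run on a zAAP."""
    return [t - j for t, j in zip(total, java)]


if __name__ == "__main__":
    total = [40, 40, 100, 100, 100, 100, 40, 40]      # invented hourly MSUs
    java_off_peak = [20, 20, 0, 0, 0, 0, 20, 20]      # Java runs outside the peak
    java_in_peak = [0, 0, 20, 20, 20, 20, 0, 0]       # Java runs during the peak

    print("peak before offload:  ", rolling_peak(total))
    print("Java off-peak offload:", rolling_peak(after_offload(total, java_off_peak)))
    print("Java in-peak offload: ", rolling_peak(after_offload(total, java_in_peak)))
```

In this invented data the same amount of Java work is offloaded in both cases, but only the in-peak case lowers the figure on which charges are based.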

And this hypothetical case illustrates a point made four years ago, when Variable Workload License Charges (VWLC) were introduced: VWLC turns the rationale for batch window optimisation on its head. For years, users have tried to shorten their batch windows by tuning to utilise all available resources. Now, fear of the four-hour rolling average is causing batch windows to be capped and thus extended - sometimes impacting the working day.

IBM has stated that a zAAP will not be liable for charges for IBM software. With the zAAP running at native speed on "kneecapped" systems, a major step has been taken to get mainframe execution costs more into line with other platforms. The ISVs, however, may not take the same view - some were reluctant to accept the IFL concept even with the presence of a solid microcode fence; an engine that intrudes directly into a z/OS LPAR, runs faster than the other engines, and is invisible to them might be a tougher pill for them to swallow. It seems likely that some ISVs will count zAAP MSUs towards the LPAR's capacity.

Especially if they write some of their software in Java.
