IBM POWER6[1] processor-based servers and Sun/Fujitsu’s SPARC64-based M9000 servers have[2] something called Capacity On Demand (COD)[3]. These machines have spare resources, namely CPUs and memory banks, that lie dormant until the situation warrants using them. When more memory is needed, the spare RAM banks pitch in. When more CPUs would help deal with the load, the piper is paid and voilà, the hot-spare CPUs join the processing army, giving “instant” capacity. Capacity can be reduced as well, leading to potential savings. The basic premise of COD is that you pay for what you need, no more and no less. Upgrades and downgrades happen without putting anything in or taking anything out. Capacity planning magically happens completely in software, potentially on autopilot.
The magic that turns resources on is called a hardware license, a code of some kind (similar to software license key codes) that makes the upgrade request legitimate. Like a software license, a hardware license can be made to expire. Unlike most software licenses, a hardware license may count only the time during which the machine is powered on. In other words, the hardware license is not used up while the machine is powered off.
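The powered-on-only accounting described above can be pictured with a toy model. This is a minimal sketch, not how IBM or Sun actually implement hardware licenses: the class name `HardwareLicense`, its fields, and the hour-based bookkeeping are all assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class HardwareLicense:
    """Hypothetical COD license that is consumed only while the machine is on."""
    licensed_hours: float      # total powered-on hours the license grants
    used_hours: float = 0.0    # powered-on hours consumed so far

    def record(self, hours: float, powered_on: bool) -> None:
        """Account for an interval; powered-off time is not charged."""
        if hours < 0:
            raise ValueError("hours must be non-negative")
        if powered_on:
            self.used_hours += hours

    @property
    def remaining_hours(self) -> float:
        return max(self.licensed_hours - self.used_hours, 0.0)

    @property
    def expired(self) -> bool:
        return self.used_hours >= self.licensed_hours


lic = HardwareLicense(licensed_hours=720)  # assumed figure: a 30-day license
lic.record(100, powered_on=True)           # 100 hours are charged
lic.record(500, powered_on=False)          # machine off: nothing is charged
print(lic.remaining_hours, lic.expired)
```

The only design point worth noting is that expiry depends on `used_hours` rather than on wall-clock time, which is what separates this from a typical date-based software license.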
Of course, this leads to interesting situations for software licensing. If a database engine is running on a per-processor licensing scheme and more CPUs are brought online, the software may end up running on more processors than it is licensed for. Without some way to quickly acquire the software licenses needed, the hardware upgrades will not work very well. Software licensing perhaps needs[4] new equivalents to hardware licensing on COD big iron.
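The per-processor licensing problem comes down to simple arithmetic, sketched below. The function names and the numbers in the example are hypothetical; no real database vendor's licensing API is being modeled.

```python
def license_shortfall(active_cpus: int, licensed_cpus: int) -> int:
    """Return how many extra per-processor licenses are needed (0 if compliant)."""
    if active_cpus < 0 or licensed_cpus < 0:
        raise ValueError("CPU counts must be non-negative")
    return max(active_cpus - licensed_cpus, 0)


def shortfall_after_activation(active_cpus: int, licensed_cpus: int, extra_cpus: int) -> int:
    """Shortfall that would result if COD brought `extra_cpus` more CPUs online."""
    return license_shortfall(active_cpus + extra_cpus, licensed_cpus)


# Assumed scenario: a database licensed for 8 CPUs, currently running on 8,
# and COD is about to activate 4 hot-spare CPUs.
before = license_shortfall(active_cpus=8, licensed_cpus=8)
after = shortfall_after_activation(active_cpus=8, licensed_cpus=8, extra_cpus=4)
print(before, after)
```

A COD controller that ran a check like this before activating CPUs could at least flag the gap, even if it has no way to buy the missing licenses on the spot.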
The concept of COD probably evolved from designing redundant, fault-tolerant systems with hot-spare resources that kick in to replace a faulty CPU or memory bank, which raises the question: how does all this work?
Bringing in a hot-spare CPU would generally require transferring the entire CPU state to another CPU with operating system coordination. In POWER6, this is achieved by a new feature that IBM calls Instruction Retry and Recovery[5] (IRR). The part of the CPU that deals with recovery functionality is called the Recovery Unit (RU). The RU enables the CPU to checkpoint[6] at appropriate instruction boundaries so it can recover from soft errors while executing the next group of instructions. Hot-sparing works by extracting the faulty CPU’s last checkpoint, verifying that the checkpoint data is valid by looking at error-reporting registers, and then loading the checkpoint onto the hot spare. All of this happens in hypervisor firmware logic, unbeknownst to the OS.
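The extract, verify, and load sequence can be sketched in a few lines. This is a toy software model under loud assumptions, not IBM's firmware: the `Cpu` and `Checkpoint` classes, the register names, and the single `checkpoint_corrupt` flag (standing in for the error-reporting registers) are all invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Checkpoint:
    registers: Dict[str, int]  # software-visible register contents


@dataclass
class Cpu:
    name: str
    registers: Dict[str, int] = field(default_factory=dict)
    checkpoint: Optional[Checkpoint] = None
    checkpoint_corrupt: bool = False  # hypothetical error-reporting bit

    def take_checkpoint(self) -> None:
        # Copy the registers so later writes cannot alter the checkpoint.
        self.checkpoint = Checkpoint(dict(self.registers))


def fail_over(faulty: Cpu, spare: Cpu) -> bool:
    """Load the faulty CPU's last valid checkpoint onto the spare.

    Returns True on success, False if there is no usable checkpoint.
    """
    cp = faulty.checkpoint                      # 1. extract the last checkpoint
    if cp is None or faulty.checkpoint_corrupt:
        return False                            # 2. verification failed
    spare.registers = dict(cp.registers)        # 3. load it onto the hot spare
    spare.checkpoint = Checkpoint(dict(cp.registers))
    return True


cpu0 = Cpu("cpu0", {"pc": 100, "r1": 7})
cpu0.take_checkpoint()
cpu0.registers["r1"] = 999                      # state damaged after the checkpoint
spare = Cpu("spare")
ok = fail_over(cpu0, spare)
print(ok, spare.registers)
```

The spare resumes from the checkpointed state rather than from the damaged live state, which is why the work done since the last checkpoint has to be re-executed.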
SPARC64 also seems to have similar backward error recovery mechanisms. History circuitry keeps a record of all CPU operations, and all ALU registers have parity checking. When an error is detected during execution of an instruction, Instruction Retry kicks in. If a number of retries fail, the OS (which in the case of the M9000 is Solaris 10) is notified of the error. Dynamic Reconfiguration, the process of assigning a hot spare, presumably[7] takes off from there with OS involvement.
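The retry-then-escalate behavior can be modeled as a bounded retry loop. This is only a sketch of the control flow as described above; the `SoftError` exception, the `notify_os` callback, and the retry limit of 3 are assumptions, since the actual retry count and notification path are not given here.

```python
from typing import Callable, List, Optional


class SoftError(Exception):
    """Hypothetical stand-in for an error detected during an instruction."""


def execute_with_retry(
    instruction: Callable[[], int],
    max_retries: int,
    notify_os: Callable[[SoftError], None],
) -> Optional[int]:
    """Run `instruction`, retrying on SoftError; tell the OS if retries run out."""
    retries = 0
    while True:
        try:
            return instruction()
        except SoftError as err:
            if retries >= max_retries:
                notify_os(err)   # hardware gives up; the OS takes over
                return None
            retries += 1


def make_flaky(failures: int, result: int) -> Callable[[], int]:
    """Build an instruction that fails `failures` times, then returns `result`."""
    remaining = [failures]

    def instruction() -> int:
        if remaining[0] > 0:
            remaining[0] -= 1
            raise SoftError("parity error")
        return result

    return instruction


os_log: List[SoftError] = []
transient = execute_with_retry(make_flaky(2, 42), max_retries=3, notify_os=os_log.append)
persistent = execute_with_retry(make_flaky(10, 42), max_retries=3, notify_os=os_log.append)
print(transient, persistent, len(os_log))
```

A transient fault is absorbed without the OS ever hearing about it, while a persistent one surfaces exactly once, at which point something like Dynamic Reconfiguration would presumably take over.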
Happy new year, everyone!
[1] POWER6 is unusual among RISC processors in that it runs at clock speeds of 5 GHz+. IBM figured out how to address the power leakage problems associated with high CPU clock rates; Intel and AMD could not. The fastest clock for an Intel processor today, according to intel.com, is 3.2 GHz. Go here to read about what happens when a POWER6 processor is hit with proton and neutron beams!
[2] HP RP8400 servers have Capacity On Demand features as well.
[3] Introduction to Dynamic Reconfiguration and Capacity On Demand for Sun SPARC Enterprise servers
[4] Not considering SaaS-style licensing.
[5] United States Patent 7467325, granted to IBM (12/16/2008).
[6] A checkpoint refers to a known good state of the CPU, containing the contents of all registers visible to software. One checkpoint is maintained per POWER6 CPU. Checkpoints are not taken on every instruction but rather after a group of instructions.
[7] This is my best guess. I could not find definitive technical documentation on how CPU hot-sparing and Dynamic Reconfiguration work with the SPARC64 VII/VI processors in the M9000. An interesting difference in SPARC V (vis-à-vis IBM POWER6) is that checkpointing happens at every instruction boundary. If Fujitsu did the same on the SPARC64 VII/VI, then with all the extensive error-checking circuitry in these CPUs, and going by the benchmarks, this is one reliable and performant monster.