“Packed Decimal Pachyderms”: that’s us. Mainframe Elephants: long memories, very versatile, and, until recently, mostly grey. But if the pachyderm reference is obscure, the packed decimal part is downright opaque to anyone beyond the circle of mainframe technical insiders.
That’s funny, because packed decimal is one expression of a strength that pervades the mainframe and helps make it the definitive platform for world-class business computing: its decimal orientation.
And it’s implicit in a lot of things we take for granted about the mainframe, beginning with our code page of choice. Did you know that the System/360 hardware architecture was designed to work with either EBCDIC or ASCII, but the folks who programmed OS/360 and the related original systems for the mainframe were so focused on creating the EBCDIC version that they kinda forgot to do the ASCII one? (OK, actually, IBM’s original implementation of ASCII predated the finished standard and didn’t match what ASCII became. IBM customers were happy with EBCDIC and didn’t need ASCII support, so it was quietly dropped.)
What makes EBCDIC so special? That fourth letter, the “D,” stands for “decimal,” and it points to a definitive and pervasive aspect of how the mainframe thinks. Take the acronym apart and you can see its archaeology. First there was Binary Coded Decimal (BCD), a six-bit encoding. Then it was extended by two more bits to eight (kind of like XA subsequently extended System/370 architecture). Then it was made an official character set for information interchange: Extended Binary Coded Decimal Interchange Code. And that character set is organized entirely around base 10, right down to the low-order nibble of every byte.
No, seriously. Half a byte is a nibble (you probably already knew that), and in EBCDIC the bottom nibble is special: viewed in hexadecimal (base 16), it reads as a decimal digit. The alphabetic letters and numerical digits all carry a value from zero to nine in their low-order nibble, even though that forces an interruption between I and J and between R and S where non-alphabetic characters lodge. That has allowed for some interesting processing over the years and could fuel many follow-on articles, but I’m going to shift our gaze instead to the last group of values in the character set.
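You don’t even need a mainframe to see this. Python happens to ship an EBCDIC codec (cp037 is one common code page; my choice of it here is purely for illustration), so a few lines make the nibbles visible:

```python
# Show that EBCDIC letters and digits carry a decimal value in their
# low-order nibble. "cp037" is Python's codec for one common EBCDIC
# code page; the letter and digit layout is the same in the classics.
for ch in "AIJRSZ09":
    code = ch.encode("cp037")[0]          # the one-byte EBCDIC code point
    print(f"{ch} -> X'{code:02X}'  low nibble: {code & 0x0F}")

# A -> X'C1'  low nibble: 1
# I -> X'C9'  low nibble: 9
# J -> X'D1'  low nibble: 1   (the hop from C9 to D1 is the I/J gap)
# R -> X'D9'  low nibble: 9
# S -> X'E2'  low nibble: 2   (and there's the R/S gap)
# Z -> X'E9'  low nibble: 9
# 0 -> X'F0'  low nibble: 0
# 9 -> X'F9'  low nibble: 9
```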
Yeah: last. In ASCII, the digits come before the uppercase alphabet, which comes before the lowercase alphabet (each group occupying contiguous values), and they all sit under 128, or 80 in hexadecimal. But in EBCDIC, most of the interesting display characters have values above 80 hex, beginning with the lowercase alphabet (81 through A9), then the uppercase alphabet (C1 through E9) and finally the numerical digits (F0 through F9).
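That ordering isn’t just trivia; it decides how data collates. Here’s a minimal sketch of the difference, again leaning on Python’s cp037 codec (the sample strings are arbitrary):

```python
# The same strings sort differently under ASCII and EBCDIC byte order.
words = ["9ball", "Queen", "queen"]
print(sorted(words))                                   # ASCII byte order
print(sorted(words, key=lambda s: s.encode("cp037")))  # EBCDIC byte order

# ASCII:  ['9ball', 'Queen', 'queen']   digits, then upper, then lower
# EBCDIC: ['queen', 'Queen', '9ball']   lower, then upper, then digits
```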
Oh, those digits. You can really count on them. In fact, you can do math directly on the EBCDIC versions of them if you wish. Or you can squeeze them into half the space by packing them, that is, removing the “F” from the top half of each digit and pairing adjacent digits together into a single byte. So F1F9 packs into 19 in hexadecimal. Yes, a hexadecimal two-digit value that looks like a two-digit decimal value. (On real hardware, the PACK instruction also tucks a sign nibble onto the right end, but the principle is the same.)
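Here’s a little sketch of the packing idea in Python, including the trailing sign nibble the hardware uses (0xC for positive, 0xD for negative). The function names are my own invention, not any mainframe API:

```python
def pack_decimal(value: int) -> bytes:
    """Pack an integer into S/360-style packed decimal bytes."""
    sign = 0xD if value < 0 else 0xC
    nibbles = [int(d) for d in str(abs(value))] + [sign]
    if len(nibbles) % 2:                  # pad on the left to whole bytes
        nibbles.insert(0, 0)
    return bytes((nibbles[i] << 4) | nibbles[i + 1]
                 for i in range(0, len(nibbles), 2))

def unpack_decimal(packed: bytes) -> int:
    """Reverse the packing: a digit in every nibble, sign in the last."""
    nibbles = [n for b in packed for n in (b >> 4, b & 0x0F)]
    *digits, sign = nibbles
    value = int("".join(map(str, digits)))
    return -value if sign == 0xD else value

print(pack_decimal(19).hex())                  # 019c: digits 1, 9, sign C
print(unpack_decimal(bytes.fromhex("019c")))   # 19
```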
What the 1111? That means a byte can hold only 100 useful values rather than the 256 that pure binary allows. Who can justify such inefficiency? ASCII and consumer-electronics commodity computing sure can’t.
But business value says that the consistency and integrity of having decimal arithmetic “all the way down” through the programming language (e.g. COBOL) and straight into the hardware architecture means never having to say “oops” when doing financial calculations. Your pennies are safe from becoming artifacts of floating point conversions between stored binary and displayed decimal. And those pennies add up quickly at a trillion transactions per day.
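The “oops” is easy to demonstrate. Binary floating point can’t represent a dime exactly, while decimal arithmetic (here via Python’s decimal module, standing in for what decimal-native languages like COBOL give you directly) keeps every penny:

```python
from decimal import Decimal

# Binary floating point: a tenth is not exactly representable.
print(0.10 + 0.20)                        # 0.30000000000000004
print(sum([0.10] * 10) == 1.00)           # False: ten dimes miss a dollar

# Decimal arithmetic: the pennies stay put.
print(Decimal("0.10") + Decimal("0.20"))  # 0.30
print(sum([Decimal("0.10")] * 10) == 1)   # True
```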
So, way back in the misty past, when the decision was made to respect decimal arithmetic on the mainframe, a legacy of decimal integrity was created, even at the sacrifice of the much larger numeric range that the same number of bytes could hold in pure binary. And that decision is not just inherent in the way the architecture, and the languages optimized for it, handle basic decimal data storage. It set off a half-century-plus journey of discovering every way that decimal math could be optimized for business purposes, leading to two of the greatest modern innovations on IBM Z: the IEEE 754 decimal floating point instruction set and the modern Vector Decimal instruction set. (DFP, as it’s known, is a hardware implementation of the Rexx decimal arithmetic invented by the King of Rexx, Mike Cowlishaw.)
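Fittingly, Python’s decimal module is a software implementation of that same Cowlishaw specification (IBM’s General Decimal Arithmetic, the basis of IEEE 754 decimal), so it’s a handy way to poke at DFP-style semantics without z hardware:

```python
from decimal import Decimal, getcontext

# DFP-style behavior: precision is counted in decimal digits, and
# results round at a decimal digit boundary, not a binary one.
getcontext().prec = 16                 # decimal64 carries 16 digits
print(Decimal(1) / Decimal(3))         # 0.3333333333333333 (16 digits)

# Trailing zeros are significant: 1.10 and 1.1 are equal in value
# but keep distinct exponents, just as in IEEE 754 decimal formats.
print(Decimal("1.10") == Decimal("1.1"))   # True
print(Decimal("1.10"), Decimal("1.1"))     # 1.10 1.1
```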
By moving the processing of this most important business data, at the largest scale, directly into hardware instructions, IBM has optimized financial (and related) decimal data processing by orders of magnitude. In fact, it’s one of the main reasons organizations use IBM’s ABO (Automatic Binary Optimizer) to “recompile” their load modules, gaining modern compiler and architecture advances without spending any CPU on the source compilation process, often just to get those amazing hardware decimal math optimizations.
Interestingly, this handy legacy advantage hearkens back to the earliest pre-electronic advances in business math: base 10 works for people and the business we do, and it’s no mere prestidigitation. Count on it!