Many companies with large data workloads want to employ machine learning (ML) and artificial intelligence (AI) to gain insights into their outcomes and improve business operations. In recent years, enterprises have trained AI models with graphics processing units (GPUs), but challenges arise when applying those models to production workloads. For instance, when embedding AI inferencing into an existing enterprise transaction, organizations can find that the transaction slows down because the inferencing that influences its outcome is invoked off the IBM Z platform.
IBM announced the IBM Telum Processor in August 2021. The processor has a redesigned cache infrastructure, coupled with performance optimizations for new and traditional workloads, and a new accelerator for AI. Dr. Christian Jacobi, Distinguished Engineer at IBM Systems Z Hardware Development and chief architect of the processor, says that he is excited about the possibilities of this chip as it becomes the central processor chip for the next-generation IBM Z and LinuxONE systems.
Businesses looking to move from fraud detection to fraud prevention could find the Telum processor a big asset. The chip's "real-time capability is crucial, for example, if clients want to use fraud models and stop fraudulent transactions from completing rather than dealing with fraud after the fact," he says. "Embedding AI directly into their real-time transactions can help clients improve their businesses by preventing fraud, improving customer retention, or bolstering the efficiency of their IT."
As described at Hot Chips in August 2021, the microprocessor contains 8 processor cores running at over 5 GHz, each supported by a redesigned 32 MB private level-2 cache. The level-2 caches interact to form a 256 MB virtual level-3 cache and a 2 GB level-4 cache. IBM says that, along with improvements to the processor core itself, the 1.5x growth in cache per core over the z15 generation is designed to enable a significant increase in both per-thread performance and the total capacity IBM can deliver in the next-generation IBM Z system.
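The cache arithmetic above can be checked directly. A minimal sketch: the per-core and aggregate sizes come from the article, while the number of chips pooling into the 2 GB level-4 cache is an assumption inferred from the arithmetic, not something the article states.

```python
# Figures from the article: 8 cores, each with a 32 MB private level-2 cache.
cores_per_chip = 8
l2_per_core_mb = 32

# The level-2 caches combine into a virtual level-3 cache.
virtual_l3_mb = cores_per_chip * l2_per_core_mb
print(virtual_l3_mb)  # 256 (MB), matching the 256 MB virtual level-3 figure

# Assumption (not in the article): eight chips pool their virtual level-3
# caches to reach the quoted 2 GB level-4 cache.
chips_sharing_l4 = 8
virtual_l4_mb = chips_sharing_l4 * virtual_l3_mb
print(virtual_l4_mb)  # 2048 (MB) = 2 GB
```

The numbers line up exactly, which is why the level-3 and level-4 caches are described as "virtual": they are aggregated from the per-core level-2 caches rather than built as separate physical arrays.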
Reduced AI Inference Latency
Telum is optimized, Dr. Jacobi says, to run a mix of typical enterprise workloads like database and transaction logic, as well as AI inferencing work. To address AI inference latency, IBM designed a centralized accelerator on the chip that all cores have access to, he adds.
"Instead of spreading the AI capabilities like peanut butter over all the cores, IBM aggregated that computing capacity in one place, and each core can use the entire capacity whenever its workload branches from traditional work into AI work," Dr. Jacobi explains. This design minimizes AI inferencing time and lets clients embed AI directly into their transactions without breaking transaction response times. "With more than 6 trillion floating point ops (TFLOPs) per chip and more than 200 TFLOPs per system, we have ample computing capacity to enrich every transaction with AI," he says.
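Dr. Jacobi's throughput figures can be turned into a rough per-transaction budget. This is a back-of-the-envelope sketch: the TFLOP values are the article's "more than" figures treated as round lower bounds, and the transaction rate is a purely hypothetical workload assumption.

```python
# "More than" figures quoted in the article, treated as round lower bounds.
tflops_per_chip = 6.0
tflops_per_system = 200.0

# The system figure implies roughly this many chips' worth of AI capacity.
implied_chips = tflops_per_system / tflops_per_chip
print(round(implied_chips))  # ~33

# Hypothetical workload assumption: 100,000 transactions/second system-wide.
tx_per_second = 100_000
flops_per_tx = (tflops_per_system * 1e12) / tx_per_second
print(flops_per_tx)  # about 2 GFLOP of inference budget per transaction
```

Even under that aggressive hypothetical transaction rate, each transaction would have on the order of a billion floating-point operations available for inference, which is the sense in which the capacity is "ample" for enriching every transaction with AI.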
Telum and Security
According to Dr. Jacobi, "Telum can enable clients to grow their workloads and react dynamically to extreme workload demand spikes, such as the exceptional trading days of 2021. Moreover, IBM has invested in availability and reliability since we know how critical these systems are to our clients' businesses." Because security is critical to any business in these days of increased ransomware attacks and data breaches, Telum's crypto accelerators have been re-optimized within the context of the new cache design.
Telum also has physical memory protection that transparently encrypts all data leaving the chip for main memory, he adds. The improved performance designed into the Secure Execution environment enables systems to run containerized workloads in a way where the hardware helps ensure the confidentiality and integrity of the container workload.
Dr. Jacobi says it took six years to design the Telum chip, but "there's more innovation coming at the system level as we integrate the chip into the complete zNext system stack." He adds that IBM is already investigating future chip designs. Design objectives for next-generation chips include improved AI capabilities, stronger security and cryptography, and further enhancements to performance and capacity. "The roadmap for future chip development is exciting."