Date: 2024-08-29

The latest advances in the IBM Z mainframe provide an architecture to run AI models that enterprises could find helpful as they cobble together off-the-shelf components for their systems.

Next year, IBM plans to start offering the Z mainframe with the second version of the company's Telum processor and a new AI accelerator called Spyre. The chips will work together to power the AI applications of most Fortune 500 companies, including banks, insurers, retailers, carriers, and airlines.

AI applications on the mainframe must keep up with software processing 100,000 transactions per second, so IBM has designed the Telum II with eight AI accelerators, each capable of 24 trillion operations per second. That is four times the speed of the first Telum.
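As a rough, back-of-the-envelope illustration, the sketch below divides the quoted accelerator throughput by the quoted transaction rate. It assumes, purely for illustration, that peak throughput is fully usable and spread evenly across transactions. Real figures depend on model shape, precision and utilization, none of which are given here, and the function name is hypothetical.

```python
# Back-of-the-envelope arithmetic using only the figures quoted in the article.
# Assumption (not from IBM): peak throughput is fully usable and evenly
# divided among transactions.

TRANSACTIONS_PER_SECOND = 100_000
OPS_PER_SECOND_PER_ACCELERATOR = 24 * 10**12  # 24 trillion operations per second
ACCELERATORS = 8


def ops_budget_per_transaction(accelerators: int = ACCELERATORS) -> int:
    """Peak operations available per transaction across `accelerators`."""
    total_ops_per_second = accelerators * OPS_PER_SECOND_PER_ACCELERATOR
    return total_ops_per_second // TRANSACTIONS_PER_SECOND


if __name__ == "__main__":
    print(ops_budget_per_transaction(1))  # one accelerator
    print(ops_budget_per_transaction())   # all eight accelerators
```

Under those assumptions, a single accelerator has a peak budget of 240 million operations per transaction, which gives a sense of why small models are the natural fit for screening every transaction.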

To boost Telum performance, IBM offloads some tasks from the chip's CPU to a data processing unit. The DPU processes network and storage traffic so the CPU can handle mostly transactions and database queries.

The Spyre has 32 AI accelerators, so companies can use it to run AI models larger than those on the Telum II, Christian Jacobi, CTO for Z system architecture and design, said. The Spyre works concurrently with the Telum II to power the total AI system.

For example, a credit card company would use the AI accelerator in the Telum II chip to run a typical machine learning model of 100,000 to 1 million parameters to look for fraudulent transactions. Questionable transactions would then pass through a 100-million-parameter model that ranks the likelihood of fraud. A transaction with a high score would be rejected, while one with a low score would be processed.
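The two-stage flow in that example can be sketched in a few lines of Python. This is a minimal illustration, not IBM's implementation: the two scoring functions are hand-written stand-ins for the small and large models, and the thresholds, field names and function names are all hypothetical, since the article gives no numeric cutoffs.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical cutoffs; the article does not state any.
SCREEN_THRESHOLD = 0.5   # above this, the small model flags a transaction
REJECT_THRESHOLD = 0.8   # at or above this, the large model's score means reject


@dataclass(frozen=True)
class Transaction:
    amount: float
    foreign: bool


Scorer = Callable[[Transaction], float]


def small_model(tx: Transaction) -> float:
    """Stand-in for the small screening model: a cheap, coarse check."""
    return 0.9 if tx.amount > 1_000 or tx.foreign else 0.1


def large_model(tx: Transaction) -> float:
    """Stand-in for the larger ranking model: a finer-grained fraud score."""
    score = tx.amount / 10_000 + (0.3 if tx.foreign else 0.0)
    return min(1.0, score)


def decide(tx: Transaction,
           screen: Scorer = small_model,
           rank: Scorer = large_model) -> str:
    """Run the cheap model on everything; escalate only questionable cases."""
    if screen(tx) < SCREEN_THRESHOLD:
        return "processed"          # not questionable, no second pass
    if rank(tx) >= REJECT_THRESHOLD:
        return "rejected"           # high fraud score from the large model
    return "processed"              # questionable, but scored low


if __name__ == "__main__":
    for tx in (Transaction(50, False),
               Transaction(2_000, False),
               Transaction(9_000, True)):
        print(tx, decide(tx))
```

The point of the design is that the expensive model runs only on the fraction of traffic the cheap model flags, so the common path stays fast enough to keep up with the transaction rate.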

Another example is a healthcare provider using the mainframe for biopsy image analysis to avoid sending sensitive medical information to other systems, Jacobi said. The models can determine the type of cancer and its severity.

"They are categorization models, not generative (AI) models," Jacobi said. "There's a great use here of combining traditional models with the classification type of large language models."