The explosion of new AI chips is buoyed in part by the tremendous hype around AI. But this new class of emerging hardware -- purpose-built for the computationally intense algorithms used in deep learning -- also promises to address real problems for CIOs embarking on AI initiatives.
High-end AI chips of this ilk will enable the development of special-purpose deep learning infrastructure. Low-end AI chips are being baked into mobile phones from Apple and Google and into internet of things (IoT) devices.
However, these new AI chips will have to compete with CPUs, GPUs and, to some extent, field programmable gate arrays (FPGAs), the integrated circuits designed to be customized after manufacturing -- all of which will continue to be widely used. And even as the new AI chips make good on the promise to deliver better AI solutions faster, they also present new development and IT infrastructure challenges for CIOs.
"AI is bringing chip manufacturers out into the forefront," said Will Wise, vice president of security events at the International Security Conference and Exposition. "Years ago, the chips were behind the scenes, but now the decision on the chip maker is a big factor."
The excitement around these AI chips is focused on the efficiency and speed they bring to neural networks, a deep learning approach that aims to mimic the structure and function of the human brain. But a variety of other AI and machine learning techniques won't necessarily benefit from these new chips. There are also development, performance and infrastructure cost trade-offs among CPUs, GPUs and FPGAs to consider before latching onto these new AI chips.
The call to action? CIOs should start by assessing their specific AI problems before adopting a chip architecture and deep learning approach.
A view of the AI chip landscape
Several well-funded AI chip startups are players in this field. They include Cerebras, Graphcore, Cambricon Technologies, Horizon Robotics and Nervana, which was recently acquired by Intel. Google's tensor processing unit (TPU) chip is the only one commercially available to enterprises, and then only in the cloud for back-end AI processing.
On the mobile and IoT side, AI chips are being baked into Apple's new phones, Google's Pixel and embeddable modules from Nvidia, Qualcomm and Intel. This enables mobile devices to perform AI functions locally, without sending data to a cloud for processing. These on-device capabilities allow for mobile face detection, image detection, image tagging and real-time video processing. Building good AI models for these chips on the back end has more to do with the AI frameworks and programming platforms, like TensorFlow, Nvidia's CUDA or OpenCL, than with particular chip types.
Convolutional neural nets come of age
"Improvements in AI chip hardware have generally been focused on allowing a larger number of the calculations critical to training AI systems to be performed in parallel, allowing training to complete faster," said Robert Lee, chief architect with FlashBlade at Pure Storage Inc., a data storage business.
These kinds of parallel processing techniques are well suited for training the neural networks used in deep learning. Training is an iterative process that allows data science teams to refine and home in on the right models for a particular AI application. These models can then be deployed to cheaper services or to mobile and IoT devices.
The new class of AI chips helps speed up the computation-heavy portions of this training process. Better chip performance brings three benefits: faster convergence on an accurate model, an improved ability to develop new, more accurate models by covering a larger search space, and the ability to train with larger data sets within the same time frames.
Over the long term, big improvements in the computational capacity of this hardware will change an IT shop's deep learning approach, ultimately allowing for training more complex models on larger data sets. For example, in the autonomous driving space some practitioners are shifting training from 256-pixel to HD resolution images, thanks to the processing power of the new AI chips.
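To give a rough sense of why higher-resolution training images demand more processing power, the sketch below counts the raw values in a single uncompressed image. The article does not say which "HD" resolution practitioners use, so the 720p and 1080p figures, the square 256-by-256 image and the three color channels are all assumptions made for illustration.

```python
# Rough input-size comparison. The resolutions and channel count are
# illustrative assumptions, not figures reported in the article.

def values_per_image(width, height, channels=3):
    """Number of numeric values in one uncompressed image."""
    return width * height * channels

small = values_per_image(256, 256)        # assumed square 256-pixel image
hd_720 = values_per_image(1280, 720)      # assumed "HD" = 720p
hd_1080 = values_per_image(1920, 1080)    # assumed "HD" = 1080p

print(small, hd_720, hd_1080)
# Ratio of HD input size to the 256-pixel input size
print(round(hd_720 / small, 1), round(hd_1080 / small, 1))
```

Under these assumptions, each HD image carries roughly 14 to 32 times as many input values as a 256-pixel image, before counting any of the network's own computation.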
Much of the current excitement in this field of AI can be traced to the 2012 AlexNet paper that introduced a scalable approach to implementing neural networks across distributed hardware. "In many ways, AlexNet didn't introduce any new ideas," Lee said. "Convolutional neural networks had been studied and known for years. What was new was access to much faster hardware coupled with larger (and labelled) data sets, which made it feasible to exploit these techniques."
How AI chips might trim cost
The biggest impact of the new AI chips is the potential for cost savings. "With GPUs, we're paying for a high level of precision that we just don't need for neural networks," said Kenny Daniel, CTO of Algorithmia Inc., an open marketplace for algorithms. The new chips use lower numerical precision and are, therefore, more power efficient. "This power efficiency is going to save a lot of money in the long run," Daniel said. "Because it is new technology, however, enterprises will need to experiment and work with leading professionals to develop those additional revenue streams."
Over the past year, the cost of GPUs has increased because of a surge in demand. Indeed, much of that demand comes from machine learning and deep learning workloads, which FPGAs and AI chips can also handle. But neural networks don't require the level of precision that graphics do. As a result, an 8-bit AI chip could perform just as well as a 16- to 32-bit GPU, resulting in cost savings, Daniel said.
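The precision argument can be illustrated with a small sketch. The code below is not how any particular AI chip works; it is a minimal example of symmetric 8-bit quantization, one common way to approximate floating-point arithmetic with small integers. The function names, the random data and the single shared scale factor are assumptions made for illustration.

```python
import random

def quantize(values, bits=8):
    """Map floats to signed integers using one shared scale (symmetric quantization)."""
    qmax = 2 ** (bits - 1) - 1                      # 127 for 8 bits
    scale = max(abs(v) for v in values) / qmax
    return [round(v / scale) for v in values], scale

def dot(a, b):
    """Multiply-accumulate, the core operation in a neural network layer."""
    return sum(x * y for x, y in zip(a, b))

random.seed(0)
weights = [random.uniform(-1, 1) for _ in range(1000)]   # stand-in for layer weights
inputs = [random.uniform(-1, 1) for _ in range(1000)]    # stand-in for activations

q_w, s_w = quantize(weights)
q_x, s_x = quantize(inputs)

exact = dot(weights, inputs)              # floating-point reference
approx = dot(q_w, q_x) * s_w * s_x        # integer multiply-accumulate, rescaled once

print(exact, approx, abs(exact - approx))
```

On random data like this, the rescaled integer result should land close to the floating-point one, which is the intuition behind Daniel's claim. Whether a real model tolerates 8-bit arithmetic depends on the model and has to be measured.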
Still a role for CPUs in deep learning and AI
It is important for CIOs to start with a problem, find the best algorithms and then choose the appropriate hardware. Otherwise, they risk running less-efficient algorithms faster, said Rix Ryskamp, CEO of UseAIble, which develops customized AI algorithms.
For example, UseAIble reaped a 1,000-fold performance improvement by looking beyond traditional neural networks for its common business optimization problems. "Anything that is crunching raw mathematics will run faster on a GPU or any tensor processing chip," Ryskamp said. "Other algorithms such as genetic algorithms don't gain much from the mathematical processing in GPUs."
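A toy genetic algorithm helps show what Ryskamp means. The sketch below is not UseAIble's method; it is a minimal textbook-style example on an assumed toy objective (maximize the number of 1s in a bit string), with hypothetical parameter values. Most of the work is sorting, sampling and branching rather than the dense matrix arithmetic that GPUs and tensor chips accelerate.

```python
import random

def fitness(bits):
    """Toy objective: count of 1s in the bit string."""
    return sum(bits)

def evolve(pop_size=30, length=20, generations=60, mutation_rate=0.02, seed=1):
    rng = random.Random(seed)
    # Start from a random population of bit strings
    pop = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]              # selection: keep the fitter half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, length)          # single-point crossover
            child = a[:cut] + b[cut:]
            # Mutation: flip each bit with a small probability
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)

best = evolve()
print(fitness(best), best)
```

In practice the fitness evaluation can sometimes be run in parallel, so this is an illustration of the general point rather than a rule that genetic algorithms never benefit from specialized hardware.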
And some AI problems benefit more from faster development cycle times than from using better AI chips. Deep learning-based AI is computationally expensive. Traditional machine learning models are not. "We don't even need to train those models in cloud or local GPU," said Sid J. Reddy, chief scientist at Conversica, which develops AI software for marketing and sales. "Laptops work fine for most startup companies, and distributed computing platforms like Spark and Hadoop can help scale this better."
GPUs show growing promise
GPUs are generating a lot of interest, but some IT executives cited a disparity in the deep learning and AI tooling offered by different GPU vendors. "Nvidia is the only game in town today," said Daniel Kobran, co-founder of Paperspace, a machine learning tools provider. "Not only are FPGAs/ASICs (application-specific integrated circuits) difficult to use, GPUs other than Nvidia GPUs are difficult to leverage as well." Nvidia has invested considerable effort in developing the CUDA/cuDNN libraries that underpin most AI tools. It has also invested over $3 billion in developing the Volta chip, which can be dynamically reprogrammed to gain many of the benefits of an FPGA with fewer technical challenges.
Frameworks like OpenCL promise to bring deep learning to other chips, but they are not as flexible as CUDA/cuDNN yet, Kobran said, adding that he expects this to change as frameworks like TensorFlow become more hardware agnostic.
FPGAs may provide an interim step
As enterprises await the wider availability of neuromorphic chips, some CIOs expect to use FPGAs to test the performance and power efficiencies of GPUs versus neuromorphic chips. "FPGAs provide the ability for software to program the layout of a chip, so that you don't require manufacturing them," said Eric Hewitt, VP of technology at Understory, a weather prediction service. Although they may run a bit slower than new neuromorphic chips, FPGAs provide immense flexibility to iterate, test and learn before settling on a new chip design.
Indeed, Hewitt said Understory has decided against using AI chips until the technology matures. However, he added that FPGAs are currently a good fit for highly parallel, deterministic processing only when the algorithms are well established. CPUs and GPUs are a better fit when Understory is experimenting with an algorithm's parameters, because the tools are easier to work with. But that could change as the tooling around FPGAs improves, Hewitt said. "FPGA tools are improving all of the time -- there are even Python libraries there."
Finally, CIOs may find that building internal AI capabilities should start with the human side of the process before choosing a particular chip.
"It's good to experiment with CPUs first," said Sreekar Krishna, managing director of data and analytics at KPMG.
GPUs can then be used to improve processing speed, but GPUs will also require software engineers and data scientists who know how to work with them. And a transition to FPGAs will require hardware engineers to get the most from AI initiatives. But the talent equation could change soon, experts said. As AI chips are improved to support frameworks like TensorFlow, it will become much easier for organizations of all sizes to jump on the latest chip without internal hardware expertise, Krishna said.