
GPU cloud tools take complexity out of machine learning infrastructure

While talk of AI on GPUs abounds, actually building machine learning infrastructure remains a dark art. A startup's PaaS aims to automate parts of the process.

Deep learning and machine learning are ushering in new waves of innovative hardware, but getting jobs running on such advanced infrastructure is not easy. Despite their high hopes, teams can burn up a lot of development time trying to move their jobs to the latest GPU or TPU machine learning infrastructure.

This difficulty is driving interest in data science platforms that automate the work of packing libraries and other software elements into containers that run reliably on machine learning infrastructure in a GPU cloud.

Cloud vendors, Hadoop distribution providers, machine learning specialists and others have been rolling out such software. Among the startups chasing the problem is PaaS specialist Paperspace Co. The Brooklyn, N.Y.-based company has been enhancing its software to enable a wider group of developers to implement highly iterative, neural network-based AI workloads on GPUs and other hardware.

Job runner to the cloud

Paperspace launched its Gradient tool suite earlier this year to train neural networks and run GPU cloud jobs. The suite includes a job runner that lets developers send local jobs to the cloud for processing, with support for containers and Jupyter notebooks.
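
For a sense of what such a job runner handles, here is a minimal sketch of the kind of self-contained training script a developer might hand off to a GPU cloud. The file name, environment variable and toy model are illustrative assumptions, not Paperspace's actual conventions.

```python
# train.py: a toy, self-contained training entry point of the sort a cloud
# job runner ships off to a GPU machine. Settings come from environment
# variables so the same file runs unchanged on a laptop or inside a cloud
# container. Names here are illustrative assumptions, not Gradient's API.
import os

import torch
import torch.nn as nn


def main():
    epochs = int(os.environ.get("EPOCHS", "2"))
    device = "cuda" if torch.cuda.is_available() else "cpu"

    # Stand-in model and random data; a real job would load its own dataset.
    model = nn.Linear(32, 1).to(device)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
    loss_fn = nn.MSELoss()

    x = torch.randn(256, 32, device=device)
    y = torch.randn(256, 1, device=device)

    for epoch in range(epochs):
        optimizer.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        optimizer.step()
        print(f"epoch {epoch}: loss {loss.item():.4f}")


if __name__ == "__main__":
    main()
```

Broadly, a job runner's contribution is everything around a script like this: packaging it and its dependencies into a container, running it on a remote GPU and returning the results.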

GPUs can provide a type of parallel computing that wasn't readily available in the past. The chips, which provide high memory bandwidth for handling large data sets, are becoming widely adopted for training deep learning neural nets, according to Dillon Erb, co-founder and CEO of Paperspace. But it is a brave new world for nearly everyone, including software developers.
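
The appeal is easy to demonstrate. The sketch below, which assumes PyTorch and a CUDA-capable GPU, times the same large matrix multiply -- a core operation in neural network training -- on the CPU and then on the GPU. The exact numbers depend on the hardware, but the gap is typically large.

```python
# Rough illustration of GPU parallelism: time one large matrix multiply on
# the CPU, then on the GPU. This is a sketch, not a careful benchmark;
# results depend entirely on the hardware at hand.
import time

import torch

size = 4096
a = torch.randn(size, size)
b = torch.randn(size, size)

start = time.time()
_ = a @ b  # runs on the CPU
print(f"CPU: {time.time() - start:.3f}s")

if torch.cuda.is_available():
    a_gpu, b_gpu = a.cuda(), b.cuda()
    torch.cuda.synchronize()  # make sure the transfer has finished
    start = time.time()
    _ = a_gpu @ b_gpu  # the same multiply, spread across thousands of GPU cores
    torch.cuda.synchronize()  # wait for the kernel to complete before timing
    print(f"GPU: {time.time() - start:.3f}s")
else:
    print("No CUDA device found; GPU timing skipped.")
```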

"We may be entering a golden era of hardware, but it means taking on a lot of infrastructure management," Erb said.

GPUs are just part of the complexity to come. Erb expects the range of heterogeneous hardware options to grow over the next couple of years, which he sees as another good reason to adopt an architecture that abstracts the hardware away.

"Today, people interested in machine learning can spend enormous amounts of time building the pieces and keeping them together," Erb said, as he detailed Paperspace's commitment to find ways to build out that infrastructure as part of a cloud service.

One medical doctor and technologist credits the Paperspace platform for easing the work required to run machine learning jobs on a GPU cloud.

"Our team of engineers had used a bunch of different platforms. We found Gradient could get us started with good results in a fraction of the time the others took," said Dr. Chris Morley, co-founder and COO of MediVis, a New York-based startup working to bring the capabilities of augmented reality into diagnostics and planning for medical operations.

Morley said Gradient consolidates a number of steps related to setting up machine learning infrastructure.

"It automates things that drain developer time, such as handling dependencies of different AI libraries and their different versioning," Morley said. "There are so many moving parts to projects like this."

Machine learning tools, GPUs in future of medicine

Morley's team could use such help, as the goals of MediVis are dramatic. The company has spent about two years working with augmented reality technology to create a front-end system that renders holographic representations of MRI and CT data, enabling surgeons and diagnosticians to explore digital representations of patients' bodies and pathologies.

MediVis is following up that effort with back-end work to apply AI methods to the detection of tumors. The goal is to automatically segment and classify MRI and CT data to identify tumors and other irregularities. That calls for powerful processing.
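
For readers less familiar with the workload, the toy sketch below shows what voxel-wise segmentation of volumetric scan data looks like in code: a small 3D convolutional network maps a CT- or MRI-like volume to per-voxel tumor probabilities. It runs on random data and is purely illustrative; it is not MediVis' model or pipeline.

```python
# A deliberately minimal sketch of voxel-wise segmentation on volumetric
# (MRI/CT-like) data. A toy 3D convolutional network produces a per-voxel
# tumor probability, which is thresholded into a binary mask. Illustrative
# only; not MediVis' actual model or training pipeline.
import torch
import torch.nn as nn


class TinySegNet(nn.Module):
    """Maps a single-channel 3D volume to a per-voxel probability map."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv3d(1, 8, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv3d(8, 8, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv3d(8, 1, kernel_size=1),  # per-voxel logit
        )

    def forward(self, volume):
        return torch.sigmoid(self.net(volume))


device = "cuda" if torch.cuda.is_available() else "cpu"
model = TinySegNet().to(device)

# Stand-in for a preprocessed scan: batch of 1, 1 channel, 64x64x64 voxels.
scan = torch.randn(1, 1, 64, 64, 64, device=device)
probabilities = model(scan)
tumor_mask = probabilities > 0.5  # binary segmentation mask
print(f"{tumor_mask.float().mean().item():.2%} of voxels flagged")
```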

"Deep learning in medical imaging is exploding," Morley said. "GPUs are among the underlying technologies that make this possible."

Whether doctors are looking at a tumor or at someone's normal anatomy, a lot of time is spent manually highlighting elements in data sets, Morley noted.

"Using Paperspace's tools along with traditional methods can save time there," he said.

At each step, saving effort is vital. Cutting the time required for developers to set up machine learning jobs may be as crucial as cutting the time required for diagnosticians to manually inspect medical images.

In any case, Morley said, "The goal is not to be burning all your time on the infrastructure side of it."

Paperspace isn't the only machine learning platform that supports GPU cloud jobs. Others include H2O.ai's GPU-specific H2O4GPU platform, FloydHub and Amazon Web Services' GPU cloud offerings, to name a few.
