AI at the edge enables real-time machine learning through localized processing, allowing for immediate data processing, stronger data security and a better customer experience. At the same time, many enterprises are looking to push AI into the cloud, which can reduce barriers to implementation, improve knowledge sharing and support larger models. The path forward lies in finding a balance that takes advantage of the strengths of both cloud and edge.
Centralized cloud resources are typically used to train deep learning inferencing models because large amounts of data and compute are required to develop accurate models. The resulting models can be deployed either in a central cloud location or distributed to devices at the edge.
"Edge and cloud AI complement one another, and cloud resources are almost always involved in edge AI use cases," said Jason Shepherd, vice president of ecosystem at Zededa, an edge AI tools provider.
"In a perfect world, we'd centralize all workloads in the cloud for simplicity and scale. However, factors such as latency, bandwidth, autonomy, security and privacy are necessitating more AI models to be deployed at the edge, proximal to the data source," Shepherd said. Some training is already occurring at the edge, and there is growing interest in federated learning, which keeps processing within local data zones while centralizing the results to eliminate regional bias.
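The federated approach mentioned above can be illustrated with a minimal sketch. The code below is not a production federated learning system; the one-dimensional linear model, the zone data and the function names are all illustrative stand-ins. The essential point is that each data zone trains locally and ships only model weights to the cloud, which averages them, so raw data never leaves its zone.

```python
# Minimal sketch of federated averaging: each edge "zone" takes a local
# training step and reports only its weights; the cloud averages them.
# Model, data and learning rate are illustrative, not a real deployment.

def local_update(weights, data, lr=0.1):
    """One gradient step of a 1-D linear model y = w * x on local data."""
    w = weights
    grad = sum(2 * x * (w * x - y) for x, y in data) / len(data)
    return w - lr * grad

def federated_average(zone_weights):
    """Cloud-side step: average the weights reported by each zone."""
    return sum(zone_weights) / len(zone_weights)

# Three data zones, each holding its own local samples (here, y = 2x).
zones = [
    [(1.0, 2.0), (2.0, 4.0)],
    [(3.0, 6.0)],
    [(1.5, 3.0), (0.5, 1.0)],
]

w_global = 0.0
for _ in range(50):  # a few federated rounds
    local = [local_update(w_global, z) for z in zones]
    w_global = federated_average(local)

print(round(w_global, 2))  # converges toward the true slope, 2.0
```

Real frameworks add secure aggregation, weighting by sample count and many rounds of partial participation, but the data flow is the same: weights move centrally, data stays local.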
Rise of edge AI
"The edge is a huge emerging shift in infrastructure that complements the cloud by adding on an information technology layer which is distributed out to every nook and cranny of the world," said Charles Nebolsky, intelligent cloud and infrastructure services lead at Accenture. Nebolsky believes edge AI is leading to a revolution as big as the cloud was when it gained traction.
When engineered well, edge AI opens new opportunities for autoscaling, since each new user brings an entirely new machine to the collective workload. The edge also has better access to raw, unprocessed input data, whereas cloud AI solutions must either work with pre-processed data to improve performance or ingest enormous data sets, at which point bandwidth becomes a serious concern.
"The reason for moving things to the edge is for better response time," said Jonas Bull, head of architecture for Atos North America's AI Lab, a digital transformation consultancy.
Speed and latency are critical for applications such as computer vision and the virtual radio access networks used for 5G. Another big benefit lies in improving privacy by limiting what data is uploaded to the cloud.
Edge AI's deployment is also full of constraints, including network latency, memory pressure, battery drain and the possibility of a process being backgrounded by the user or operating system. Developers working on edge AI need to plan for a wide range of limitations, particularly as they explore common use cases like mobile phones, said Stephen Miller, senior vice president of engineering and co-founder at Fyusion, an AI-driven 3D imaging company.
"You need to plan for every possible corner case [on the edge], whereas in the cloud, any solution can be monitored and fine-tuned," Miller said.
Most experts see edge and cloud approaches as complementary parts of a larger strategy. Nebolsky said that cloud AI is more amenable to batch learning techniques that can process large data sets to build smarter algorithms quickly, at scale and with maximum accuracy. Edge AI can execute those models, and cloud services can learn from the performance of these models and apply those lessons to the base data to create a continual learning loop.
Fyusion's Miller recommends striking the right balance -- if you commit entirely to edge AI, you lose the ability to continuously improve your model. Without new data streams coming in, you have nothing to leverage. However, if you commit entirely to cloud AI, you risk compromising the quality of your data -- due to the tradeoffs necessary to make it uploadable, and the lack of feedback to guide the user to capture better data -- or the quantity of data.
"Edge AI complements cloud AI by providing access to immediate decisions when they are needed, while utilizing the cloud for deeper insights or ones that require a broader or more longitudinal data set to drive a solution," said Tracy Ring, managing director at Deloitte.
For example, in a connected vehicle, sensors on the car provide a stream of real-time data that is processed constantly and can make decisions, like applying the brakes or adjusting the steering wheel. The same sensor data can be streamed to the cloud to do longer-term pattern analysis that can alert the owner of urgently needed repairs that may prevent an accident in the future. On the flip side, cloud AI complements edge AI to drive deeper insights, tune models and continue to enhance their insights.
"Cloud and edge AI work in tandem to drive immediate need decisions that are powered by deeper insights, and those insights are constantly being informed by new edge data," Ring said.
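The connected-vehicle split described above can be sketched in a few lines. This is a toy illustration, not automotive software: the thresholds, field names and the two helper functions are invented for the example. It shows the two paths the same sensor reading takes -- an immediate decision at the edge and accumulation for cloud-side trend analysis.

```python
# Toy sketch of the connected-vehicle pattern: each sensor reading is
# acted on immediately at the edge, and the same reading is also queued
# for longer-term, cloud-side analysis. All values are illustrative.

def edge_decision(reading):
    """Immediate, local decision: brake if an obstacle is too close."""
    return "brake" if reading["obstacle_m"] < 5.0 else "cruise"

def cloud_trend_alert(history):
    """Longer-term pattern analysis over the accumulated stream."""
    hot_brakes = [r for r in history if r["brake_temp_c"] > 400]
    return "schedule brake service" if len(hot_brakes) >= 3 else None

history = []
readings = [
    {"obstacle_m": 50.0, "brake_temp_c": 380},
    {"obstacle_m": 4.0,  "brake_temp_c": 420},
    {"obstacle_m": 3.5,  "brake_temp_c": 430},
    {"obstacle_m": 60.0, "brake_temp_c": 410},
]

actions = []
for r in readings:
    actions.append(edge_decision(r))  # real-time path on the vehicle
    history.append(r)                 # same data streamed to the cloud

print(actions)                     # ['cruise', 'brake', 'brake', 'cruise']
print(cloud_trend_alert(history))  # schedule brake service
```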
The main challenges of making edge and cloud AI work together are procedural and architectural.
"Applications need to be designed so that they purposefully split and coordinate the workload between them," said Max Versace, CEO and co-founder of Neurala, an AI inspection platform.
For instance, edge-enabled cameras can process all information as it originates at the sensor without overloading the network with irrelevant data. However, when the object of interest is finally detected at the edge, the relevant frames can be broadcasted to a larger cloud application that can store, further analyze (e.g., what subtype of object is in the frame and what are its attributes), and share the analysis results with a human supervisor.
One strategy lies in creating an architecture that balances the size of the model and data against the cost of data transfers, said Brian Sletten, president of Bosatsu Consulting and senior instructor for edge computing at Developintelligence.com. For large models, it makes more sense to stay put in the cloud.
"There are ways to reduce the model size to help resolve the issue, but if you are dealing with a very large model, you will probably want to run it in the cloud," Sletten said.
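One of the most common model-size reductions Sletten alludes to is post-training quantization: storing weights as 8-bit integers plus a scale factor instead of 32-bit floats, roughly a 4x saving at a small accuracy cost. The sketch below shows the arithmetic on a plain Python list; real toolchains apply the same idea per tensor or per channel.

```python
# Illustrative sketch of symmetric int8 quantization, one common way to
# shrink a model for edge deployment: w is approximated by q * scale,
# with q an integer in [-127, 127]. Weights here are made up.

def quantize(weights):
    """Map float weights to int8 values plus a single scale factor."""
    scale = max(abs(w) for w in weights) / 127
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights for inference."""
    return [qi * scale for qi in q]

weights = [0.42, -1.27, 0.003, 0.9, -0.55]
q, scale = quantize(weights)
restored = dequantize(q, scale)

max_err = max(abs(a - b) for a, b in zip(weights, restored))
print(max_err < scale)  # quantization error stays below one step: True
```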
In other cases, when there is a lot of data generated at the edge, it may make more sense to update models locally and then feed subsets of this back to the cloud for further refinement. Developers also need to consider some of the privacy implications when doing inference on sensitive data. For example, if developers want to detect evidence of a stroke through a mobile phone camera, the application may need to process data locally to ensure HIPAA compliance.
Sletten predicts the frameworks will evolve to provide more options about where to do training and how to improve reuse. As an example, TensorFlow.js uses WebGL and WebAssembly to run in the browser (good for privacy, low-latency, leveraging desktop or mobile GPU resources, etc.) but also can load sharded, cached versions of cloud-trained models. Model exchange formats (e.g., Open Neural Network Exchange) could also increase the fluidity of models to different environments. Sletten recommends exploring tools like LLVM, an open source compiler infrastructure project, to make it easier to abstract applications away from the environments they run in.
"One of the key challenges in moving more AI from the cloud to the edge is coming up with neural network architectures that are able to operate in the edge AI chips efficiently," said Bruno Fernandez-Ruiz, co-founder and CTO of Nexar, a smart dash cam vendor.
General computing platforms, like the one found in the cloud servers, can run any network architecture. This becomes much harder in edge AI. Architectures and trained models must be adapted to run on the AI chipsets found at the edge.
Fernandez-Ruiz and his team have been exploring some of these tradeoffs to improve the intelligence they can bring to various dash cam applications. This is a big challenge as users may drive from highly performant mobile networks to dead zones yet expect good performance regardless. The team found that during inference time, there isn't enough network bandwidth to move all the data from the edge to the cloud, yet the use case requires local inference outputs to be aggregated globally. The edge AI can run neural networks that help filter the data which must be sent to the cloud for further AI processing.
In other cases, the cloud AI training may result in neural network models which have too many layers to run efficiently on edge devices. In these cases, the edge AI can run a lighter neural network that creates an intermediate representation of the input which is more compressed and can therefore be sent to the cloud for further AI processing. During training time, edge and cloud AI can operate in hybrid mode to provide something akin to "virtual active learning," where the edge AI sifts through vast amounts of data and "teaches" the cloud AI.
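The intermediate-representation idea above can be sketched simply: the edge runs a small encoder that turns a raw sample into a short vector that is cheap to upload, and the cloud model consumes that vector instead of the raw data. The average-pooling "encoder" below is a stand-in for a real lightweight network; only the compression pattern is the point.

```python
# Hedged sketch of the hybrid pattern: a lightweight edge "encoder"
# compresses raw input into a short intermediate representation that is
# sent to the cloud in place of the raw data. Pooling stands in for a
# real small neural network.

def edge_encode(signal, bucket=4):
    """Compress a raw signal by average-pooling fixed-size buckets."""
    return [sum(signal[i:i + bucket]) / bucket
            for i in range(0, len(signal), bucket)]

raw = [float(i % 5) for i in range(32)]  # stand-in for sensor/frame data
embedding = edge_encode(raw)             # what actually gets uploaded

ratio = len(raw) / len(embedding)
print(len(embedding), ratio)  # 8 values uploaded instead of 32: 8 4.0
```

With a learned encoder the same structure also supports the "virtual active learning" loop: the edge decides which compressed samples are informative enough to teach the cloud model.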
Fernandez-Ruiz has found the types of supported neural network architectures in edge AI chipsets are limited, and usually running months behind what can be achieved in the cloud. One useful approach for addressing these limitations has been to use compiler toolchains and stacks like Apache TVM, which help in porting a model from one platform to another.
Another approach has been to use network architectures known to work well in edge AI, and train them directly for the target platform. He has found that, given enough volume and variety of training data, this approach can often outperform the cross-platform compiler approaches in terms of absolute performance. However, it also requires some manual work during training and in pre- and post-processing.
Common tradeoffs between edge and cloud AI
Accenture's Nebolsky said some of the most common tradeoffs developers need to consider between cloud and edge AI include the following:
- Processing power: Edge computing devices are typically less powerful and more difficult to replace or upgrade.
- Latency: The cloud is fast, but not ready for real-time applications like driving a car or industrial controls.
- Energy consumption: Most designers don't have to consider energy consumption constraints with the cloud the way they do with the edge.
- Connectivity: Safety-critical services like autonomous vehicles can't afford to stop working when connectivity drops, which can push processing for real-time AI-driven decisions to the edge.
- Security: AI services that drive authentication and processing of sensitive information like fingerprints or medical records are generally best accomplished locally for security concerns. Even when very strong cloud security is in place, the user perception of better privacy from edge processing can be an important consideration.