Enterprises that run applications reliant on machine learning must implement mechanisms to build, train, deploy and access ML models for inference tasks. While ML models commonly run on back-end infrastructure, such as cloud servers, demand is growing for applications that access ML models to be deployed close to the edge devices where use cases occur.
This demand is often due to specific requirements related to bandwidth and fast response times. An architecture that relies exclusively on cloud-based back-end servers for real-time ML inference can't necessarily meet these requirements. Take, for example, applications deployed on IoT devices, such as industrial automation, smart homes, sensors and vehicles. These applications require minimal latency and the ability to access ML insights in real time, often from remote locations that have unreliable access to the cloud.
Amazon SageMaker Edge Manager has played an important role for edge-based ML applications because it makes ML functionality available to edge components. It has helped enterprises manage large fleets of remote devices, deploy ML models to them and monitor system health and model accuracy on those devices.
Amazon announced the tool in late 2020 but is decommissioning it on April 26, 2024. AWS recommends the following alternatives for managing ML deployments on edge devices:
- Open Neural Network Exchange (ONNX).
- AWS IoT Greengrass V2.
While Amazon hasn't officially explained why it's decommissioning SageMaker Edge Manager, the tool overlaps with AWS IoT Greengrass, as both manage and monitor remote devices and enable ML models.
How ONNX and AWS IoT Greengrass work
ONNX is an open source format for representing ML models, with runtimes that support a range of edge hardware, OSes and frameworks. Because a model exported to ONNX is decoupled from the framework that trained it, ML functionality implemented this way is portable across different types of devices. ONNX runtimes can also deliver strong inference performance, and enterprises can use a tool such as Amazon SageMaker Neo to compile ONNX models into highly optimized deployments for specific hardware devices.
AWS IoT Greengrass is a cloud-based offering through which application owners manage device fleets and build, deploy and update the software that runs on managed IoT edge devices, including ML inference functionality implemented using ONNX. The deployed software executes locally, but devices can also send select data, such as metrics or inference results, to the cloud to monitor system health and accuracy.
Once data arrives at AWS, IoT Greengrass can integrate with other Amazon services, such as S3, Kinesis, CloudWatch and SageMaker. Enterprises can use these integrations to tap cloud resources and to expand the functionality implemented on edge devices. For example, they could execute advanced analytics or trigger automated actions in response to certain events that take place on the edge.
AWS IoT Greengrass uses ML models built and trained in the cloud and deploys them on edge devices. Enterprises can choose from a variety of available pre-trained models, or they can custom build, compile and train models using SageMaker features, such as Studio or Neo. To ensure broad compatibility with IoT and edge devices, enterprises should ideally make these models available in ONNX format.
Once ONNX packages are ready for deployment, teams must store them in an S3 bucket accessible to IoT Greengrass and publish them as IoT Greengrass components. They can then deploy the components to targets defined as Greengrass devices in the AWS IoT service. This deployment makes the ML inference functionality available on the edge devices, along with the libraries that support communication with AWS IoT Core.
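A Greengrass component recipe for such a package might look like the following sketch. The bucket name, component name, script and artifact paths are all illustrative assumptions:

```yaml
RecipeFormatVersion: "2020-01-25"
ComponentName: com.example.OnnxInference   # hypothetical component name
ComponentVersion: "1.0.0"
ComponentDescription: Runs ONNX model inference on the edge device.
Publisher: Example Corp
Manifests:
  - Platform:
      os: linux
    Lifecycle:
      # Start the inference script shipped with the component.
      Run: python3 {artifacts:path}/inference.py
    Artifacts:
      # Both artifacts live in an S3 bucket IoT Greengrass can read.
      - URI: s3://example-ml-artifacts/com.example.OnnxInference/1.0.0/inference.py
      - URI: s3://example-ml-artifacts/com.example.OnnxInference/1.0.0/model.onnx
```

Teams register a recipe like this with the `aws greengrassv2 create-component-version` CLI command, then create a deployment that targets the relevant devices or thing group.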
For these devices to be visible to AWS IoT, they also need to run the IoT Greengrass client software. The application software that runs on these devices can implement custom logic to publish data to IoT Core by calling the PublishToIoTCore API available in the AWS IoT Device SDK. With the data in the cloud, teams can run application monitoring and automation tasks that use the integrations AWS IoT supports with other AWS services.
Combine services to manage ML edge applications
Implementing and managing ML functionality on the edge can be a time-consuming process. Enterprises can consider a tool such as AWS IoT Greengrass to manage edge devices, fleet deployments and monitoring tasks. Combined with long-standing SageMaker features that manage the development, building and training of ML models, it forms a solid set of AWS services that application owners can use to build and maintain ML applications on the edge in a reliable way.