In the past, AWS users who wanted to integrate other Amazon cloud services into their applications had to write custom code to do it. Step Functions integrations now offer a simpler path for complex applications.
Developers use AWS Step Functions to define and manage multicomponent workflows on Amazon's cloud. With Step Functions, you configure the tasks, sequences and conditions that make up a workflow. The service also provides a GUI to do this visually.
In a Step Functions workflow, which is defined as a state machine, tasks call custom application components or AWS APIs to do their work. These tasks were initially limited to Lambda functions and custom workers that run anywhere with an HTTP connection to the service -- such as EC2 instances, Elastic Container Service (ECS) containers or on-prem servers.
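As a rough illustration of tasks, sequences and conditions, the sketch below builds a small state machine definition in Amazon States Language as a Python dictionary. The Lambda ARN, state names and the `$.valid` field are hypothetical placeholders, not taken from any real application.

```python
import json

# Hypothetical Lambda ARN; replace with a function from your own account.
CHECK_ORDER_LAMBDA = "arn:aws:lambda:us-east-1:123456789012:function:check-order"

definition = {
    "Comment": "Sketch: one task followed by a condition",
    "StartAt": "CheckOrder",
    "States": {
        # Task: calls the Lambda function and passes its output on.
        "CheckOrder": {
            "Type": "Task",
            "Resource": CHECK_ORDER_LAMBDA,
            "Next": "IsOrderValid",
        },
        # Condition: branches on a field assumed to be in the task output.
        "IsOrderValid": {
            "Type": "Choice",
            "Choices": [
                {"Variable": "$.valid", "BooleanEquals": True, "Next": "Done"}
            ],
            "Default": "Rejected",
        },
        "Done": {"Type": "Succeed"},
        "Rejected": {
            "Type": "Fail",
            "Error": "InvalidOrder",
            "Cause": "Order failed validation",
        },
    },
}

# The JSON text is what you would supply as the state machine definition.
print(json.dumps(definition, indent=2))
```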
However, AWS recently added eight more integrations for Step Functions -- Amazon DynamoDB, Simple Queue Service (SQS), Simple Notification Service (SNS), AWS Glue, AWS Batch, SageMaker, ECS and AWS Fargate. Because a task can now call these services directly through their AWS APIs, there is less need to write custom code, which simplifies workflow implementation and application maintenance. Let's break down the use and fit for each of these AWS Step Functions integrations.
Amazon DynamoDB
You can use the DynamoDB integration to manipulate data stored in a DynamoDB table. Since most workflows have to either access or update data, it's useful to do so without writing additional components. Define a task that integrates with a DynamoDB table, and then configure the API operation and data you want to exchange.
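A DynamoDB task of this kind might look like the following sketch, which writes one item with the `putItem` operation. The table name, attribute names and the `$.orderId` input field are assumptions made for illustration.

```python
import json

put_order_state = {
    "Type": "Task",
    # Service integration resource for the DynamoDB PutItem operation.
    "Resource": "arn:aws:states:::dynamodb:putItem",
    "Parameters": {
        "TableName": "Orders",  # hypothetical table
        "Item": {
            # A key ending in ".$" takes its value from the state input.
            "orderId": {"S.$": "$.orderId"},
            "status": {"S": "RECEIVED"},
        },
    },
    # Keep the original input and attach the API response next to it.
    "ResultPath": "$.dynamoResult",
    "End": True,
}

dynamodb_definition = {"StartAt": "PutOrder", "States": {"PutOrder": put_order_state}}
print(json.dumps(dynamodb_definition, indent=2))
```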
Amazon SQS
You can use SQS to decouple application components. On one end, applications send messages to a queue, and on the other end, workers process incoming messages. With this integration, you can send messages to an existing SQS queue, though Step Functions can't receive messages directly from SQS.
This is useful when you want to trigger asynchronous processes. You can send a message to an SQS queue when your state machine execution completes, or you can do it in the middle of your workflow execution. Just keep in mind that, if you need to know the status of a process triggered by a message sent to SQS, you have to write additional code.
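To make this concrete, here is a hedged sketch of a task that sends a message to a queue in the middle of a workflow and then moves on without waiting for a consumer. The queue URL, the `$.message` input field and the state names are hypothetical.

```python
import json

# Hypothetical queue URL; use the URL of an existing queue.
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/order-events"

send_message_state = {
    "Type": "Task",
    "Resource": "arn:aws:states:::sqs:sendMessage",
    "Parameters": {
        "QueueUrl": QUEUE_URL,
        # Message body taken from a field assumed to exist in the input.
        "MessageBody.$": "$.message",
    },
    "ResultPath": "$.sqsResult",
    # The workflow continues as soon as the message is accepted by SQS.
    "Next": "ContinueWorkflow",
}

sqs_definition = {
    "StartAt": "SendToQueue",
    "States": {
        "SendToQueue": send_message_state,
        "ContinueWorkflow": {"Type": "Pass", "End": True},
    },
}
print(json.dumps(sqs_definition, indent=2))
```

Nothing in this definition reports what the queue consumer did with the message, which is why tracking that outcome needs extra code.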
Amazon SNS
Similar to the SQS integration, the SNS integration lets you send a message to an SNS topic either in the middle of your workflow execution or at the end. Since SNS can deliver messages to any subscribed HTTPS endpoint, you can use this integration to call virtually any API as part of your workflow execution. You can also use it to send mobile notifications or emails to keep your application users informed of a workflow execution's status.
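A notification step along these lines could be sketched as follows. The topic ARN, the subject text and the `$.statusMessage` input field are placeholders chosen for the example.

```python
import json

# Hypothetical topic ARN; subscribers (email, HTTPS, mobile) are set up in SNS.
TOPIC_ARN = "arn:aws:sns:us-east-1:123456789012:workflow-status"

notify_state = {
    "Type": "Task",
    "Resource": "arn:aws:states:::sns:publish",
    "Parameters": {
        "TopicArn": TOPIC_ARN,
        "Subject": "Workflow status update",
        # Message text taken from a field assumed to exist in the input.
        "Message.$": "$.statusMessage",
    },
    "End": True,
}

sns_definition = {"StartAt": "NotifyUsers", "States": {"NotifyUsers": notify_state}}
print(json.dumps(sns_definition, indent=2))
```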
AWS Glue
With this integration, you can trigger jobs you've created using AWS Glue, which manages extract, transform and load jobs. Because the integration runs the job synchronously, the state machine automatically waits until the AWS Glue job completes and reports success or failure. This integration is a good fit for workflows that need to process data, especially large volumes. Since AWS Glue is serverless, you don't have to launch any EC2 instances.
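The wait-for-completion behavior can be sketched with the `.sync` form of the Glue resource plus a `Catch` clause that routes a failed job to a separate state. The job name and state names are hypothetical.

```python
import json

run_etl_state = {
    "Type": "Task",
    # The ".sync" suffix asks Step Functions to wait for the job run to finish.
    "Resource": "arn:aws:states:::glue:startJobRun.sync",
    "Parameters": {"JobName": "nightly-etl"},  # hypothetical Glue job
    # A failed job run surfaces as a task failure, handled here.
    "Catch": [{"ErrorEquals": ["States.TaskFailed"], "Next": "EtlFailed"}],
    "Next": "EtlSucceeded",
}

glue_definition = {
    "StartAt": "RunEtlJob",
    "States": {
        "RunEtlJob": run_etl_state,
        "EtlSucceeded": {"Type": "Succeed"},
        "EtlFailed": {"Type": "Fail", "Error": "EtlJobFailed"},
    },
}
print(json.dumps(glue_definition, indent=2))
```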
AWS Batch
The AWS Batch integration also runs jobs synchronously. The state machine automatically waits for the batch job to finish and receives its execution status in the response, which you can use to determine the next step in your workflow. Unlike AWS Glue, Batch launches and manages EC2 instances.
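A Batch task might be sketched as below, passing one value from the workflow input to the container as an environment variable. The job name, queue, job definition and the `$.inputKey` field are all assumptions for illustration.

```python
import json

submit_job_state = {
    "Type": "Task",
    # Wait for the Batch job to reach a final status before continuing.
    "Resource": "arn:aws:states:::batch:submitJob.sync",
    "Parameters": {
        "JobName": "resize-images",          # hypothetical
        "JobQueue": "default-queue",         # hypothetical
        "JobDefinition": "resize-images:1",  # hypothetical
        "ContainerOverrides": {
            "Environment": [
                # Value taken from a field assumed to exist in the input.
                {"Name": "INPUT_KEY", "Value.$": "$.inputKey"}
            ]
        },
    },
    # Keep the original input and store the job description beside it.
    "ResultPath": "$.batchResult",
    "End": True,
}

batch_definition = {"StartAt": "SubmitJob", "States": {"SubmitJob": submit_job_state}}
print(json.dumps(batch_definition, indent=2))
```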
Amazon SageMaker
With the integration into this machine learning service, you can start jobs that train a model by calling the CreateTrainingJob API or batch jobs that generate inferences by calling the CreateTransformJob API. You need a machine learning model in place to use this integration. And because it's also synchronous, the workflow task waits for the job to complete before the workflow continues with the next steps.
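As one possible shape for the CreateTransformJob case, the sketch below runs a batch transform against an existing model and waits for it to finish. The S3 locations, instance type and the `$.modelName` and `$.jobName` input fields are placeholders.

```python
import json

transform_state = {
    "Type": "Task",
    "Resource": "arn:aws:states:::sagemaker:createTransformJob.sync",
    "Parameters": {
        # Both names are read from fields assumed to exist in the input.
        "TransformJobName.$": "$.jobName",
        "ModelName.$": "$.modelName",
        "TransformInput": {
            "ContentType": "text/csv",
            "DataSource": {
                "S3DataSource": {
                    "S3DataType": "S3Prefix",
                    "S3Uri": "s3://example-bucket/input/",  # hypothetical
                }
            },
        },
        "TransformOutput": {"S3OutputPath": "s3://example-bucket/output/"},
        "TransformResources": {"InstanceCount": 1, "InstanceType": "ml.m4.xlarge"},
    },
    "End": True,
}

sagemaker_definition = {
    "StartAt": "BatchTransform",
    "States": {"BatchTransform": transform_state},
}
print(json.dumps(sagemaker_definition, indent=2))
```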
Amazon ECS and Fargate
Use this integration to launch containers that will be deployed to EC2 instances or Fargate. The ECS task launches synchronously, and the state machine receives a status from ECS as soon as the essential container of the launched task completes. This integration is useful when you want to run custom logic that isn't necessarily suited for a Lambda function or if you want to reuse code that is ready to run on Docker containers.
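A container task on Fargate could be sketched like this. The cluster, task definition and subnet ID are hypothetical, and a real setup may need further network settings such as security groups.

```python
import json

run_container_state = {
    "Type": "Task",
    # Wait until the launched ECS task stops before moving on.
    "Resource": "arn:aws:states:::ecs:runTask.sync",
    "Parameters": {
        "LaunchType": "FARGATE",            # use "EC2" to run on EC2 instances
        "Cluster": "workflow-cluster",      # hypothetical
        "TaskDefinition": "report-builder", # hypothetical
        "NetworkConfiguration": {
            "AwsvpcConfiguration": {
                "Subnets": ["subnet-0123456789abcdef0"],  # hypothetical
                "AssignPublicIp": "ENABLED",
            }
        },
    },
    "End": True,
}

ecs_definition = {
    "StartAt": "RunContainer",
    "States": {"RunContainer": run_container_state},
}
print(json.dumps(ecs_definition, indent=2))
```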
Step Functions limitations and future integrations
If you only want to trigger a long-running job and don't want to track its status, you can do so with a Lambda function that's triggered from the state machine and calls the relevant AWS API. Whichever approach you choose, be aware of Step Functions' limit of 32,768 characters of data for a task input or output. Even if a particular AWS API could support a heavier payload, you're still constrained by this hard limit.
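One way to guard against this limit is to measure a payload before handing it to a task. The helper below is a minimal sketch that assumes the limit applies to the length of the serialized JSON text; the function names are made up for this example, and the exact way the service measures size should be checked against the current quotas.

```python
import json

# Limit quoted in this article; confirm against the current service quotas.
MAX_TASK_PAYLOAD_CHARS = 32768


def payload_size(payload) -> int:
    """Length of the payload once serialized as compact JSON text."""
    return len(json.dumps(payload, separators=(",", ":")))


def fits_task_limit(payload, limit: int = MAX_TASK_PAYLOAD_CHARS) -> bool:
    """True if the serialized payload is within the assumed limit."""
    return payload_size(payload) <= limit


small = {"orderId": "1234", "items": ["a", "b"]}
large = {"report": "x" * 40000}

print(fits_task_limit(small))  # True
print(fits_task_limit(large))  # False
```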
These added Step Functions integrations provide a clean way to interact with existing or new application components and require minimal development effort or infrastructure setup. Users also benefit from the fact that the state machine automatically polls for the status of potentially long-running jobs, since it simplifies the workflow implementation for these tasks.
You can also configure Step Functions as a target from other AWS components, such as AWS IoT Rules Engine, API Gateway or CloudWatch Events. This way you can further simplify application development and start workflow executions with AWS built-in integrations, which minimizes the need for custom code.
As with many other Amazon services, expect more built-in integrations for Step Functions in the future. This should improve the implementation of a wide range of workflows and Step Functions use cases.