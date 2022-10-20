Batch processing is difficult. You run the risk of underprovisioning your resources and not having enough to process all your jobs. Or you overprovision and spend significantly more on resources.

Whether you're processing thousands of transactions a day or running an ETL on new data once a month, AWS Batch can run batch jobs. These jobs can be at any scale, and you only pay for the resources you use while those jobs are running. Walk through this AWS Batch tutorial to learn how to set up and use this tool.

Setting up the batch resources

Before you run a batch job, set up the necessary resources. Here is a list of resources and what they accomplish:

Compute environment. This type of compute resource runs each of your jobs. AWS manages how many of these resources must be created based on this definition and how many jobs are currently queued for processing. You can pick either a Fargate or EC2 configuration. Note that EC2 also supports spot instances for more cost optimization options for your workload.
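As a rough illustration of the Fargate-versus-EC2 choice, the Python sketch below builds the request parameters for a managed compute environment. The dictionary keys follow the shape of the boto3 `create_compute_environment` call; the helper name, environment names, subnet ID, security group ID, instance role and vCPU limit are all placeholder assumptions, not values from this tutorial.

```python
def build_compute_environment(name, subnets, security_group_ids,
                              max_vcpus=16, launch_type="FARGATE",
                              instance_role=None):
    """Return keyword arguments shaped like boto3's create_compute_environment."""
    if launch_type not in {"FARGATE", "FARGATE_SPOT", "EC2", "SPOT"}:
        raise ValueError(f"unsupported launch type: {launch_type}")

    resources = {
        "type": launch_type,
        "maxvCpus": max_vcpus,  # upper bound on how far AWS may scale out
        "subnets": list(subnets),
        "securityGroupIds": list(security_group_ids),
    }
    if launch_type in {"EC2", "SPOT"}:
        # EC2-backed environments also need instance details.
        if instance_role is None:
            raise ValueError("EC2 and SPOT environments need an instance role")
        resources.update({
            "minvCpus": 0,                 # allow scaling down to nothing when idle
            "instanceTypes": ["optimal"],  # let AWS pick suitable instance sizes
            "instanceRole": instance_role,
        })

    return {
        "computeEnvironmentName": name,
        "type": "MANAGED",
        "state": "ENABLED",
        "computeResources": resources,
    }


# Placeholder identifiers, for illustration only.
fargate_env = build_compute_environment(
    "demo-fargate-env", ["subnet-0123example"], ["sg-0123example"])
spot_env = build_compute_environment(
    "demo-spot-env", ["subnet-0123example"], ["sg-0123example"],
    launch_type="SPOT", instance_role="demo-ecs-instance-role")
print(fargate_env["computeResources"]["type"])
print(spot_env["computeResources"]["type"])
```

With boto3 installed and credentials configured, a dictionary like this could be passed as `boto3.client("batch").create_compute_environment(**fargate_env)`.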

Job queue. When you submit a new batch job, it waits in a job queue until there is a compute environment ready to process it. You can set a priority level on each job queue, so that when several queues share a compute environment, jobs in the higher-priority queue are scheduled ahead of jobs in lower-priority queues.
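To make the role of queue priority concrete, here is a minimal Python sketch. It builds parameters in the shape of the boto3 `create_job_queue` call and then orders two queues the way a scheduler would evaluate them, assuming that a larger priority number is evaluated first when queues share a compute environment. The helper name, queue names and environment name are hypothetical.

```python
def build_job_queue(name, priority, compute_environment):
    """Return keyword arguments shaped like boto3's create_job_queue."""
    return {
        "jobQueueName": name,
        "state": "ENABLED",
        "priority": priority,
        "computeEnvironmentOrder": [
            {"order": 1, "computeEnvironment": compute_environment},
        ],
    }


# Two hypothetical queues that share one compute environment.
queues = [
    build_job_queue("nightly-reports", 1, "demo-env"),
    build_job_queue("payments", 10, "demo-env"),
]

# Higher priority number first.
evaluation_order = [
    q["jobQueueName"]
    for q in sorted(queues, key=lambda q: q["priority"], reverse=True)
]
print(evaluation_order)  # ['payments', 'nightly-reports']
```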

Job definition. Your job may require additional configuration to run, such as environment variables, IAM policies and attached persistent storage. You can set the CPU and memory requirements for each job. If your task is already packaged in a container image, you can define that here as well.
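The pieces of a job definition can be sketched as a plain Python dictionary in the shape of the boto3 `register_job_definition` call. This is an illustrative sketch for an EC2-backed job; a Fargate job definition needs further fields, such as an execution role, that are omitted here. The helper name, image URI, command, environment variable and role ARN are placeholders.

```python
def build_job_definition(name, image, command, vcpus, memory_mib,
                         environment=None, job_role_arn=None):
    """Return keyword arguments shaped like boto3's register_job_definition."""
    container = {
        "image": image,
        "command": list(command),
        # The API expects resource values as strings.
        "resourceRequirements": [
            {"type": "VCPU", "value": str(vcpus)},
            {"type": "MEMORY", "value": str(memory_mib)},
        ],
        "environment": [
            {"name": key, "value": value}
            for key, value in sorted((environment or {}).items())
        ],
    }
    if job_role_arn:
        # IAM role the container assumes while the job runs.
        container["jobRoleArn"] = job_role_arn

    return {
        "jobDefinitionName": name,
        "type": "container",
        "containerProperties": container,
    }


# Placeholder values, for illustration only.
job_definition = build_job_definition(
    "demo-etl",
    image="123456789012.dkr.ecr.us-east-1.amazonaws.com/demo-etl:latest",
    command=["python", "etl.py"],
    vcpus=1,
    memory_mib=2048,
    environment={"STAGE": "test"},
)
print(job_definition["containerProperties"]["resourceRequirements"])
```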

Job. This is the actual unit of work, a single command-line command with any arguments or parameters. You can submit the job through the AWS console, the CLI or any AWS SDK.
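A job submission through an SDK might look roughly like the following Python sketch, which builds parameters in the shape of the boto3 `submit_job` call. The helper name, job name, queue name, job definition name and command line are hypothetical; `shlex.split` is used here only to turn a command-line string into the list form the API expects.

```python
import shlex


def build_submit_job(job_name, job_queue, job_definition,
                     command_line=None, depends_on=()):
    """Return keyword arguments shaped like boto3's submit_job."""
    request = {
        "jobName": job_name,
        "jobQueue": job_queue,
        "jobDefinition": job_definition,
    }
    if command_line:
        # Override the command from the job definition for this one job.
        request["containerOverrides"] = {"command": shlex.split(command_line)}
    if depends_on:
        # IDs of jobs that must finish before this one can run.
        request["dependsOn"] = [{"jobId": job_id} for job_id in depends_on]
    return request


# Placeholder names, for illustration only.
submit_request = build_submit_job(
    "etl-2022-10", "payments", "demo-etl",
    command_line="python etl.py --month 2022-10",
)
print(submit_request["containerOverrides"]["command"])
# ['python', 'etl.py', '--month', '2022-10']
```

With boto3 installed and credentials configured, the request could be sent with `boto3.client("batch").submit_job(**submit_request)`, which returns the new job's ID.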

Batch job states

Once you submit a job to AWS Batch, it moves through several states that describe what the batch service is doing. If jobs spend a long time in one state before they succeed or fail, this can indicate that you need to make changes to your AWS Batch components.

Once you submit a job to the batch service, it inherits the properties defined in the job definition you attach to it. It then passes through the submitted, pending, runnable and starting states before it reaches the running state, where the work is actually done.

Submitted. In the submitted state, the batch service determines if an instance from the compute environment assigned to that job queue is available to process it. If one isn't available, Batch tries to create a new one based on how you configured your compute environment.

Pending. If a job in the queue cannot run because it has dependencies on another resource or job, it is in the pending state. It then moves to runnable once the dependencies are satisfied.

Starting. Once a compute resource is available, the job moves to the starting state, where it pulls any container images it needs to run the job.

Running. Finally, the job moves to the running state. Here, the command in the submitted job executes. If it returns an exit code 0, the batch service moves it to the succeeded state. Otherwise, it is moved to the failed state. In either case, if there are no other jobs in the job queue, the compute resource that ran the job is destroyed.
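The lifecycle can be summarized in a small Python model. This is a simplified sketch, not the service's actual scheduler: it uses the status names that the AWS Batch API reports, maps an exit code to the final state, and picks out the state where a job spent the most time. The timing figures are made up for illustration.

```python
# Status names as reported by the AWS Batch API, in the order a job
# normally passes through them before it finishes.
LIFECYCLE = ["SUBMITTED", "PENDING", "RUNNABLE", "STARTING", "RUNNING"]


def final_state(exit_code):
    """Exit code 0 means success; anything else is a failure."""
    return "SUCCEEDED" if exit_code == 0 else "FAILED"


def trace_job(exit_code):
    """Return the full sequence of states for a job with this exit code."""
    return LIFECYCLE + [final_state(exit_code)]


def longest_state(seconds_in_state):
    """Return the state in which the job spent the most time."""
    return max(seconds_in_state, key=seconds_in_state.get)


# Hypothetical timings (in seconds) for one job.
timings = {"SUBMITTED": 2, "PENDING": 0, "RUNNABLE": 340,
           "STARTING": 25, "RUNNING": 90}

print(trace_job(0)[-1])        # SUCCEEDED
print(trace_job(1)[-1])        # FAILED
print(longest_state(timings))  # RUNNABLE
```

In this made-up example the job waits longest in RUNNABLE, which would typically point at the compute environment (for example, a vCPU limit that is too low) rather than at the job itself.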