Auto scaling Java REST APIs using Amazon ECS with Fargate

Paul Minasian, Software Architect

This is the second in a series of blog posts about auto scaling Java REST APIs using native AWS cloud platform services. This post is about using Amazon ECS (Elastic Container Service) with the Fargate serverless compute engine to run dockerized Java REST APIs and scale them automatically. After a brief introduction to the main concepts of ECS with Fargate, you’ll be able to set up a sandbox environment in ECS, run the provided sample Java REST API in that environment, and continue to further explore the concepts and features of ECS. Using tools such as AWS CloudFormation or Terraform to set up the infrastructure as code is out of scope for this blog post.

In this blog post, the following AWS services are used:

  • Elastic Container Service (ECS) with Fargate
  • Elastic Container Registry (ECR)
  • Elastic Load Balancing (ELB)
  • CloudWatch

For auto scaling Java REST APIs using the EC2 Auto Scaling service, see the previous blog post. Some of the auto scaling and application load balancing concepts covered in that post also apply to ECS with Fargate, and thus will not be covered here.

ECS with Fargate Service

When using ECS, you need to choose between two types of compute engines:

  • EC2
  • Fargate

Fargate is a serverless compute engine, and thus there are no virtual server machines (EC2 instances) to manage when running your web applications and (micro) services.

In ECS, you'll need to create a cluster. A cluster contains one or more services. Each service references a task definition, and adds features such as auto scaling on top of that task definition. A service can also make use of an application load balancer, which routes the traffic to the multiple running tasks that are part of the service. A task definition is the template of a task: it contains one or more container definitions describing the docker images to run in docker containers when a task is started. Related container definitions can be grouped together in one task definition, and a service with auto scaling enabled can then run multiple tasks based on that task definition.

Within the ECS cluster, it is also possible to run tasks without creating a service first. However, features such as auto scaling cannot be applied to such standalone tasks.

See below for a class model that represents the relationships among the entities described above.

ECS class model
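
To make these relationships concrete, here is a minimal sketch using the AWS SDK for Java v2 (the SDK is not used elsewhere in this post, and the cluster name is an assumption) that walks the hierarchy: it lists the services in a cluster, the task definition each service references, and the tasks each service is currently running.

```java
// Hedged sketch: walking the ECS hierarchy (cluster -> services -> task definition/tasks)
// with the AWS SDK for Java v2. The cluster name below is an assumption.
import software.amazon.awssdk.services.ecs.EcsClient;
import software.amazon.awssdk.services.ecs.model.Service;

public class EcsHierarchy {
    public static void main(String[] args) {
        try (EcsClient ecs = EcsClient.create()) {
            String cluster = "isaacdeveloperblog"; // assumed cluster name

            // A cluster contains one or more services.
            for (String serviceArn : ecs.listServices(r -> r.cluster(cluster)).serviceArns()) {
                Service service = ecs.describeServices(r -> r.cluster(cluster).services(serviceArn))
                        .services().get(0);

                // Each service references exactly one task definition (the template for its tasks).
                System.out.println(service.serviceName() + " -> " + service.taskDefinition());

                // The service keeps a number of tasks running from that task definition.
                ecs.listTasks(r -> r.cluster(cluster).serviceName(service.serviceName()))
                        .taskArns()
                        .forEach(taskArn -> System.out.println("  task: " + taskArn));
            }
        }
    }
}
```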

ECS with Fargate sandbox environment

Setting up the sandbox environment

Components and Services

The sandbox environment consists of the following components and services:

Apollo Missions API

Apollo Missions API is a simple REST API written in Java using the Quarkus framework. This Java application runs in a docker container as part of a task launched by ECS. Later on, I will describe how to create a docker image and upload it to ECR so that it can be used by ECS.

The application consists of the following endpoints:

  • /missions/manned
  • /missions/manned/{missionId}
  • /longComputation
  • /health

The first two endpoints provide some basic data regarding the manned Apollo Missions. The /longComputation endpoint is used by Apache JMeter for load testing purposes: creating load on the ECS service will trigger an alarm in CloudWatch and cause a scaling policy to take a scale out action. The /health endpoint is used by the ELB load balancer to monitor the health of the Java application (whether it is running and accepting and successfully processing HTTP requests).
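
To give an idea of what such endpoints look like, here is a minimal, hypothetical sketch of a Quarkus (JAX-RS) resource; the actual Apollo Missions API code in the repository may be structured differently (for instance, the /health endpoint is often provided by a Quarkus health extension rather than a hand-written resource method).

```java
// Hypothetical sketch of a Quarkus (JAX-RS) resource exposing endpoints like those above.
// The real Apollo Missions API implementation may differ; newer Quarkus versions use the
// jakarta.ws.rs package instead of javax.ws.rs.
import java.util.List;
import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.PathParam;
import javax.ws.rs.Produces;
import javax.ws.rs.core.MediaType;

@Path("/")
public class ApolloMissionsResource {

    @GET
    @Path("missions/manned")
    @Produces(MediaType.APPLICATION_JSON)
    public List<String> mannedMissions() {
        return List.of("Apollo 7", "Apollo 8", "Apollo 11"); // sample data
    }

    @GET
    @Path("missions/manned/{missionId}")
    @Produces(MediaType.APPLICATION_JSON)
    public String mannedMission(@PathParam("missionId") String missionId) {
        return "Apollo " + missionId; // sample data
    }

    // CPU-intensive endpoint used by JMeter to drive up CPU utilization during load tests.
    @GET
    @Path("longComputation")
    @Produces(MediaType.TEXT_PLAIN)
    public String longComputation() {
        double result = 0;
        for (int i = 1; i <= 50_000_000; i++) {
            result += Math.sqrt(i);
        }
        return String.valueOf(result);
    }

    // Simple liveness endpoint polled by the ELB target group health checks.
    @GET
    @Path("health")
    @Produces(MediaType.TEXT_PLAIN)
    public String health() {
        return "UP";
    }
}
```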

Elastic Container Service (ECS)

A cluster is created which will run the sandbox components. Furthermore, a task definition is created from which tasks can be created to run the REST API microservice in a docker container. Finally, a service in ECS is created to run the tasks and enable the auto scaling of the REST API microservice.

Elastic Container Registry (ECR)

A repository is created to which a docker image for the REST API microservice is uploaded. This docker image will be referenced in the task definition of ECS.

Elastic Load Balancing (ELB)

A target group is created with health checks. The /health endpoint of the Java application is used by the health checks. The target group allows the created application load balancer to route the HTTP traffic to the ECS tasks launched by the ECS Auto Scaling service.

CloudWatch

You can use this service to see which alarms went off, and to view ECS and ELB statistics such as the CPU utilization and the request count sum.

Setup

The setup in AWS is done using the AWS Management Console. The values mentioned here regarding Availability Zones (AZs) are for the Europe (Frankfurt) eu-central-1 region. You can of course use a different region, but then you will need to provide the values for the default public subnets and AZs of your chosen region.

Docker Image in ECR

Create a docker image of the REST API microservice. Within the Apollo Missions API code repository, look for the section ‘Run Quarkus in JVM mode in a docker container’.

Now, create a repository in ECR (e.g. isaacdeveloperblog) and upload version 1.0.0 of the REST API image to the ECR repository. Once uploaded, click on the image and copy the Image URI; you will need this URI for your task definition in ECS.
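
If you prefer to script the repository creation, here is a hedged sketch using the AWS SDK for Java v2; the repository name is an assumption matching the example Image URI used later in this post, and building and pushing the docker image itself is still done with the docker CLI as described in the Apollo Missions API repository.

```java
// Hedged sketch: creating the ECR repository with the AWS SDK for Java v2.
// The repository name is an assumption matching the example Image URI used in this post.
import software.amazon.awssdk.services.ecr.EcrClient;
import software.amazon.awssdk.services.ecr.model.CreateRepositoryResponse;

public class CreateEcrRepository {
    public static void main(String[] args) {
        try (EcrClient ecr = EcrClient.create()) {
            CreateRepositoryResponse response =
                    ecr.createRepository(r -> r.repositoryName("isaacdeveloperblog/apollo-missions-api"));

            // The repository URI plus the image tag (e.g. :1.0.0) forms the Image URI
            // that the ECS task definition will reference.
            System.out.println(response.repository().repositoryUri());
        }
    }
}
```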

ECS Cluster

Navigate to ECS and click on Clusters. In the Clusters section, click on the Create Cluster button and choose the ‘Networking only’ cluster template, which is powered by AWS Fargate.

Provide the following values:

  • Cluster name: isaacdeveloperblog
  • Create VPC: tick the box ‘Create a new VPC for this cluster’. Leave the default values for CIDR block, Subnet 1 and Subnet 2 as is.


Now click on the Create button and a new ECS cluster will be created for the sandbox environment. It may take a couple of minutes before all resources within the cluster are created.
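
For reference, the cluster creation itself boils down to a single API call; a minimal sketch with the AWS SDK for Java v2 is shown below. Note that the console wizard additionally creates the VPC and subnets, which are not covered by this call.

```java
// Hedged sketch: the API call behind the 'Networking only' cluster creation.
// The VPC and subnets created by the console wizard are not handled here.
import software.amazon.awssdk.services.ecs.EcsClient;

public class CreateCluster {
    public static void main(String[] args) {
        try (EcsClient ecs = EcsClient.create()) {
            ecs.createCluster(r -> r.clusterName("isaacdeveloperblog"));
        }
    }
}
```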

Task Definition

Go to the Task Definitions section, and click on the Create new Task Definition button. For the launch type select Fargate.

Provide the following values, and create the task definition.

  • Task Definition Name: apollo-missions-api
  • Task Role: None (default)
  • Network Mode: awsvpc (default)
  • Task execution role: ecsTaskExecutionRole (default)
  • Task memory (GB): 2GB
  • Task CPU (vCPU): 1 vCPU
  • Container Definitions - Container Name: apollo-missions-api
  • Container Definitions - Image: paste your uploaded Image URI here, for example 984818921620.dkr.ecr.eu-central-1.amazonaws.com/isaacdeveloperblog/apollo-missions-api:1.0.0
  • Container Definitions - Port mapping: Container port 8080, Protocol tcp
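
For completeness, here is a hedged sketch of the equivalent registerTaskDefinition call with the AWS SDK for Java v2; the image URI is the example from the list above, and the execution role ARN is a placeholder for your own account.

```java
// Hedged sketch: registering the Fargate task definition via the AWS SDK for Java v2.
// The image URI is the example from this post and the role ARN is a placeholder.
import software.amazon.awssdk.services.ecs.EcsClient;
import software.amazon.awssdk.services.ecs.model.Compatibility;
import software.amazon.awssdk.services.ecs.model.ContainerDefinition;
import software.amazon.awssdk.services.ecs.model.NetworkMode;
import software.amazon.awssdk.services.ecs.model.PortMapping;
import software.amazon.awssdk.services.ecs.model.TransportProtocol;

public class RegisterTaskDefinition {
    public static void main(String[] args) {
        try (EcsClient ecs = EcsClient.create()) {
            ecs.registerTaskDefinition(r -> r
                    .family("apollo-missions-api")
                    .networkMode(NetworkMode.AWSVPC)
                    .requiresCompatibilities(Compatibility.FARGATE)
                    .cpu("1024")    // 1 vCPU
                    .memory("2048") // 2 GB
                    .executionRoleArn("arn:aws:iam::<account-id>:role/ecsTaskExecutionRole") // placeholder
                    .containerDefinitions(ContainerDefinition.builder()
                            .name("apollo-missions-api")
                            .image("984818921620.dkr.ecr.eu-central-1.amazonaws.com/isaacdeveloperblog/apollo-missions-api:1.0.0")
                            .portMappings(PortMapping.builder()
                                    .containerPort(8080)
                                    .protocol(TransportProtocol.TCP)
                                    .build())
                            .build()));
        }
    }
}
```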

Elastic Load Balancing

Navigate to the EC2 service and click on the Load Balancers section. Click on the Create Load Balancer button, and select the Application Load Balancer.

Now create an ELB application load balancer with the values below, which either need to be specified or differ from the default values.

Step 1: Configure Load Balancer

  • Name: apollo-missions-api-lb
  • Listeners - Load Balancer Protocol: HTTP: 80
  • VPC: select the VPC which belongs to the ECS cluster.
  • Availability Zones: select all available AZs (eu-central-1a and eu-central-1b).

Step 2: Configure Security Settings

Step 3: Configure Security Groups

Create a new security group

  • Security group name: apollo-missions-api-lb-sg
  • Type: HTTP
  • Protocol: TCP
  • Port Range: 80
  • Source: Anywhere

Step 4: Configure Routing

Create a new target group:

  • Name: apollo-missions-api-tg
  • Target type: IP
  • Protocol: HTTP
  • Port: 8080

Health checks

  • Protocol: HTTP
  • Path: /health
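
As with the other steps, here is a hedged sketch of the equivalent AWS SDK for Java v2 calls for the load balancer, the target group (target type IP with a health check on /health) and the HTTP listener; the subnet, security group and VPC identifiers are placeholders for the ones belonging to the cluster's VPC.

```java
// Hedged sketch: creating the application load balancer, target group and listener with the
// AWS SDK for Java v2. Subnet, security group and VPC identifiers are placeholders.
import software.amazon.awssdk.services.elasticloadbalancingv2.ElasticLoadBalancingV2Client;
import software.amazon.awssdk.services.elasticloadbalancingv2.model.Action;
import software.amazon.awssdk.services.elasticloadbalancingv2.model.ActionTypeEnum;
import software.amazon.awssdk.services.elasticloadbalancingv2.model.ProtocolEnum;
import software.amazon.awssdk.services.elasticloadbalancingv2.model.TargetTypeEnum;

public class CreateApplicationLoadBalancer {
    public static void main(String[] args) {
        try (ElasticLoadBalancingV2Client elb = ElasticLoadBalancingV2Client.create()) {
            // Application load balancer in the public subnets of the cluster's VPC.
            String lbArn = elb.createLoadBalancer(r -> r
                            .name("apollo-missions-api-lb")
                            .subnets("subnet-11111111", "subnet-22222222") // placeholders
                            .securityGroups("sg-lb-placeholder"))          // apollo-missions-api-lb-sg
                    .loadBalancers().get(0).loadBalancerArn();

            // Target group with target type IP (required for Fargate tasks) and /health checks.
            String tgArn = elb.createTargetGroup(r -> r
                            .name("apollo-missions-api-tg")
                            .targetType(TargetTypeEnum.IP)
                            .protocol(ProtocolEnum.HTTP)
                            .port(8080)
                            .vpcId("vpc-placeholder")
                            .healthCheckProtocol(ProtocolEnum.HTTP)
                            .healthCheckPath("/health"))
                    .targetGroups().get(0).targetGroupArn();

            // HTTP listener on port 80 forwarding to the target group.
            elb.createListener(r -> r
                    .loadBalancerArn(lbArn)
                    .protocol(ProtocolEnum.HTTP)
                    .port(80)
                    .defaultActions(Action.builder()
                            .type(ActionTypeEnum.FORWARD)
                            .targetGroupArn(tgArn)
                            .build()));
        }
    }
}
```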

Service

Go to the Services tab within the newly created ECS cluster, and click on the Create button.

Provide the following values, and create the service.

Step 1: Configure service

  • Launch type: Fargate
  • Task Definition: Family apollo-missions-api, Revision 1 (latest)
  • Platform version: LATEST (default)
  • Cluster: isaacdeveloperblog
  • Service name: apollo-missions-api
  • Service type: REPLICA (default)
  • Number of tasks: 1
  • Minimum healthy percent: 100 (default)
  • Maximum percent: 200 (default)
  • Deployment type: Rolling update (default)

Step 2: Configure network

  • Cluster VPC: select the VPC which belongs to the ECS cluster.
  • Subnets: select all available subnets of the VPC.
  • Security groups: create a new security group with the following settings:
      • Security group name: apollo-missions-api-ecs-sg
      • Type: Custom TCP
      • Protocol: TCP
      • Port Range: 8080
      • Source: Anywhere
  • Auto-assign public IP: ENABLED (default)
  • Health check grace period: 30
  • Load balancing: Application Load Balancer (Load balancer name: apollo-missions-api-lb)
  • Container to load balance: apollo-missions-api:8080:8080 (Container name : port). Click on the Add to load balancer button, and provide:
      • Production listener port: 80:HTTP
      • Target group name: apollo-missions-api-tg

Step 3: Set Auto Scaling

  • Service Auto Scaling: choose ‘Configure Service Auto Scaling to adjust your service’s desired count’.
  • Minimum number of tasks: 1
  • Desired number of tasks: 1
  • Maximum number of tasks: 10
  • IAM role for Service Auto Scaling: ecsAutoscaleRole (default)
  • Scaling policy type: Target tracking
  • Policy name: CPU-Utilization
  • ECS service metric: ECSServiceAverageCPUUtilization
  • Target value: 50
  • Scale-out cooldown period: 60 seconds
  • Scale-in cooldown period: 60 seconds

It may take some time before the service is created. When the service is created, a task will be started by the service.
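
For reference, the two pieces configured above map onto the following hedged AWS SDK for Java v2 sketch: the ECS service attached to the target group, and a target tracking policy registered with Application Auto Scaling against the service's desired count. The target group ARN, subnets and security group are placeholders.

```java
// Hedged sketch: creating the ECS service and its target tracking auto scaling policy
// with the AWS SDK for Java v2. ARNs, subnets and security group ids are placeholders.
import software.amazon.awssdk.services.applicationautoscaling.ApplicationAutoScalingClient;
import software.amazon.awssdk.services.applicationautoscaling.model.MetricType;
import software.amazon.awssdk.services.applicationautoscaling.model.PolicyType;
import software.amazon.awssdk.services.applicationautoscaling.model.PredefinedMetricSpecification;
import software.amazon.awssdk.services.applicationautoscaling.model.ScalableDimension;
import software.amazon.awssdk.services.applicationautoscaling.model.ServiceNamespace;
import software.amazon.awssdk.services.applicationautoscaling.model.TargetTrackingScalingPolicyConfiguration;
import software.amazon.awssdk.services.ecs.EcsClient;
import software.amazon.awssdk.services.ecs.model.AssignPublicIp;
import software.amazon.awssdk.services.ecs.model.LaunchType;
import software.amazon.awssdk.services.ecs.model.LoadBalancer;

public class CreateServiceWithAutoScaling {
    public static void main(String[] args) {
        try (EcsClient ecs = EcsClient.create();
             ApplicationAutoScalingClient scaling = ApplicationAutoScalingClient.create()) {

            // The service: 1 task from the apollo-missions-api task definition, registered
            // with the target group so the application load balancer can route traffic to it.
            ecs.createService(r -> r
                    .cluster("isaacdeveloperblog")
                    .serviceName("apollo-missions-api")
                    .taskDefinition("apollo-missions-api:1")
                    .desiredCount(1)
                    .launchType(LaunchType.FARGATE)
                    .healthCheckGracePeriodSeconds(30)
                    .loadBalancers(LoadBalancer.builder()
                            .targetGroupArn("arn-of-apollo-missions-api-tg") // placeholder
                            .containerName("apollo-missions-api")
                            .containerPort(8080)
                            .build())
                    .networkConfiguration(n -> n.awsvpcConfiguration(v -> v
                            .subnets("subnet-11111111", "subnet-22222222")   // placeholders
                            .securityGroups("sg-ecs-placeholder")            // apollo-missions-api-ecs-sg
                            .assignPublicIp(AssignPublicIp.ENABLED))));

            // Service Auto Scaling: register the service's desired count as a scalable target
            // (min 1, max 10) and attach a target tracking policy on average CPU utilization.
            String resourceId = "service/isaacdeveloperblog/apollo-missions-api";
            scaling.registerScalableTarget(r -> r
                    .serviceNamespace(ServiceNamespace.ECS)
                    .resourceId(resourceId)
                    .scalableDimension(ScalableDimension.ECS_SERVICE_DESIRED_COUNT)
                    .minCapacity(1)
                    .maxCapacity(10));
            scaling.putScalingPolicy(r -> r
                    .policyName("CPU-Utilization")
                    .policyType(PolicyType.TARGET_TRACKING_SCALING)
                    .serviceNamespace(ServiceNamespace.ECS)
                    .resourceId(resourceId)
                    .scalableDimension(ScalableDimension.ECS_SERVICE_DESIRED_COUNT)
                    .targetTrackingScalingPolicyConfiguration(TargetTrackingScalingPolicyConfiguration.builder()
                            .targetValue(50.0)
                            .predefinedMetricSpecification(PredefinedMetricSpecification.builder()
                                    .predefinedMetricType(MetricType.ECS_SERVICE_AVERAGE_CPU_UTILIZATION)
                                    .build())
                            .scaleOutCooldown(60)
                            .scaleInCooldown(60)
                            .build()));
        }
    }
}
```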

Running the sandbox environment

If you’ve successfully set up the sandbox environment, you should be able to see the first ECS task launched by the ECS Auto Scaling service to meet the desired capacity of 1.

Running the sandbox step 1

Wait until the last status becomes RUNNING.

Upon launching the ECS task, the ECS Auto Scaling service has registered the task with the Target Group so that the application load balancer can route the traffic to the task.

Running the sandbox step 2

You can access the REST API of that task directly, or via the application load balancer.

  • ECS Task: http://<task network interface public IPv4 DNS>:8080/missions/manned
  • ELB: http://<ELB load balancer’s DNS name>/missions/manned

Now click on the apollo-missions-api service in ECS and go to the Events tab. There you can see that the service has started only one task and has reached a steady state.

Running the sandbox step 3

Let's update the ECS service and manually set both the minimum number of tasks and the desired number of tasks to 2. Once done, verify in the Events tab that a new task has been started.

Running the sandbox step 4
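
If you'd rather script this change, a hedged sketch of the equivalent calls with the AWS SDK for Java v2 is shown below: updating the service's desired count, and re-registering the scalable target with a new minimum.

```java
// Hedged sketch: setting the desired count and the minimum number of tasks to 2.
import software.amazon.awssdk.services.applicationautoscaling.ApplicationAutoScalingClient;
import software.amazon.awssdk.services.applicationautoscaling.model.ScalableDimension;
import software.amazon.awssdk.services.applicationautoscaling.model.ServiceNamespace;
import software.amazon.awssdk.services.ecs.EcsClient;

public class ScaleServiceToTwoTasks {
    public static void main(String[] args) {
        try (EcsClient ecs = EcsClient.create();
             ApplicationAutoScalingClient scaling = ApplicationAutoScalingClient.create()) {

            // Desired count of the ECS service.
            ecs.updateService(r -> r
                    .cluster("isaacdeveloperblog")
                    .service("apollo-missions-api")
                    .desiredCount(2));

            // Minimum capacity of the scalable target (maximum stays at 10).
            scaling.registerScalableTarget(r -> r
                    .serviceNamespace(ServiceNamespace.ECS)
                    .resourceId("service/isaacdeveloperblog/apollo-missions-api")
                    .scalableDimension(ScalableDimension.ECS_SERVICE_DESIRED_COUNT)
                    .minCapacity(2)
                    .maxCapacity(10));
        }
    }
}
```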

To trigger a scale out action by the ECS Auto Scaling service, you will need to put some extra load on the ECS service. You can use the JMeter project provided as part of the Apollo Missions API source code; the project file is located in the resources/jmeter folder and was created with Apache JMeter version 5.2.1.

Once the project is opened in JMeter, provide the ELB load balancer’s DNS name in the ‘Server Name or IP’ field of the /longComputation request.

Running the sandbox step 5

Hit the play button to send HTTP requests to the ELB load balancer. Ten concurrent users will continuously send HTTP requests for a period of 20 minutes.

Click on the Summary Report and watch the statistics.

Running the sandbox step 6

In CloudWatch you can see that the two alarms created by the ECS Auto Scaling service are in the OK state.

Running the sandbox step 7

After some time, the alarm for the condition ‘CPUUtilization > 50 for 3 datapoints within 3 minutes’ will go off, and this event will trigger a scale out action in ECS.

Running the sandbox step 8

In the ECS service’s Events tab, we can see new log entries mentioning the scale out action.

Running the sandbox step 9

If we now look in the Tasks tab, we can see that two additional tasks have been launched.

Running the sandbox step 10

The number of ECS tasks increases gradually over time to meet the desired capacity that is dynamically adjusted by the ECS Auto Scaling service. Once we stop the JMeter tests, the CPU utilization will drop drastically, and multiple scale in actions will take place to decrease the number of running ECS tasks until only the minimum number of tasks is left running.