AWS Auto Scaling Group Deployment Strategy
This content is translated by Notion AI. Original content is written by a Korean.
Overview
This article covers EC2 deployment strategies utilizing AutoScalingGroup (ASG). It requires some background on the key features of ASG and the deployment pipeline.
- A summary of AutoScalingGroup key features: https://malwareanalysis.tistory.com/870
- EC2 deployment pipeline using AutoScalingGroup: https://malwareanalysis.tistory.com/871
Lab Structure
The lab materials, including RollingUpdate and Canary, are available on my GitHub.
- RollingUpdate - https://github.com/choisungwook/portfolio/tree/master/aws/auto_scaling_group/examples/02_rollingupdate
- Canary - https://github.com/choisungwook/portfolio/tree/master/aws/auto_scaling_group/examples/02_canary
- Blue/green (using Goployer open source) - https://github.com/choisungwook/portfolio/tree/master/aws/auto_scaling_group/examples/02_canary
The EC2 instance created by ASG runs nginx, which outputs the current version. For example, if the launch template version is v1, the nginx response will also be v1.
The ASG is connected to an ALB. When you call the ALB, it receives the nginx response running on the EC2 instance.
In each scenario, a successful deployment means that calling the ALB returns the nginx v2 response, which is the updated version.
The ASG runs a total of 4 EC2 instances, so the ASG Desired capacity is 4.
ASG Deployment Strategy - RollingUpdate
What is the RollingUpdate deployment strategy?
The RollingUpdate deployment strategy incrementally creates new versions of instances while sequentially shutting down existing instances.
For example, suppose you have the following ASG settings:
- Desired capacity: 4
- MinHealthyPercentage: 50
- MaxHealthyPercentage: 100
When the ASG Instance Refresh runs, it calculates that at least 2 instances (50% of the total of 4) must stay healthy, so it replaces 2 instances at a time. Therefore, all 4 instances are replaced in two steps.
This method of incrementally selecting and deploying some instances, as illustrated in the example above, is called RollingUpdate. The advantage of RollingUpdate is that it is simple to implement and minimizes the scope for issues because the instances are replaced in stages.
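The arithmetic in the example above can be sketched in a few lines. This is a simplified model, not the exact algorithm AWS uses: it assumes MaxHealthyPercentage is 100, and the rounding rule and the function name `rolling_batches` are assumptions made for illustration.

```python
import math


def rolling_batches(desired_capacity: int, min_healthy_pct: int) -> tuple[int, int]:
    """Return (batch_size, number_of_batches) for a simplified rolling update.

    Assumption: MaxHealthyPercentage is 100, so the group never exceeds
    desired_capacity and old instances are terminated before new ones launch.
    """
    # Instances that must stay in service during the refresh (rounding is assumed).
    min_healthy = math.ceil(desired_capacity * min_healthy_pct / 100)
    # Instances that may be out of service at once; at least 1 so the refresh can progress.
    batch_size = max(1, desired_capacity - min_healthy)
    batches = math.ceil(desired_capacity / batch_size)
    return batch_size, batches


print(rolling_batches(4, 50))  # (2, 2)
```

With the settings from the example (Desired capacity 4, MinHealthyPercentage 50), the model gives batches of 2 instances and 2 steps, matching the description above.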
ASG Deployment Strategy - RollingUpdate Lab
1. First, modify the launch template. We modified nginx's index.html file to update the response version information.
When the launch template is modified, its version is automatically updated.
2. Run an instance refresh.
Specify a Desired Configuration so that the rollback API can be called later, and set the preferences so that 2 instances are replaced at a time.
aws autoscaling start-instance-refresh \
  --auto-scaling-group-name example2-rollingupdate \
  --desired-configuration '{
    "LaunchTemplate": {
      "LaunchTemplateId": "lt-04674a4e772d0e5cf",
      "Version": "2"
    }
  }' \
  --preferences '{
    "MinHealthyPercentage": 50,
    "MaxHealthyPercentage": 100
  }'
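For readers who script deployments in Python, the same request can be built as plain data first. The helper name `build_refresh_request` is hypothetical. The dictionary keys follow the parameter names of boto3's `start_instance_refresh` as I understand them, so check them against the boto3 documentation before relying on this sketch.

```python
import json


def build_refresh_request(asg_name, launch_template_id, version,
                          min_healthy=50, max_healthy=100):
    """Build keyword arguments shaped like a start_instance_refresh call."""
    return {
        "AutoScalingGroupName": asg_name,
        "Strategy": "Rolling",
        # A desired configuration is what makes a later rollback possible.
        "DesiredConfiguration": {
            "LaunchTemplate": {
                "LaunchTemplateId": launch_template_id,
                "Version": str(version),
            }
        },
        "Preferences": {
            "MinHealthyPercentage": min_healthy,
            "MaxHealthyPercentage": max_healthy,
        },
    }


request = build_refresh_request("example2-rollingupdate", "lt-04674a4e772d0e5cf", 2)
print(json.dumps(request, indent=2))
# With boto3 installed and credentials configured, this could be sent as:
#   boto3.client("autoscaling").start_instance_refresh(**request)
```

Keeping the request as a dictionary makes it easy to review or log before anything is actually sent to AWS.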
After you run an instance refresh, you can check the progress in the AWS console.
You can check the status of your EC2 instances on the Instance management tab. In this example, the rolling update replaces 2 instances at a time, so 2 instances are shut down and 2 new instances are created.
You can also check the progress in the EC2 Instances dashboard.
Enhancing the ASG RollingUpdate example setup
In the current example setup, when replacing an instance, the existing instance was shut down first, and then a new instance was created.
This meant that we temporarily lost two instances during the instance refresh process. If there were a sudden surge of traffic at this point, it could overload the remaining instances and cause a service failure. How can we configure the system to create new instances first and then shut down the existing ones?
The solution is to increase the MaxHealthyPercentage value. It is currently set to 100, which caps the group at four instances. Changing it to MaxHealthyPercentage=150 allows up to 6 instances to run simultaneously, so new instances are created first and existing instances are shut down afterwards.
Increasing MaxHealthyPercentage allows you to safely deploy instances in production. However, the downside is that the number of instances increases during deployment, which increases costs, so you should set MaxHealthyPercentage appropriately for your team's situation.
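The capacity ceiling, and therefore the extra cost during a deployment, can be estimated with a small calculation. This is a rough sketch: the function name `peak_capacity` is hypothetical, the rounding is an assumption, and the 100 to 200 range check reflects my understanding of the API limits rather than something stated in this article.

```python
import math


def peak_capacity(desired_capacity: int, max_healthy_pct: int) -> int:
    """Estimate the maximum number of instances running during a refresh."""
    # Assumed API limit: MaxHealthyPercentage must be between 100 and 200.
    if not 100 <= max_healthy_pct <= 200:
        raise ValueError("max_healthy_pct is expected to be between 100 and 200")
    # Rounding down is an assumption; AWS may round differently.
    return math.floor(desired_capacity * max_healthy_pct / 100)


for pct in (100, 150):
    peak = peak_capacity(4, pct)
    print(f"MaxHealthyPercentage={pct}: up to {peak} instances ({peak - 4} extra)")
```

For the lab's Desired capacity of 4, the sketch reports a ceiling of 4 instances at 100 and 6 instances at 150, which is the trade-off described above.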
ASG Deployment Strategy - Canary
What is the Canary strategy?
A Canary deployment incrementally replaces EC2 instances, similar to RollingUpdate, but with a pause in the deployment process. During this pause, the newly deployed features are monitored for issues. If monitoring shows no problems, the deployment continues, and if problems are found, a rollback is run.
Setting up Canary deployments in ASG
ASG provides a checkpoint feature for your Canary deployment strategy.
When the percentage of replaced instances reaches a value set in CheckpointPercentages, the instance refresh pauses. After waiting for the CheckpointDelay time, the instance refresh resumes.
- CheckpointPercentages: Pause points for validation (percentage of instances replaced)
- CheckpointDelay: Pause time in seconds
aws autoscaling start-instance-refresh \
  --auto-scaling-group-name {ASG_NAME} \
  --preferences '{
    "CheckpointPercentages": [25, 50, 100],
    "CheckpointDelay": 300
  }'
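To see what these checkpoints mean for the lab's 4 instances, the percentages can be converted into instance counts. This is a minimal sketch: the name `checkpoint_schedule` is hypothetical, and rounding up is an assumption that only matters when the percentage does not divide the capacity evenly.

```python
import math


def checkpoint_schedule(desired_capacity, percentages):
    """Return (percentage, cumulative instances replaced) for each checkpoint."""
    return [(p, math.ceil(desired_capacity * p / 100)) for p in percentages]


for pct, replaced in checkpoint_schedule(4, [25, 50, 100]):
    print(f"checkpoint {pct}%: {replaced} of 4 instances replaced")
```

With [25, 50, 100] and 4 instances, the checkpoints fall after 1, 2, and 4 replaced instances.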
When using the Canary strategy with ASG, it is recommended that you also specify a Desired Configuration so that you can call the rollback API. Because Canary deployments take longer than other strategies, you need to be ready to roll back immediately if you discover an issue.
aws autoscaling start-instance-refresh \
  --auto-scaling-group-name example2-canary-asg \
  --desired-configuration '{
    "LaunchTemplate": {
      "LaunchTemplateId": "lt-0e984424fe3ba9f46",
      "Version": "2"
    }
  }' \
  --preferences '{
    "MinHealthyPercentage": 50,
    "MaxHealthyPercentage": 100,
    "CheckpointPercentages": [20, 50, 80],
    "CheckpointDelay": 300
  }'
A word of caution
If the last value in CheckpointPercentages is not 100, not all instances are replaced.
For example, the command below replaces only 25% of all instances, leaving the remaining 75% intact. This strategy of replacing only some instances is called a partial refresh in ASG.
aws autoscaling start-instance-refresh \
  --auto-scaling-group-name {ASG_NAME} \
  --preferences '{
    "CheckpointPercentages": [10, 25],
    "CheckpointDelay": 300
  }'
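A small pre-flight check can catch an unintended partial refresh before the command is run. This is a sketch only: the function name `describe_refresh` is hypothetical, and the ascending-order check reflects my understanding of the API's requirement for CheckpointPercentages.

```python
def describe_refresh(percentages):
    """Classify a CheckpointPercentages list as a full or partial refresh."""
    if not percentages:
        raise ValueError("at least one checkpoint percentage is required")
    # Checkpoints are assumed to be listed in ascending order.
    if list(percentages) != sorted(percentages):
        raise ValueError("checkpoint percentages must be in ascending order")
    last = percentages[-1]
    return "full" if last == 100 else f"partial ({last}%)"


print(describe_refresh([10, 25]))       # partial (25%)
print(describe_refresh([25, 50, 100]))  # full
```

Running a check like this in a deployment script makes the difference between a deliberate partial refresh and a forgotten 100 explicit.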
To deploy the remaining instances after a partial refresh, use the SkipMatching feature. This feature skips instances that already match the desired configuration, so only the instances that still run the old version are replaced.
aws autoscaling start-instance-refresh \
  --auto-scaling-group-name {ASG_NAME} \
  --preferences '{
    "CheckpointPercentages": [10, 30, 100],
    "CheckpointDelay": 300,
    "SkipMatching": true
  }'
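The effect of SkipMatching after a partial refresh can be pictured with a toy model. The instance IDs and the function name `instances_to_replace` are hypothetical. The real feature compares instances against the desired configuration as a whole, so comparing only the launch template version is a simplification.

```python
def instances_to_replace(instance_versions, target_version, skip_matching):
    """Return the instance IDs a refresh would replace in this toy model."""
    if not skip_matching:
        # Without SkipMatching, every instance is a replacement candidate.
        return sorted(instance_versions)
    # With SkipMatching, instances already on the target version are left alone.
    return sorted(
        instance_id
        for instance_id, version in instance_versions.items()
        if version != target_version
    )


# Hypothetical state after a partial refresh replaced 1 of 4 instances.
versions = {"i-aaa": "2", "i-bbb": "1", "i-ccc": "1", "i-ddd": "1"}
print(instances_to_replace(versions, "2", skip_matching=True))
print(instances_to_replace(versions, "2", skip_matching=False))
```

In this model the instance that already runs version 2 is skipped when SkipMatching is on, and only the three remaining instances are replaced.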
Hands-on
As in the RollingUpdate strategy lab, we modified the launch template and ran an instance refresh. In the Canary deployment strategy, CheckpointPercentages and CheckpointDelay are added to the instance refresh arguments.
aws autoscaling start-instance-refresh \
  --auto-scaling-group-name {ASG_NAME} \
  --desired-configuration '{
    "LaunchTemplate": {
      "LaunchTemplateId": "lt-",
      "Version": "2"
    }
  }' \
  --preferences '{
    "MinHealthyPercentage": 50,
    "MaxHealthyPercentage": 100,
    "CheckpointPercentages": [50, 100],
    "CheckpointDelay": 300
  }'
After setting the checkpoints, the instance refresh screen in the AWS console shows Checkpoints=Enabled.
As of 2025, there is no way to check how much time remains after a checkpoint is reached, so the best we can do is listen for instance refresh events with EventBridge and send notifications with Lambda.
Reference: https://docs.aws.amazon.com/autoscaling/ec2/userguide/asg-adding-checkpoints-instance-refresh.html
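As a rough illustration of the EventBridge and Lambda idea, a handler could estimate when the refresh will resume and put that into the notification. The event field names below (`CheckpointPercentage`, `CheckpointDelay`, and the detail-type string) are written from my recollection of the checkpoint event format and should be verified against the EventBridge documentation. The sample event values are hypothetical, and actually sending the message (for example to a chat tool) is left out.

```python
from datetime import datetime, timedelta


def handler(event, context=None):
    """Build a notification message for a checkpoint-reached event."""
    detail = event["detail"]
    # EventBridge timestamps end with "Z"; convert it for fromisoformat.
    reached_at = datetime.fromisoformat(event["time"].replace("Z", "+00:00"))
    # The delay may arrive as a string, so convert it defensively.
    delay = timedelta(seconds=int(detail["CheckpointDelay"]))
    resume_at = reached_at + delay
    return (
        f"{detail['AutoScalingGroupName']}: checkpoint "
        f"{detail['CheckpointPercentage']}% reached, refresh expected to "
        f"resume around {resume_at.isoformat()}"
    )


# Hypothetical sample event for local testing.
sample_event = {
    "detail-type": "EC2 Auto Scaling Instance Refresh Checkpoint Reached",
    "source": "aws.autoscaling",
    "time": "2025-01-01T00:00:00Z",
    "detail": {
        "InstanceRefreshId": "hypothetical-refresh-id",
        "AutoScalingGroupName": "example2-canary-asg",
        "CheckpointPercentage": "50",
        "CheckpointDelay": "300",
    },
}
print(handler(sample_event))
```

The estimate treats the event time as the start of the pause, so it is only an approximation of when the refresh actually resumes.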
ASG Deployment Strategy - Blue/Green
Blue/Green is a strategy that runs old and new EC2 instances simultaneously and then switches traffic to the new version all at once. It is typically used by API applications that require traffic routing. After traffic has fully switched to the new version, you keep the old version's instances for a while in case a rollback is needed. If all goes well, you delete the old version's instances.
Unfortunately, as of 2025, ASG does not support the Blue/Green deployment strategy as a built-in feature. It currently only offers RollingUpdate, and if you need Canary, you can use checkpoints.
Although ASG has no built-in Blue/Green feature, you can use ASG to implement Blue/Green. You need two ASGs. You can deploy in a way similar to Blue/Green by creating an ASG that manages the newer EC2 instances, connecting it to an ELB (such as an ALB), and then removing the older ASG. However, the instances are not attached to the ELB target group at exactly the same time, so traffic does not switch over 100% in a single moment.