AWS Auto Scaling Cheat Sheet

Last updated on August 30, 2023

Configure automatic scaling for your AWS resources quickly through a scaling plan that uses dynamic scaling and predictive scaling.
Optimize for availability, for cost, or a balance of both.
Scaling in means decreasing the size of a group while scaling out means increasing the size of a group.
Useful for
- Cyclical traffic such as high use of resources during regular business hours and low use of resources overnight
- On and off traffic patterns, such as batch processing, testing, or periodic analysis
- Variable traffic patterns, such as software for marketing campaigns with periods of spiky growth
It is a region-specific service.

Features

Launch or terminate EC2 instances in an Auto Scaling group.
Launch or terminate instances from an EC2 Spot Fleet request, or automatically replace instances that get interrupted for price or capacity reasons.
Adjust the ECS service desired count up or down in response to load variations.
Enable a DynamoDB table or a global secondary index to increase or decrease its provisioned read and write capacity to handle increases in traffic without throttling.
Dynamically adjust the number of Aurora read replicas provisioned for an Aurora DB cluster to handle changes in active connections or workload.
Use Dynamic Scaling to add and remove capacity for resources to maintain resource utilization at the specified target value.

Use Predictive Scaling to forecast your future load demands by analyzing your historical records for a metric. It also allows you to schedule scaling actions that proactively add and remove resource capacity to reflect the load forecast, and control maximum capacity behavior. Only available for EC2 Auto Scaling groups.
AWS Auto Scaling scans your environment and automatically discovers the scalable cloud resources underlying your application, so you don’t have to manually identify these resources one by one through individual service interfaces.
You can suspend and resume any of your AWS Application Auto Scaling actions.
A warm pool allows you to decrease latency for applications that have exceptionally long boot times. This will help avoid over-provisioning your Auto Scaling groups in order to manage latency and improve application performance.

Amazon EC2 Auto Scaling

Ensures that you have the correct number of EC2 instances available to handle your application load by using Auto Scaling groups.
An Auto Scaling group contains a collection of EC2 instances that share similar characteristics and are treated as a logical grouping for the purposes of instance scaling and management.
You specify the minimum, maximum and desired number of instances in each Auto Scaling group.
Key Components

Groups	Your EC2 instances are organized into groups so that they are treated as a logical unit for scaling and management. When you create a group, you can specify its minimum, maximum, and desired number of EC2 instances.
Configuration templates	Your group uses a launch template as a template for its EC2 instances. When you create a launch template, you can specify information such as the AMI ID, instance type, key pair, security groups, and block device mapping for your instances.
Scaling options	How to scale your Auto Scaling groups.

Auto Scaling Lifecycle

You can add a lifecycle hook to your Auto Scaling group to perform custom actions when instances launch or terminate.
- Applies to instances launched or terminated
- Maximum instance lifetime
- Instance refresh
- Capacity rebalancing
- Warm pools
Scaling Options
- Scale to maintain current instance levels at all times
- Manual Scaling
- Scale based on a schedule
- Scale based on demand
- Use predictive scaling
Scaling Policy Types
- Target tracking scaling—Increase or decrease the current capacity of the group based on a target value for a specific metric.
- Step scaling—Increase or decrease the current capacity of the group based on a set of scaling adjustments, known as step adjustments, that vary based on the size of the alarm breach.
- Simple scaling—Increase or decrease the current capacity of the group based on a single scaling adjustment.
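As a rough illustration of how step scaling and target tracking differ, the sketch below models both in plain Python. The function names, the step bounds, and the proportional estimate are illustrative assumptions, not the exact algorithm the service runs.

```python
import math

# Hypothetical step adjustments: (lower_bound, upper_bound, change_in_capacity).
# Bounds are offsets from the alarm threshold; None means "no upper bound".
STEP_ADJUSTMENTS = [
    (0, 10, 1),     # breach of 0 up to 10 above the threshold: add 1 instance
    (10, 20, 2),    # breach of 10 up to 20: add 2 instances
    (20, None, 4),  # breach of 20 or more: add 4 instances
]


def step_adjustment(metric_value, threshold, steps=STEP_ADJUSTMENTS):
    """Return the capacity change of the step that contains the breach size."""
    breach = metric_value - threshold
    for lower, upper, change in steps:
        if breach >= lower and (upper is None or breach < upper):
            return change
    return 0  # metric is below the threshold, so no adjustment applies


def target_tracking_estimate(current_capacity, metric_value, target_value):
    """Estimate the capacity that brings an average metric back to its target.

    Assumes the per-instance average falls proportionally as instances are added.
    """
    return max(1, math.ceil(current_capacity * metric_value / target_value))
```

With a 70% CPU alarm threshold, a reading of 85% falls in the second step and adds two instances, while a target tracking policy at 50% with four instances averaging 75% would aim for roughly six.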
The size of your Auto Scaling group is bounded by its capacity limits: the desired capacity can be adjusted only between the minimum and maximum size.
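The effect of the capacity limits on a scaling policy can be sketched as a simple clamp. The helper name below is hypothetical and only models the bounding behavior described above.

```python
def clamp_desired_capacity(requested, min_size, max_size):
    """Keep a requested capacity within the group's minimum and maximum size."""
    if min_size > max_size:
        raise ValueError("min_size must not exceed max_size")
    return max(min_size, min(requested, max_size))
```

For a group with a minimum of 2 and a maximum of 10, a policy asking for 12 instances ends up with 10, and one asking for 0 ends up with 2.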
The cooldown period is a configurable setting that helps ensure your Auto Scaling group does not launch or terminate additional instances before previous scaling activities take effect.
- EC2 Auto Scaling supports cooldown periods when using simple scaling policies, but not when using target tracking policies, step scaling policies, or scheduled scaling.
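A minimal sketch of the cooldown check for simple scaling follows. The function name is hypothetical; the 300-second default mirrors the default cooldown of EC2 Auto Scaling, and the logic is a simplified model rather than the service's implementation.

```python
from datetime import datetime, timedelta


def can_start_simple_scaling(last_activity_end, now, cooldown_seconds=300):
    """Return True once the cooldown after the last scaling activity has elapsed."""
    if last_activity_end is None:
        return True  # no earlier activity, so nothing to wait for
    return now - last_activity_end >= timedelta(seconds=cooldown_seconds)
```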
You can set a default instance warmup to improve the accuracy of the CloudWatch metrics used for dynamic scaling. It lets your EC2 instances finish warming up before their usage data is counted.
Dynamic scaling can better react to the demand curve of your application if you utilize a target tracking scaling policy based on a custom Amazon SQS queue metric.
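One common form of such a custom metric is the backlog per instance. The sketch below shows the arithmetic only; the function names are hypothetical, and publishing the value to CloudWatch is left out.

```python
def backlog_per_instance(visible_messages, in_service_instances):
    """Messages waiting in the queue divided by the instances processing them."""
    if in_service_instances <= 0:
        raise ValueError("at least one in-service instance is required")
    return visible_messages / in_service_instances


def acceptable_backlog_per_instance(max_latency_seconds, avg_processing_seconds):
    """Target value: how many queued messages one instance can absorb in time."""
    return max_latency_seconds / avg_processing_seconds
```

For example, 1,500 visible messages across 10 instances give a backlog of 150 per instance. If each message takes 0.1 seconds to process and the longest acceptable latency is 10 seconds, the target is 100, so the group would need to scale out.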
Amazon EC2 Auto Scaling marks an instance as unhealthy if the instance is in a state other than running, the system status is impaired, or Elastic Load Balancing reports that the instance failed the health checks.
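The three conditions above can be expressed as a single predicate. This is a simplified model with hypothetical parameter names, not the service's actual health evaluation.

```python
def is_unhealthy(instance_state, system_status_impaired=False,
                 elb_health_check_failed=False):
    """Return True if any of the unhealthy conditions described above holds."""
    return (
        instance_state != "running"
        or system_status_impaired
        or elb_health_check_failed
    )
```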
Termination of Instances
- When you configure automatic scale in, you must decide which instances should terminate first and set up a termination policy. You can also use instance protection to prevent specific instances from being terminated during automatic scale in.
- Default Termination Policy

Custom Termination Policies
- OldestInstance – Terminate the oldest instance in the group.
- NewestInstance – Terminate the newest instance in the group.
- OldestLaunchConfiguration – Terminate instances that have the oldest launch configuration.
- ClosestToNextInstanceHour – Terminate instances that are closest to the next billing hour.
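The selection logic behind three of these policies can be sketched as follows. The instance records, the function name, and the billing-hour arithmetic are illustrative assumptions; OldestLaunchConfiguration is omitted because it needs launch configuration metadata.

```python
from datetime import datetime


def pick_instance_to_terminate(instances, policy, now=None):
    """Pick an instance ID from (instance_id, launch_time) pairs for a policy."""
    if not instances:
        raise ValueError("no instances to choose from")
    if policy == "OldestInstance":
        return min(instances, key=lambda inst: inst[1])[0]
    if policy == "NewestInstance":
        return max(instances, key=lambda inst: inst[1])[0]
    if policy == "ClosestToNextInstanceHour":
        if now is None:
            raise ValueError("now is required for ClosestToNextInstanceHour")

        def seconds_to_next_hour(inst):
            # Time left until the instance completes its current running hour.
            elapsed = (now - inst[1]).total_seconds()
            return 3600 - (elapsed % 3600)

        return min(instances, key=seconds_to_next_hour)[0]
    raise ValueError(f"unsupported policy: {policy}")
```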
An instance can be temporarily removed from an Auto Scaling group by changing its state from InService into Standby.
You can create launch templates that specify instance configuration information for launching EC2 instances. Launch templates also allow you to keep multiple versions of a template.
A launch configuration is an instance configuration template that an Auto Scaling group uses to launch EC2 instances, and you specify information for the instances.
- You can use the same launch configuration with multiple Auto Scaling groups.
- You can only specify one launch configuration for an Auto Scaling group at a time, and you can’t modify a launch configuration after you’ve created it.
- When you create a VPC, by default its tenancy attribute is set to default. You can launch instances with a tenancy value of dedicated so that they run as single-tenancy instances. Otherwise, they run as shared-tenancy instances by default.
- If you set the tenancy attribute of a VPC to dedicated, all instances launched in the VPC run as single-tenancy instances.
- When you create a launch configuration, the default value for the instance placement tenancy is null and the instance tenancy is controlled by the tenancy attribute of the VPC.

Launch Configuration Tenancy	VPC Tenancy = default	VPC Tenancy = dedicated
not specified	shared-tenancy instance	Dedicated Instance
default	shared-tenancy instance	Dedicated Instance
dedicated	Dedicated Instance	Dedicated Instance
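The table above reduces to one rule: an instance is dedicated if either the launch configuration or the VPC asks for dedicated tenancy. A small sketch, with a hypothetical function name, makes this explicit.

```python
def effective_tenancy(launch_config_tenancy, vpc_tenancy):
    """Resolve instance tenancy from launch configuration and VPC settings.

    launch_config_tenancy is None (not specified), "default", or "dedicated".
    """
    if launch_config_tenancy not in (None, "default", "dedicated"):
        raise ValueError("unknown launch configuration tenancy")
    if vpc_tenancy not in ("default", "dedicated"):
        raise ValueError("unknown VPC tenancy")
    if launch_config_tenancy == "dedicated" or vpc_tenancy == "dedicated":
        return "Dedicated Instance"
    return "shared-tenancy instance"
```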

- If you are launching the instances in your Auto Scaling group in EC2-Classic, you can link them to a VPC using ClassicLink.

Application Auto Scaling

- Allows you to configure automatic scaling for the following resources:
  - Amazon ECS services
  - Spot Fleet requests
  - Amazon EMR clusters
  - ElastiCache for Redis clusters
  - Amazon Neptune clusters
  - AppStream 2.0 fleets
  - Amazon Comprehend
  - DynamoDB tables and global secondary indexes
  - Amazon Keyspaces tables
  - Aurora replicas
  - Amazon SageMaker endpoint variants
  - Lambda function provisioned concurrency
  - Amazon Managed Streaming for Apache Kafka
  - Custom resources provided by your own applications or services.
- Features
  - Target tracking scaling—Scale a resource based on a target value for a specific CloudWatch metric.
  - Step scaling— Scale a resource based on a set of scaling adjustments that vary based on the size of the alarm breach.
  - Scheduled scaling—Scale a resource based on the date and time. The timezone can either be in UTC or in your local timezone.
- Target tracking scaling
  - You can have multiple target tracking scaling policies for a scalable target, provided that each of them uses a different metric.
  - You can also optionally disable the scale-in portion of a target tracking scaling policy.
- Step scaling
  - Increase or decrease the current capacity of a scalable target based on a set of scaling adjustments, known as step adjustments, that vary based on the size of the alarm breach.
- Scheduled scaling
  - Scale your application in response to predictable load changes by creating scheduled actions, which tell Application Auto Scaling to perform scaling activities at specific times.
- The scale out cooldown period is the amount of time, in seconds, after a scale out activity completes before another scale out activity can start.
- The scale in cooldown period is the amount of time, in seconds, after a scale in activity completes before another scale in activity can start.
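To show where the cooldown periods and the optional scale-in switch fit, the sketch below builds the parameters of a target tracking policy for an ECS service. The helper name, policy name, and default values are assumptions; the dictionary is shaped so it could be passed to the `put_scaling_policy` call of the boto3 `application-autoscaling` client, but the code itself does not contact AWS.

```python
def ecs_target_tracking_policy(cluster, service, target_cpu_percent,
                               scale_out_cooldown=60, scale_in_cooldown=300,
                               disable_scale_in=False):
    """Build request parameters for a CPU-based target tracking policy."""
    return {
        "PolicyName": f"{service}-cpu-target-tracking",  # hypothetical name
        "ServiceNamespace": "ecs",
        "ResourceId": f"service/{cluster}/{service}",
        "ScalableDimension": "ecs:service:DesiredCount",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            "TargetValue": float(target_cpu_percent),
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ECSServiceAverageCPUUtilization",
            },
            "ScaleOutCooldown": scale_out_cooldown,  # seconds
            "ScaleInCooldown": scale_in_cooldown,    # seconds
            "DisableScaleIn": disable_scale_in,
        },
    }
```

A shorter scale-out cooldown than scale-in cooldown, as assumed here, lets capacity grow quickly under load while shrinking more cautiously.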
You can attach one or more classic ELBs to your existing Auto Scaling Groups. The ELBs must be in the same region.
Auto Scaling rebalances by first launching new EC2 instances in the AZs that have fewer instances; only then does it start terminating instances in the AZs that have more instances.

Monitoring

- Health checks – identifies any instances that are unhealthy
  - Amazon EC2 status checks (default)
  - Elastic Load Balancing health checks
  - Custom health checks.
- Auto Scaling does not perform health checks on instances in the Standby state. The Standby state can be used for performing updates, changes, or troubleshooting without health checks being performed or replacement instances being launched.
- CloudWatch metrics – enables you to retrieve statistics about Auto Scaling-published data points as an ordered set of time-series data, known as metrics. You can use these metrics to verify that your system is performing as expected.
- CloudWatch Events – Auto Scaling can submit events to CloudWatch Events when your Auto Scaling groups launch or terminate instances, or when a lifecycle action occurs.
- SNS notifications – Auto Scaling can send Amazon SNS notifications when your Auto Scaling groups launch or terminate instances.
- CloudTrail logs – enables you to keep track of the calls made to the Auto Scaling API by or on behalf of your AWS account, and stores the information in log files in an S3 bucket that you specify.

AWS Auto Scaling Security

- Use IAM to help secure your resources by controlling who can perform AWS Auto Scaling actions.
- By default, a brand new IAM user has NO permissions to do anything. To grant permissions to call Auto Scaling actions, you attach an IAM policy to the IAM users or groups that require the permissions it grants.
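As an illustration of such a policy, the sketch below assembles an IAM policy document that allows two Auto Scaling actions. The helper name and the choice of actions and resource are assumptions for the example; a real policy should be scoped to what the user actually needs.

```python
import json


def autoscaling_policy_document(actions, resource="*"):
    """Build an IAM policy document that allows the given actions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": list(actions),
                "Resource": resource,
            }
        ],
    }


policy = autoscaling_policy_document(
    ["autoscaling:DescribeAutoScalingGroups", "autoscaling:SetDesiredCapacity"]
)
policy_json = json.dumps(policy, indent=2)  # JSON text to attach to a user or group
```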

Validate Your Knowledge

Question 1

A large Philippine-based Business Process Outsourcing company is building a two-tier web application in their VPC to serve dynamic transaction-based content. The data tier is leveraging an Online Transactional Processing (OLTP) database but for the web tier, they are still deciding what service they will use.

What AWS services should you leverage to build an elastic and scalable web tier?

1. Elastic Load Balancing, Amazon EC2, and Auto Scaling
2. Elastic Load Balancing, Amazon RDS with Multi-AZ, and Amazon S3
3. Amazon RDS with Multi-AZ and Auto Scaling
4. Amazon EC2, Amazon DynamoDB, and Amazon S3

Correct Answer: 1

Amazon RDS is a suitable database service for online transaction processing (OLTP) applications. However, the question asks for a list of AWS services for the web tier and not the database tier. Also, when it comes to services providing scalability and elasticity for your web tier, Auto Scaling and Elastic Load Balancing should immediately come to mind. Therefore, the correct answer is: Elastic Load Balancing, Amazon EC2, and Auto Scaling.

To build an elastic and highly available web tier, you can use Amazon EC2, Auto Scaling, and Elastic Load Balancing. You can deploy your web servers on a fleet of EC2 instances in an Auto Scaling group, which monitors your applications and automatically adjusts capacity to maintain steady, predictable performance at the lowest possible cost. Load balancing is an effective way to increase the availability of a system. Instances that fail can be replaced seamlessly behind the load balancer while other instances continue to operate. Elastic Load Balancing can be used to balance across instances in multiple Availability Zones of a region.

The rest of the options are incorrect since they don’t mention all of the required services in building a highly available and scalable web tier, such as EC2, Auto Scaling, and Elastic Load Balancer. Although Amazon RDS with Multi-AZ and DynamoDB are highly scalable databases, the scenario is more focused on building its web tier and not the database tier.

References:
https://media.amazonwebservices.com/architecturecenter/AWS_ac_ra_ftha_04.pdf

Note: This question was extracted from our AWS Certified Solutions Architect Associate Practice Exams.

Question 2

A tech company has a CRM application hosted on an Auto Scaling group of On-Demand EC2 instances with different instance types and sizes. The application is extensively used during office hours from 9 in the morning to 5 in the afternoon. Their users are complaining that the performance of the application is slow during the start of the day but then works normally after a couple of hours.

Which of the following is the MOST operationally efficient solution to implement to ensure the application works properly at the beginning of the day?

1. Configure a Dynamic scaling policy for the Auto Scaling group to launch new instances based on the CPU utilization.
2. Configure a Dynamic scaling policy for the Auto Scaling group to launch new instances based on the Memory utilization.
3. Configure a Scheduled scaling policy for the Auto Scaling group to launch new instances before the start of the day.
4. Configure a Predictive scaling policy for the Auto Scaling group to automatically adjust the number of Amazon EC2 instances.

Correct Answer: 3

Scaling based on a schedule allows you to scale your application in response to predictable load changes. For example, every week the traffic to your web application starts to increase on Wednesday, remains high on Thursday, and starts to decrease on Friday. You can plan your scaling activities based on the predictable traffic patterns of your web application.

To configure your Auto Scaling group to scale based on a schedule, you create a scheduled action. The scheduled action tells Amazon EC2 Auto Scaling to perform a scaling action at specified times. To create a scheduled scaling action, you specify the start time when the scaling action should take effect and the new minimum, maximum, and desired sizes for the scaling action. At the specified time, Amazon EC2 Auto Scaling updates the group with the values for minimum, maximum, and desired size specified by the scaling action. You can create scheduled actions for scaling one time only or for scaling on a recurring schedule.
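A recurring scheduled action of this kind can be sketched as a set of request parameters. The helper name, group name, action name, time zone, and sizes below are hypothetical; the dictionary is shaped so it could be passed to the `put_scheduled_update_group_action` call of the boto3 `autoscaling` client, but the code does not contact AWS.

```python
def scheduled_action_params(group_name, action_name, recurrence,
                            min_size, max_size, desired_capacity,
                            time_zone="UTC"):
    """Build request parameters for a recurring scheduled scaling action."""
    if not (min_size <= desired_capacity <= max_size):
        raise ValueError("desired capacity must lie between min and max size")
    return {
        "AutoScalingGroupName": group_name,
        "ScheduledActionName": action_name,
        "Recurrence": recurrence,  # cron expression: minute hour day month weekday
        "TimeZone": time_zone,
        "MinSize": min_size,
        "MaxSize": max_size,
        "DesiredCapacity": desired_capacity,
    }


# Scale out at 08:30 on weekdays, ahead of a 9:00 start of business.
morning_scale_out = scheduled_action_params(
    "crm-asg", "scale-out-before-office-hours", "30 8 * * 1-5",
    min_size=4, max_size=12, desired_capacity=8, time_zone="Asia/Manila",
)
```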

Hence, configuring a Scheduled scaling policy for the Auto Scaling group to launch new instances before the start of the day is the correct answer. This ensures that the instances are already launched and ready before the start of the day, which is when the application is used the most.

The following options are both incorrect. Although these are valid solutions, it is still better to configure a Scheduled scaling policy as you already know the exact peak hours of your application. By the time either the CPU or Memory hits a peak, the application already has performance issues, so you need to ensure the scaling is done beforehand using a Scheduled scaling policy:

- Configure a Dynamic scaling policy for the Auto Scaling group to launch new instances based on the CPU utilization
- Configure a Dynamic scaling policy for the Auto Scaling group to launch new instances based on the Memory utilization

The option that says: Configure a Predictive scaling policy for the Auto Scaling group to automatically adjust the number of Amazon EC2 instances is incorrect. Although this type of scaling policy can be used in this scenario, it is not the most operationally efficient option. Take note that the scenario mentioned that the Auto Scaling group consists of Amazon EC2 instances with different instance types and sizes. Predictive scaling assumes that your Auto Scaling group is homogeneous, which means that all EC2 instances are of equal capacity. The forecasted capacity can be inaccurate if you are using a variety of EC2 instance sizes and types in your Auto Scaling group.

References:

https://docs.aws.amazon.com/autoscaling/ec2/userguide/schedule_time.html
https://docs.aws.amazon.com/autoscaling/ec2/userguide/ec2-auto-scaling-scheduled-scaling.html
https://docs.aws.amazon.com/autoscaling/ec2/userguide/ec2-auto-scaling-predictive-scaling.html#predictive-scaling-limitations

Note: This question was extracted from our AWS Certified Solutions Architect Associate Practice Exams.

AWS Auto Scaling Cheat Sheet References:

https://docs.aws.amazon.com/autoscaling/plans/userguide/what-is-aws-auto-scaling.html
https://aws.amazon.com/autoscaling/features/
https://docs.aws.amazon.com/autoscaling/ec2/userguide/what-is-amazon-ec2-auto-scaling.html
https://docs.aws.amazon.com/autoscaling/application/userguide/what-is-application-auto-scaling.html
https://aws.amazon.com/autoscaling/pricing/
https://aws.amazon.com/autoscaling/faqs/

Written by: Jon Bonso

Jon Bonso is the co-founder of Tutorials Dojo, an EdTech startup and an AWS Digital Training Partner that provides high-quality educational materials in the cloud computing space. He graduated from Mapúa Institute of Technology in 2007 with a bachelor's degree in Information Technology. Jon holds 10 AWS Certifications and has been an active AWS Community Builder since 2020.
