- Monitoring tool for your AWS resources and applications.
- Display metrics and create alarms that watch the metrics and send notifications or automatically make changes to the resources you are monitoring when a threshold is breached.
- CloudWatch does not aggregate data across regions. Therefore, metrics are completely separate between regions.
Namespaces – a container for CloudWatch metrics.
- There is no default namespace.
- The AWS namespaces use the following naming convention: AWS/service.
Metrics – represents a time-ordered set of data points that are published to CloudWatch.
- Metrics exist only in the region in which they are created.
- Cannot be deleted, but they automatically expire after 15 months if no new data is published to them.
- As new data points come in, data older than 15 months is dropped.
- Each metric data point must be marked with a timestamp. The timestamp can be up to two weeks in the past and up to two hours into the future. If you do not provide a timestamp, CloudWatch creates a timestamp for you based on the time the data point was received.
- By default, several services provide free metrics for resources. You can also enable detailed monitoring, or publish your own application metrics.
- Metric math enables you to query multiple CloudWatch metrics and use math expressions to create new time series based on these metrics.
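The timestamp rule above (up to two weeks in the past, up to two hours in the future, defaulting to the time of receipt) can be sketched as a small check. This is only an illustration of the rule as stated; `resolve_timestamp` and its parameters are hypothetical names, not part of any AWS SDK.

```python
from datetime import datetime, timedelta, timezone

# Window taken from the rule above; the constant names are illustrative.
MAX_PAST = timedelta(weeks=2)
MAX_FUTURE = timedelta(hours=2)


def resolve_timestamp(timestamp=None, received_at=None):
    """Return the timestamp to store, or raise ValueError if it is outside the window."""
    received_at = received_at or datetime.now(timezone.utc)
    if timestamp is None:
        # No timestamp supplied: fall back to the time the data point was received.
        return received_at
    if timestamp < received_at - MAX_PAST:
        raise ValueError("timestamp is more than two weeks in the past")
    if timestamp > received_at + MAX_FUTURE:
        raise ValueError("timestamp is more than two hours in the future")
    return timestamp


now = datetime(2024, 1, 15, 12, 0, tzinfo=timezone.utc)
print(resolve_timestamp(now - timedelta(days=3), received_at=now))
```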
Dimensions – a name/value pair that uniquely identifies a metric.
- You can assign up to 10 dimensions to a metric.
- Whenever you add a unique dimension to one of your metrics, you are creating a new variation of that metric.
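One way to picture why a new dimension creates a new variation of a metric is to treat the namespace, metric name and dimension set together as the metric's identity. The sketch below is a local model only; the `MyApp` namespace, the `Latency` metric and the helper names are made up for illustration.

```python
def metric_key(namespace, name, dimensions=None):
    """Build a hashable identity from namespace, metric name and dimensions."""
    return (namespace, name, frozenset((dimensions or {}).items()))


store = {}  # maps metric identity -> list of values


def put_data_point(namespace, name, value, dimensions=None):
    store.setdefault(metric_key(namespace, name, dimensions), []).append(value)


put_data_point("MyApp", "Latency", 120, {"Server": "web-1"})
put_data_point("MyApp", "Latency", 95, {"Server": "web-1"})   # same metric as above
put_data_point("MyApp", "Latency", 300, {"Server": "web-2"})  # different value -> new metric
put_data_point("MyApp", "Latency", 110, {"Server": "web-1", "Region": "eu"})  # extra dimension -> new metric

print(len(store))  # 3 distinct metrics
```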
Statistics – metric data aggregations over specified periods of time.
- Each statistic has a unit of measure. Metric data points that specify a unit of measure are aggregated separately.
- You can specify a unit when you create a custom metric. If you do not specify a unit, CloudWatch uses None as the unit.
- A period is the length of time associated with a specific CloudWatch statistic. The default value is 60 seconds.
- CloudWatch aggregates statistics according to the period length that you specify when retrieving statistics.
- For large datasets, you can insert a pre-aggregated dataset called a statistic set.
- Minimum – The lowest value observed during the specified period. You can use this value to determine low volumes of activity for your application.
- Maximum – The highest value observed during the specified period. You can use this value to determine high volumes of activity for your application.
- Sum – All values submitted for the matching metric added together. Useful for determining the total volume of a metric.
- Average – The value of Sum / SampleCount during the specified period. By comparing this statistic with the Minimum and Maximum, you can determine the full scope of a metric and how close the average use is to the Minimum and Maximum. This comparison helps you to know when to increase or decrease your resources as needed.
- SampleCount – The count (number) of data points used for the statistical calculation.
- pNN.NN – The value of the specified percentile. You can specify any percentile, using up to two decimal places (for example, p95.45). Percentile statistics are not available for metrics that include any negative values.
Percentiles – indicates the relative standing of a value in a dataset. Percentiles help you get a better understanding of the distribution of your metric data.
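To make the relationship between periods and statistics concrete, the following sketch groups raw data points into 60-second periods and builds a statistic set (SampleCount, Sum, Minimum, Maximum) for each, with Average derived as Sum / SampleCount. It is a local illustration, not how CloudWatch is implemented; `aggregate_by_period` and the sample data are assumptions.

```python
from collections import defaultdict


def aggregate_by_period(points, period=60):
    """Group (epoch_seconds, value) pairs by period and summarize each group."""
    buckets = defaultdict(list)
    for ts, value in points:
        # Align each timestamp to the start of its period.
        buckets[ts - ts % period].append(value)

    stats = {}
    for start, values in sorted(buckets.items()):
        stats[start] = {
            "SampleCount": len(values),
            "Sum": sum(values),
            "Minimum": min(values),
            "Maximum": max(values),
            "Average": sum(values) / len(values),  # Sum / SampleCount
        }
    return stats


points = [(0, 10.0), (20, 30.0), (59, 20.0), (60, 50.0), (100, 70.0)]
for start, summary in aggregate_by_period(points).items():
    print(start, summary)
```

Publishing the four summary values instead of every raw point is the idea behind a statistic set: the Average can still be recovered, but percentiles cannot, because the individual values are gone.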
Alarms – watches a single metric over a specified time period, and performs one or more specified actions, based on the value of the metric relative to a threshold over time.
- You can create an alarm for monitoring CPU usage and load balancer latency, for managing instances, and for billing alarms.
- When an alarm is on a dashboard, it turns red when it is in the ALARM state.
- Alarms invoke actions for sustained state changes only.
- Alarm States
  - OK – The metric or expression is within the defined threshold.
  - ALARM – The metric or expression is outside of the defined threshold.
  - INSUFFICIENT_DATA – The alarm has just started, the metric is not available, or not enough data is available for the metric to determine the alarm state.
- When you create an alarm, you specify three settings:
  - Period is the length of time to evaluate the metric or expression to create each individual data point for an alarm. It is expressed in seconds.
  - Evaluation Period is the number of the most recent periods, or data points, to evaluate when determining alarm state.
  - Datapoints to Alarm is the number of data points within the evaluation period that must be breaching to cause the alarm to go to the ALARM state. The breaching data points do not have to be consecutive; they only need to fall within the most recent data points counted by Evaluation Period.
- For each alarm, you can specify CloudWatch to treat missing data points as any of the following:
  - missing – the alarm does not consider missing data points when evaluating whether to change state (default)
  - notBreaching – missing data points are treated as being within the threshold
  - breaching – missing data points are treated as breaching the threshold
  - ignore – the current alarm state is maintained
- You can now create tags in CloudWatch alarms that let you define policy controls for your AWS resources. This enables you to create resource level policies for your alarms.
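The interaction between Evaluation Period, Datapoints to Alarm and the missing-data setting can be sketched as a small function. This is a deliberately simplified model, not CloudWatch's actual evaluation algorithm: it assumes a "greater than threshold" comparison, represents a missing data point as `None`, and only applies the `missing` and `ignore` settings when every point in the window is missing. The function name and parameters are illustrative.

```python
def evaluate_alarm(datapoints, threshold, evaluation_periods, datapoints_to_alarm,
                   treat_missing="missing", current_state="INSUFFICIENT_DATA"):
    """Return 'OK', 'ALARM' or 'INSUFFICIENT_DATA' for the most recent window."""
    window = datapoints[-evaluation_periods:]      # the last N data points
    present = [v for v in window if v is not None]
    missing_count = len(window) - len(present)

    if not present and treat_missing == "missing":
        return "INSUFFICIENT_DATA"                 # nothing to evaluate
    if not present and treat_missing == "ignore":
        return current_state                       # keep the current state

    breaching = sum(1 for v in present if v > threshold)
    if treat_missing == "breaching":
        breaching += missing_count                 # missing counts as breaching
    # With "notBreaching" (and partially missing windows under the other
    # settings), missing points simply do not add to the breaching count.
    return "ALARM" if breaching >= datapoints_to_alarm else "OK"


# 2 of the last 3 points breach the threshold of 80, so the alarm fires
# even though the breaching points are not consecutive.
print(evaluate_alarm([10, 90, 20, 95], threshold=80,
                     evaluation_periods=3, datapoints_to_alarm=2))
```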
CloudWatch Dashboards
- Customizable home pages in the CloudWatch console that you can use to monitor your resources in a single view, even those spread across different regions.
- There is no limit on the number of CloudWatch dashboards you can create.
- All dashboards are global, not region-specific.
- You can add, remove, resize, move, edit or rename a graph. You can also add metrics to a graph manually.
CloudWatch Events
- Delivers a near real-time stream of system events that describe changes in AWS resources.
- Events respond to these operational changes and take corrective action as necessary, by sending messages to respond to the environment, activating functions, making changes, and capturing state information.
- Events – indicates a change in your AWS environment.
- Targets – processes events.
- Rules – matches incoming events and routes them to targets for processing.
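The event, rule and target relationship can be sketched locally as follows. The matcher is a simplified model in which each pattern field lists its accepted values and nested objects are matched recursively; it is not the service's real matching engine. The sample event, the instance ID and the `notify` target are illustrative assumptions.

```python
def matches(pattern, event):
    """Return True if every field in the pattern accepts the event's value."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        if isinstance(expected, dict):
            # Nested pattern: descend into the matching nested object.
            if not isinstance(event[key], dict) or not matches(expected, event[key]):
                return False
        elif event[key] not in expected:
            # Leaf pattern: a list of accepted values.
            return False
    return True


def route(event, rules):
    """Send the event to every target of every matching rule; collect the results."""
    delivered = []
    for rule in rules:
        if matches(rule["pattern"], event):
            for target in rule["targets"]:
                delivered.append(target(event))
    return delivered


def notify(event):
    # Hypothetical target; a real target would be a function, topic, queue, etc.
    return f"notify: {event['detail']['instance-id']} is {event['detail']['state']}"


rules = [{
    "pattern": {"source": ["aws.ec2"], "detail": {"state": ["stopped", "terminated"]}},
    "targets": [notify],
}]

event = {
    "source": "aws.ec2",
    "detail-type": "EC2 Instance State-change Notification",
    "detail": {"instance-id": "i-0123456789abcdef0", "state": "stopped"},
}
print(route(event, rules))
```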
CloudWatch Logs
- Monitor logs from EC2 instances in real time
- Monitor CloudTrail logged events
- By default, logs are kept indefinitely and never expire
- Archive log data
- Log Route 53 DNS queries
- CloudWatch Logs Insights enables you to interactively search and analyze your log data in CloudWatch Logs using queries.
- CloudWatch vended logs are logs that AWS services publish natively on behalf of the customer. VPC Flow Logs is the first vended log type to benefit from the tiered pricing model for vended logs.
CloudWatch Agent
- Collect more logs and system-level metrics from EC2 instances and your on-premises servers.
- The agent must be installed on each instance or server that you want to collect from.
Authentication and Access Control
- Use IAM users or roles to authenticate who can access CloudWatch.
- Use Dashboard Permissions, IAM identity-based policies, and service-linked roles for managing access control.
- A permissions policy describes who has access to what.
  - Identity-Based Policies
  - Resource-Based Policies
- There are no CloudWatch Amazon Resource Names (ARNs) for you to use in an IAM policy. Use an * (asterisk) instead as the resource when writing a policy to control access to CloudWatch actions.
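As a sketch of the point about using an asterisk as the resource, the snippet below builds an identity-based policy document that allows a few read-only CloudWatch actions. The choice of actions and the function name are illustrative assumptions; adjust the action list to what your users actually need.

```python
import json


def read_only_cloudwatch_policy():
    """Return an IAM policy document allowing a few read-only CloudWatch actions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "cloudwatch:GetMetricData",
                    "cloudwatch:ListMetrics",
                    "cloudwatch:DescribeAlarms",
                ],
                # Asterisk as the resource, as described above.
                "Resource": "*",
            }
        ],
    }


print(json.dumps(read_only_cloudwatch_policy(), indent=2))
```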
Pricing
- You are charged for the number of metrics you have per month
- You are charged per 1000 metrics requested using CloudWatch API calls
- You are charged per dashboard per month
- You are charged per alarm metric (Standard Resolution and High Resolution)
- You are charged per GB of collected, archived and analyzed log data
- There is no Data Transfer IN charge, only Data Transfer Out.
- You are charged per million custom events and per million cross-account events
- Logs Insights is priced per query and charges based on the amount of ingested log data scanned by the query.
Limits
- Actions – 5/alarm. This limit cannot be changed.
- Alarms – 10/month/customer for free. 5000 per region per account.
- API requests – 1,000,000/month/customer for free.
- Dashboards – Up to 1000 dashboards per account. Up to 100 metrics per dashboard widget. Up to 500 metrics per dashboard, across all widgets. These limits cannot be changed.
- Dimensions – 10/metric. This limit cannot be changed.
- Amazon SNS email notifications – 1,000/month/customer for free.
Validate Your Knowledge
Which of the following best describes what CloudWatch is?
- A metric repository
- An audit service that records all API calls made to your AWS account
- A rules repository
- An automated security assessment service
There is a new compliance rule in your company that requires every Windows and Linux EC2 instance to be audited each month for performance issues. The company has more than a hundred EC2 instances running in production, and each must have a logging function that collects various system details about that instance. The SysOps team will periodically review these logs and analyze their contents using AWS Analytics tools, and the results must be retained in an S3 bucket.
In this scenario, what is the most efficient way to collect and analyze logs from the instances with minimal effort?
- Install the unified CloudWatch Logs agent in each instance which will automatically collect and push data to CloudWatch Logs. Analyze the log data with CloudWatch Logs Insights.
- Install AWS SDK in each instance and create a custom daemon script that would collect and push data to CloudWatch Logs periodically. Enable CloudWatch detailed monitoring and use CloudWatch Logs Insights to analyze the log data of all instances.
- Install the AWS Systems Manager Agent (SSM Agent) in each instance which will automatically collect and push data to CloudWatch Logs. Analyze the log data with CloudWatch Logs Insights.
- Install AWS Inspector Agent in each instance which will collect and push data to CloudWatch Logs periodically. Set up a CloudWatch dashboard to properly analyze the log data of all instances.