Profile Applicability:
• Level 2
Description:
Monitoring EC2 instances and EBS volumes using Amazon CloudWatch enables real-time tracking of resource health, performance, and usage. CloudWatch can collect and track metrics, collect log files, and set alarms to notify administrators about critical issues such as high CPU usage or low disk space. By enabling CloudWatch monitoring for EC2 and EBS, you gain the visibility needed to proactively manage resource performance and optimize costs.
Rationale:
Enabling CloudWatch monitoring for EC2 instances and EBS volumes provides key performance metrics and operational insights, such as CPU utilization, disk I/O, and network activity. Memory and disk-space utilization are not published by default; they become available only when the CloudWatch agent is installed on the instance. These metrics allow administrators to detect issues early, scale resources appropriately, and ensure application reliability. CloudWatch alarms further enhance visibility by sending notifications when thresholds are exceeded, ensuring a timely response to operational concerns.
Impact:
Pros:
Provides real-time monitoring of EC2 instances and EBS volumes
Allows administrators to receive alerts for potential issues (e.g., high CPU usage or low available disk space)
Helps optimize resource usage by identifying underutilized or overutilized instances
Enhances incident response by notifying administrators of critical thresholds
Supports cost management by tracking resource utilization
Cons:
Additional costs associated with monitoring metrics and setting up CloudWatch alarms
May result in increased noise if alarms are not configured with proper thresholds
Requires periodic review of alarms and metrics to ensure they are still relevant and helpful
Default Value:
By default, AWS provides basic EC2 monitoring (every 5 minutes) at no additional cost. Detailed monitoring (1-minute granularity) and additional CloudWatch alarms are not enabled by default and must be configured manually.
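To see which instances are still on the default, the monitoring state can be listed for every instance in a region. The following is a minimal sketch, assuming the AWS CLI is installed and configured for the target account and region; the function names are illustrative and not part of this control.

```bash
#!/usr/bin/env bash
# Print "<instance-id> <monitoring-state>" for every instance in the region.
# Monitoring.State is "enabled" for detailed (1-minute) monitoring and
# "disabled" for basic (5-minute) monitoring.
list_monitoring_state() {
  aws ec2 describe-instances \
    --query 'Reservations[].Instances[].[InstanceId,Monitoring.State]' \
    --output text
}

# Keep only the instances that are still on basic monitoring.
list_basic_monitoring_instances() {
  list_monitoring_state | awk '$2 == "disabled" { print $1 }'
}

# Usage (requires AWS credentials):
#   list_basic_monitoring_instances
```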
Pre-requisites:
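IAM permissions to manage CloudWatch, EC2, and EBS monitoring (e.g., cloudwatch:PutMetricAlarm, cloudwatch:DescribeAlarms, cloudwatch:DeleteAlarms, cloudwatch:GetMetricStatistics, ec2:DescribeInstances, ec2:MonitorInstances, ec2:UnmonitorInstances)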
Understanding of critical metrics to monitor (e.g., CPU usage, disk space, memory, I/O operations)
Agreed alarm thresholds and a notification target for alarm actions (e.g., an SNS topic with at least one confirmed subscriber)
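If no notification target exists yet, an SNS topic can be created before the alarms are defined. This is a hedged sketch rather than a prescribed procedure: the function name, topic name, and email address are hypothetical, and the recipient still has to confirm the subscription from the email that SNS sends.

```bash
#!/usr/bin/env bash
# Create (or reuse) an SNS topic, subscribe an email address, print the topic ARN.
# create-topic returns the existing ARN when a topic with that name already exists.
create_alarm_topic() {
  local topic_name="$1" email="$2" topic_arn
  topic_arn="$(aws sns create-topic --name "$topic_name" \
    --query 'TopicArn' --output text)" || return 1
  aws sns subscribe --topic-arn "$topic_arn" \
    --protocol email --notification-endpoint "$email" >/dev/null || return 1
  printf '%s\n' "$topic_arn"
}

# Usage (hypothetical values):
#   TOPIC_ARN="$(create_alarm_topic ops-alerts ops@example.com)"
```

The printed ARN is the value that the --alarm-actions option of put-metric-alarm expects.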
Remediation:
Test Plan:
Using AWS Console:
Sign in to the AWS Management Console
Navigate to CloudWatch > Alarms
Ensure that CloudWatch alarms are set up for EC2 and EBS metrics, such as:
CPU utilization thresholds (e.g., 80% for high CPU usage)
Disk I/O operations, or disk space usage where the CloudWatch agent publishes it
Network activity or errors
Verify that detailed monitoring is enabled on the EC2 instances that need it (under the Monitoring tab of the EC2 instance details)
Review existing CloudWatch Logs log groups to confirm that key log data from EC2 instances is being captured and analyzed
Using AWS CLI:
List the CloudWatch alarms in the account:
aws cloudwatch describe-alarms
Describe specific EC2 instance metrics:
aws cloudwatch get-metric-statistics --namespace AWS/EC2 --metric-name CPUUtilization --start-time <start-time> --end-time <end-time> --period 300 --statistics Average --dimensions Name=InstanceId,Value=<instance-id>
Verify that detailed monitoring is enabled for EC2 instances (the returned State is "enabled" for detailed monitoring and "disabled" for basic monitoring):
aws ec2 describe-instances --instance-ids <instance-id> --query "Reservations[].Instances[].Monitoring"
Check EBS metrics for volume usage and disk I/O:
aws cloudwatch get-metric-statistics --namespace AWS/EBS --metric-name VolumeReadOps --start-time <start-time> --end-time <end-time> --period 300 --statistics Sum --dimensions Name=VolumeId,Value=<volume-id>
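The checks above can be combined into a coverage audit that reports running instances with no CPU alarm. This is a minimal sketch under stated assumptions: the function name is hypothetical, only alarms defined directly on the AWS/EC2 CPUUtilization metric with an InstanceId dimension are counted, and alarms built from metric math are not matched by the query.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print running instance IDs that no CPUUtilization alarm covers.
instances_without_cpu_alarm() {
  local alarmed id
  # Instance IDs named in the InstanceId dimension of existing CPU alarms.
  alarmed="$(aws cloudwatch describe-alarms \
    --query "MetricAlarms[?Namespace=='AWS/EC2' && MetricName=='CPUUtilization'].Dimensions[] | [?Name=='InstanceId'].Value" \
    --output text | tr -s '[:space:]' '\n')"

  aws ec2 describe-instances \
    --filters Name=instance-state-name,Values=running \
    --query 'Reservations[].Instances[].InstanceId' \
    --output text | tr -s '[:space:]' '\n' |
  while read -r id; do
    [ -n "$id" ] || continue
    # Print the ID only when no alarm lists it as a dimension value.
    printf '%s\n' "$alarmed" | grep -qxF -- "$id" || printf '%s\n' "$id"
  done
}

# Usage (requires AWS credentials):
#   instances_without_cpu_alarm
```

An empty result suggests that every running instance in the region has at least one CPU alarm; it does not show that the thresholds are appropriate.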
Implementation Plan:
Using AWS Console:
Sign in to the AWS Management Console
Navigate to CloudWatch > Alarms
Click Create Alarm
Click Select metric and choose the EC2 or EBS namespace as the metric source
For EC2, choose metrics like CPUUtilization, DiskReadOps/DiskWriteOps (reported for instance store volumes only), or NetworkIn/NetworkOut
For EBS, choose metrics like VolumeReadOps, VolumeWriteOps, or VolumeIdleTime
Set thresholds for the metric (e.g., set an alarm for when CPU utilization exceeds 80% for 5 minutes)
Configure actions for the alarm (e.g., send notifications via SNS, trigger Lambda functions, etc.)
Review the alarm settings and click Create Alarm
Using AWS CLI:
Create an alarm for high CPU utilization on an EC2 instance:
aws cloudwatch put-metric-alarm \
  --alarm-name HighCPUUtilization \
  --metric-name CPUUtilization \
  --namespace AWS/EC2 \
  --statistic Average \
  --period 300 \
  --threshold 80 \
  --comparison-operator GreaterThanThreshold \
  --dimensions Name=InstanceId,Value=<instance-id> \
  --evaluation-periods 1 \
  --alarm-actions arn:aws:sns:<region>:<account-id>:<sns-topic> \
  --ok-actions arn:aws:sns:<region>:<account-id>:<sns-topic>
Enable detailed monitoring for an EC2 instance:
aws ec2 monitor-instances --instance-ids <instance-id>
Set up an alarm for an EBS volume (e.g., for high read operations):
aws cloudwatch put-metric-alarm \
  --alarm-name HighEBSReadOps \
  --metric-name VolumeReadOps \
  --namespace AWS/EBS \
  --statistic Sum \
  --period 300 \
  --threshold 1000 \
  --comparison-operator GreaterThanThreshold \
  --dimensions Name=VolumeId,Value=<volume-id> \
  --evaluation-periods 1 \
  --alarm-actions arn:aws:sns:<region>:<account-id>:<sns-topic>
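When many instances need the same alarm, the CPU alarm command can be wrapped in a loop. The sketch below is illustrative only: the function names and the per-instance alarm naming scheme are assumptions, and the "2 of 3 datapoints" setting is one possible way to reduce noisy alerts, not a value this control mandates.

```bash
#!/usr/bin/env bash
# Create one CPU alarm for a single instance.
# Fires when 2 of the last 3 five-minute averages exceed 80% (illustrative values).
create_cpu_alarm() {
  local instance_id="$1" topic_arn="$2"
  aws cloudwatch put-metric-alarm \
    --alarm-name "HighCPUUtilization-${instance_id}" \
    --namespace AWS/EC2 \
    --metric-name CPUUtilization \
    --dimensions "Name=InstanceId,Value=${instance_id}" \
    --statistic Average \
    --period 300 \
    --evaluation-periods 3 \
    --datapoints-to-alarm 2 \
    --threshold 80 \
    --comparison-operator GreaterThanThreshold \
    --alarm-actions "$topic_arn" </dev/null
}

# Apply the alarm to every running instance in the region.
create_cpu_alarms_for_running_instances() {
  local topic_arn="$1" id
  aws ec2 describe-instances \
    --filters Name=instance-state-name,Values=running \
    --query 'Reservations[].Instances[].InstanceId' \
    --output text | tr -s '[:space:]' '\n' |
  while read -r id; do
    [ -n "$id" ] || continue
    create_cpu_alarm "$id" "$topic_arn"
  done
}

# Usage (placeholders as in the commands above):
#   create_cpu_alarms_for_running_instances arn:aws:sns:<region>:<account-id>:<sns-topic>
```

Because put-metric-alarm overwrites an existing alarm with the same name, rerunning the loop updates alarms in place instead of duplicating them.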
Backout Plan:
Using AWS Console:
Navigate to CloudWatch > Alarms
Select the alarm to delete or modify
Click Actions > Delete to remove the alarm
If needed, disable detailed monitoring for the EC2 instance by navigating to EC2 > Instances, selecting the instance, choosing Actions > Monitor and troubleshoot > Manage detailed monitoring, and clearing the Enable option
Using AWS CLI:
Delete a CloudWatch alarm:
aws cloudwatch delete-alarms --alarm-names <alarm-name>
Disable detailed monitoring for an EC2 instance:
aws ec2 unmonitor-instances --instance-ids <instance-id>
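If the alarms were created with a common name prefix, the backout can remove them in one pass. A minimal sketch, assuming such a prefix exists (the function name and the example prefix are hypothetical); alarms are deleted one at a time so that names containing spaces are handled correctly.

```bash
#!/usr/bin/env bash
# Delete every metric alarm whose name starts with the given prefix.
delete_alarms_with_prefix() {
  local prefix="$1" name
  aws cloudwatch describe-alarms \
    --alarm-name-prefix "$prefix" \
    --query 'MetricAlarms[].AlarmName' \
    --output text | tr '\t' '\n' |
  while IFS= read -r name; do
    [ -n "$name" ] || continue
    # Report each alarm only after its deletion call succeeds.
    aws cloudwatch delete-alarms --alarm-names "$name" </dev/null &&
      printf 'deleted %s\n' "$name"
  done
}

# Usage (hypothetical prefix):
#   delete_alarms_with_prefix HighCPUUtilization-
```

Running describe-alarms with the same prefix first shows what would be removed, which is worth checking because the deletion cannot be undone.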