Profile Applicability:
- Level 2 
Description:
Amazon EMR (Elastic MapReduce) clusters should be launched without assigning public IP addresses to their instances to minimize the attack surface and exposure to the public internet. Instead, EMR clusters should operate within a private subnet and leverage NAT Gateways or VPC Endpoints for secure communication with AWS services.
Ensuring that EMR cluster instances do not have public IPs reduces the risk of unauthorized access and data breaches.
Rationale:
- Enhanced Security: Prevents direct access to EMR cluster instances from the internet. 
- Minimized Attack Surface: Reduces the chances of brute-force attacks, unauthorized access, and data leaks. 
- Compliance: Meets security and compliance standards that mandate private network configurations. 
- Controlled Network Access: Encourages the use of bastion hosts, VPNs, or AWS PrivateLink for secure access. 
Impact:
- Pros:
  - Improved security posture by isolating EMR instances from the public internet.
  - Reduced risk of data breaches and external attacks.
  - Compliance with strict security policies and standards.

- Cons:
  - May require additional setup for private access via NAT Gateways or VPNs.
  - Direct SSH or web-based access to cluster instances is restricted unless accessed through a bastion host.
 
Default Value:
- By default, EMR cluster instances may receive public IPs if the cluster is launched in a public subnet or in a subnet that has the Auto-assign Public IP setting enabled.
- Instances do not receive public IPs if the cluster is launched in a private subnet without Auto-assign Public IP enabled.
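The subnet-level Auto-assign Public IP setting can also be read from the CLI. The following is a minimal sketch rather than part of the benchmark procedure: the function name subnet_auto_assigns_public_ip is hypothetical, and it assumes the AWS CLI is configured with the ec2:DescribeSubnets permission. It reads the subnet's MapPublicIpOnLaunch attribute and succeeds when auto-assignment is enabled.

```shell
# Hypothetical helper (not from the original benchmark text).
# Exit status: 0 if the subnet auto-assigns public IPv4 addresses,
# 1 if it does not, 2 if the AWS CLI call itself fails.
subnet_auto_assigns_public_ip() {
  subnet_id=$1
  value=$(aws ec2 describe-subnets \
    --subnet-ids "$subnet_id" \
    --query 'Subnets[0].MapPublicIpOnLaunch' \
    --output text) || return 2
  # Compare case-insensitively instead of relying on the exact casing.
  case "$(printf '%s' "$value" | tr '[:upper:]' '[:lower:]')" in
    true) return 0 ;;
    *)    return 1 ;;
  esac
}
```

For example, `subnet_auto_assigns_public_ip <subnet-id> && echo "auto-assign enabled"` prints a message only for subnets where the attribute is on. This inspects the subnet attribute alone; the instance-level check in the Test Plan remains the authoritative one.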
Pre-Requisite:
- IAM Permissions:
  - elasticmapreduce:DescribeCluster
  - elasticmapreduce:ListClusters
  - elasticmapreduce:ListInstances (required by the list-instances checks in the Test Plan)
  - elasticmapreduce:RunJobFlow
  - ec2:DescribeInstances
 
- Ensure the VPC has appropriate private subnets and NAT Gateways configured for EMR communication with AWS services. 
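One way to confirm that a subnet is private is to check that its route table has no route to an internet gateway. The sketch below is illustrative only: has_igw_route and subnet_is_public are hypothetical names, and it assumes the ec2:DescribeRouteTables permission. It only sees route tables explicitly associated with the subnet; a subnet that implicitly uses the VPC's main route table returns no rows and would need a separate lookup.

```shell
# Hypothetical helper (not from the original benchmark text).
# Reads whitespace-separated route gateway IDs on stdin and succeeds
# if any of them is an internet gateway (ID prefix "igw-").
has_igw_route() {
  tr -s '[:space:]' '\n' | grep -q '^igw-'
}

# Hypothetical wrapper: succeeds if the route table explicitly associated
# with the given subnet contains an internet gateway route.
subnet_is_public() {
  aws ec2 describe-route-tables \
    --filters "Name=association.subnet-id,Values=$1" \
    --query 'RouteTables[*].Routes[*].GatewayId' \
    --output text | has_igw_route
}
```

For example, `subnet_is_public <subnet-id> && echo "subnet routes to an internet gateway"` flags a subnet that should not be used for this control.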
Remediation:
Test Plan:
Using AWS Console:
- Sign in to the AWS Management Console. 
- Navigate to EMR Dashboard → Clusters. 
- For each EMR cluster:
  - Click the Cluster ID to open its details.
  - Under Hardware, check the Instance Groups or Instance Fleets.
  - Review the EC2 instances attached to the cluster.
  - Verify whether the Public IP column is populated.
  - If any instance has a public IP, the cluster is non-compliant.
 
Using AWS CLI:
List all active EMR clusters:
aws emr list-clusters --active --query 'Clusters[*].Id' --output text
Check instance details for each cluster:
aws emr list-instances --cluster-id <emr-cluster-id> --query 'Instances[*].{ID:Ec2InstanceId,PublicIP:PublicIpAddress}' --output table
Expected Output (Pass):
--------------------------------
|        ListInstances         |
+---------------+--------------+
|      ID       |   PublicIP   |
+---------------+--------------+
|  i-1234567890 |  None        |
|  i-0987654321 |  None        |
+---------------+--------------+
Fail Output (Public IP assigned):
--------------------------------
|        ListInstances         |
+---------------+--------------+
|      ID       |   PublicIP   |
+---------------+--------------+
|  i-1234567890 |  54.23.11.45 |
+---------------+--------------+
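The two commands above can be combined to audit every active cluster in one pass. This is a minimal sketch under stated assumptions, not the benchmark's prescribed procedure: flag_public_ips and audit_active_emr_clusters are hypothetical names, and the query is switched to a plain list so that --output text emits one tab-separated row per instance, with "None" standing in for a missing public IP.

```shell
# Hypothetical helper (not from the original benchmark text).
# Reads tab-separated "<instance-id><TAB><public-ip>" rows on stdin and
# prints only the rows that carry a public IP. In --output text mode the
# AWS CLI prints a null field as "None".
flag_public_ips() {
  awk -F'\t' '$2 != "" && $2 != "None" { print $1, $2 }'
}

# Hypothetical wrapper: checks every active cluster and reports its status.
audit_active_emr_clusters() {
  for cluster_id in $(aws emr list-clusters --active \
      --query 'Clusters[*].Id' --output text); do
    offenders=$(aws emr list-instances --cluster-id "$cluster_id" \
      --query 'Instances[*].[Ec2InstanceId,PublicIpAddress]' \
      --output text | flag_public_ips)
    if [ -n "$offenders" ]; then
      printf 'NON-COMPLIANT %s\n%s\n' "$cluster_id" "$offenders"
    else
      printf 'COMPLIANT %s\n' "$cluster_id"
    fi
  done
}
```

Calling `audit_active_emr_clusters` prints one status line per cluster, followed by the offending instance IDs and addresses for any cluster that fails.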
Implementation Steps:
Using AWS Console:
- Sign in to the AWS Console. 
- Navigate to EMR Dashboard → Create Cluster. 
- Under Network, select a VPC and ensure:
  - Subnet: Choose a private subnet (one without Auto-assign Public IP enabled).
  - Auto-assign Public IP: Confirm it is set to Disabled for the chosen subnet.
 
- Proceed with configuring other cluster settings. 
- Click Create Cluster. 
Using AWS CLI:
Launch an EMR Cluster in a Private Subnet (No Public IP):
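aws emr create-cluster \
  --name "Private EMR Cluster" \
  --release-label emr-6.12.0 \
  --applications Name=Hadoop Name=Spark \
  --ec2-attributes SubnetId=<private-subnet-id>,EmrManagedMasterSecurityGroup=<master-sg>,EmrManagedSlaveSecurityGroup=<core-sg>,ServiceAccessSecurityGroup=<service-access-sg> \
  --instance-type m5.xlarge \
  --instance-count 3 \
  --use-default-roles \
  --no-termination-protected \
  --region us-east-1
Note: ServiceAccessSecurityGroup is included because EMR requires a service access security group when custom managed security groups are specified for a cluster in a private subnet.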
Verify the Cluster Instances:
aws emr list-instances --cluster-id <emr-cluster-id> --query 'Instances[*].{ID:Ec2InstanceId,PublicIP:PublicIpAddress}' --output table
Backout Plan:
If disabling public IPs causes issues:
- Re-launch the EMR cluster in a public subnet with Auto-assign Public IP enabled. 
Using AWS CLI:
aws emr create-cluster \
  --name "Public EMR Cluster" \
  --release-label emr-6.12.0 \
  --applications Name=Hadoop Name=Spark \
  --ec2-attributes SubnetId=<public-subnet-id>,EmrManagedMasterSecurityGroup=<master-sg>,EmrManagedSlaveSecurityGroup=<core-sg> \
  --instance-type m5.xlarge \
  --instance-count 3 \
  --use-default-roles \
  --region us-east-1
- Consider adding a Bastion Host for secure access to private EMR clusters if public IPs are not an option. 
 
                 