Profile Applicability:

  • Level 1

Description:

Amazon SageMaker is a fully managed service for building, training, and deploying machine learning models. A training job can be launched with a VPC configuration (a set of subnets and security groups), which places the training containers inside your VPC. This lets you keep the job off the public internet and give it private access to resources in the VPC (e.g., RDS) and to AWS services such as S3 through VPC endpoints. This SOP verifies that SageMaker training jobs are created with VPC settings so that their network traffic stays under your control.

Rationale:

  • Security: Configuring VPC settings ensures that SageMaker training jobs run within a private network, protecting sensitive data from unauthorized access over the internet.

  • Compliance: Compliance frameworks such as SOC 2 and HIPAA expect network-level protection of sensitive data. Running training jobs inside a VPC is a common way to demonstrate that isolation.

  • Controlled Network Access: By using VPC settings, you can control access to training jobs, ensure private communication with data stores, and apply security controls such as security groups and NACLs (network access control lists).

Impact:

Pros:

  • Enhanced Security: VPC settings isolate SageMaker training jobs from the public internet, providing better control over access to the training resources.

  • Compliance: Meets security requirements for network isolation that may be required by various compliance standards.

  • Controlled Access: Allows controlled, private access to resources in the same VPC (such as RDS) and to AWS services such as S3 through VPC endpoints.

Cons:

  • Increased Complexity: Configuring VPC settings may require careful planning of networking resources such as subnets, security groups, and route tables.

  • Limited Access: Training containers in private subnets have no internet access by default. Reaching S3 or other services outside the VPC requires additional configuration, such as VPC endpoints or a NAT Gateway.
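
Because a training job in a VPC reads its input from and writes its output to S3 through that VPC, a missing S3 endpoint (or NAT route) is a common cause of failed jobs. The following is a minimal sketch of a pre-check, not a definitive implementation. The function name has_s3_endpoint is hypothetical, the service name format assumes the standard AWS partition, and the caller is assumed to hold ec2:DescribeVpcEndpoints.

```shell
# Sketch: report whether a VPC has an S3 VPC endpoint.
# has_s3_endpoint is a hypothetical helper name, not part of the AWS CLI.
# Usage: has_s3_endpoint <vpc-id> <region>
has_s3_endpoint() {
  vpc_id="$1"
  region="$2"
  # List endpoint IDs in this VPC whose service is S3 for the given region.
  endpoint_ids=$(aws ec2 describe-vpc-endpoints \
    --region "$region" \
    --filters "Name=vpc-id,Values=$vpc_id" \
              "Name=service-name,Values=com.amazonaws.$region.s3" \
    --query 'VpcEndpoints[].VpcEndpointId' \
    --output text) || return 2
  # Text output is empty (or "None") when nothing matches.
  if [ -n "$endpoint_ids" ] && [ "$endpoint_ids" != "None" ]; then
    echo "S3 endpoint found: $endpoint_ids"
    return 0
  fi
  echo "No S3 endpoint in $vpc_id"
  return 1
}
```

A return code of 1 does not necessarily mean the job will fail: a NAT Gateway route can also provide access to S3.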

Default Value:

By default, Amazon SageMaker does not configure VPC settings for training jobs. The VPC configuration must be explicitly set when the job is created; it cannot be added to or changed on an existing job.

Pre-requisite:

  • AWS IAM Permissions:

    • sagemaker:DescribeTrainingJob

    • sagemaker:CreateTrainingJob (needed to launch a job with VPC settings)

    • ec2:DescribeVpcs

    • ec2:DescribeSubnets

    • ec2:DescribeSecurityGroups

  • AWS CLI installed and configured.

  • SageMaker Training Job is created and operational.

  • VPC is set up with appropriate subnets and security settings.
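
One way to check the "appropriate subnets" prerequisite is to confirm that each candidate subnet has no route to an internet gateway. The sketch below is an illustration under stated assumptions: subnet_is_private is a hypothetical helper name, the caller is assumed to hold ec2:DescribeRouteTables, and only an explicitly associated route table is inspected. A subnet that falls back to the VPC main route table is reported as UNKNOWN and needs a manual check.

```shell
# Sketch: classify a subnet by looking for an internet gateway route.
# subnet_is_private is a hypothetical helper name, not part of the AWS CLI.
# Returns 0 = private, 1 = public, 2 = unknown or CLI error.
subnet_is_private() {
  subnet_id="$1"
  # Collect the GatewayId of every route in the explicitly associated table.
  gateways=$(aws ec2 describe-route-tables \
    --filters "Name=association.subnet-id,Values=$subnet_id" \
    --query 'RouteTables[].Routes[].GatewayId' \
    --output text) || return 2
  case "$gateways" in
    *igw-*)
      echo "$subnet_id: PUBLIC (route to an internet gateway)"
      return 1 ;;
    ""|None)
      echo "$subnet_id: UNKNOWN (no explicit route table association)"
      return 2 ;;
    *)
      echo "$subnet_id: PRIVATE (no internet gateway route)"
      return 0 ;;
  esac
}
```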

Test Plan:

Using AWS Console:

  1. Sign in to the AWS Management Console.

  2. Navigate to Amazon SageMaker under Services.

  3. In the SageMaker Dashboard, select Training jobs.

  4. Choose the Training job you want to review.

  5. Under Job details, verify if the VPC settings are configured:

    • VPC ID: Ensure that the job is associated with a specific VPC.

    • Subnet IDs: Check that the job is using private subnets for isolation.

    • Security Groups: Verify that the appropriate security groups are attached to control network access.

  6. If the VPC settings are not configured, proceed with setting them up as described in the Implementation Steps.

Using AWS CLI:

  1. To describe the SageMaker Training Job and check the VPC settings, run:

    aws sagemaker describe-training-job --training-job-name <training-job-name> --query 'VpcConfig'

  2. Review the output to verify that VPC settings are configured. The VpcConfig object contains only the Subnets and SecurityGroupIds fields (the VPC is implied by the subnets), and both should be populated. If the command returns null, the job does not have VPC settings configured.
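
The CLI check above can be repeated for every training job in a region. The following is a minimal sketch of such an audit, assuming the caller also holds sagemaker:ListTrainingJobs (not listed in the prerequisites). The function names job_vpc_status and audit_training_jobs are hypothetical.

```shell
# Sketch: flag training jobs that were created without VPC settings.
# job_vpc_status and audit_training_jobs are hypothetical helper names.

# Prints "<job> COMPLIANT" or "<job> NON_COMPLIANT" for one job.
job_vpc_status() {
  job_name="$1"
  subnets=$(aws sagemaker describe-training-job \
    --training-job-name "$job_name" \
    --query 'VpcConfig.Subnets' \
    --output text) || return 2
  # With --output text, a missing VpcConfig is printed as "None".
  if [ -z "$subnets" ] || [ "$subnets" = "None" ]; then
    echo "$job_name NON_COMPLIANT"
    return 1
  fi
  echo "$job_name COMPLIANT"
}

# Runs job_vpc_status for every job in the current region.
audit_training_jobs() {
  aws sagemaker list-training-jobs \
    --query 'TrainingJobSummaries[].TrainingJobName' \
    --output text | tr '\t' '\n' | while read -r name; do
      if [ -n "$name" ]; then
        job_vpc_status "$name" </dev/null || true
      fi
    done
}
```

Running audit_training_jobs prints one line per job, so the result can be filtered with grep NON_COMPLIANT.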

Implementation Steps:

Using AWS Console:

  1. Sign in to the AWS Management Console and navigate to Amazon SageMaker.

  2. In the SageMaker Dashboard, go to Training jobs. An existing training job cannot be edited, so start a new job (Create training job) with the same settings as the job you want to remediate.

  3. In the job creation form, locate the network (VPC) settings.

  4. Enable VPC configuration:

    • VPC ID: Select the VPC you want the job to use.

    • Subnets: Select private subnets to isolate the job from the internet.

    • Security Groups: Choose the appropriate security groups to control access to the resources.

  5. Create the training job to launch it with the VPC settings applied.

Using AWS CLI:

  1. To create a SageMaker Training Job with VPC settings, run the following command. The --vpc-config option takes only SecurityGroupIds and Subnets; the VPC is implied by the subnets:

    aws sagemaker create-training-job \
      --training-job-name <training-job-name> \
      --role-arn <execution-role-arn> \
      --algorithm-specification TrainingImage=<image-uri>,TrainingInputMode=File \
      --input-data-config <input-config> \
      --output-data-config <output-config> \
      --resource-config <resource-config> \
      --stopping-condition MaxRuntimeInSeconds=<max-runtime-seconds> \
      --vpc-config SecurityGroupIds=<sg-id-1>,<sg-id-2>,Subnets=<subnet-id-1>,<subnet-id-2>

  2. VPC settings cannot be changed on an existing training job; update-training-job does not accept a --vpc-config option. If an existing job lacks VPC settings, create a new job under a new name with the command in step 1, reusing the original job's other settings.

  3. Verify the VPC settings by running:

    aws sagemaker describe-training-job --training-job-name <training-job-name> --query 'VpcConfig'

  4. Ensure that the subnets and security groups in the output are the ones intended for the training job.
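
The shorthand form of --vpc-config mixes two comma-separated lists, which is easy to mistype. As an alternative, the AWS CLI also accepts the value as a JSON string. The following sketch builds that JSON from two comma-separated lists; vpc_config_json is a hypothetical helper name, and the IDs are assumed to contain no spaces or quotes.

```shell
# Sketch: build the JSON form of --vpc-config from comma-separated lists.
# vpc_config_json is a hypothetical helper name, not part of the AWS CLI.
# Usage: vpc_config_json "<subnet-id-1>,<subnet-id-2>" "<sg-id-1>,<sg-id-2>"
vpc_config_json() {
  # Turn a,b into a","b so the printf below can wrap it in ["..."].
  subnets=$(printf '%s\n' "$1" | sed 's/,/","/g')
  groups=$(printf '%s\n' "$2" | sed 's/,/","/g')
  printf '{"SecurityGroupIds":["%s"],"Subnets":["%s"]}\n' "$groups" "$subnets"
}
```

The result can be passed directly, for example --vpc-config "$(vpc_config_json "<subnet-id-1>,<subnet-id-2>" "<sg-id-1>")".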

Backout Plan:

Using AWS Console:

  1. If VPC settings cause issues with the training job, sign in to the AWS Management Console.

  2. Navigate to Amazon SageMaker, select the training job, and stop it if it is still running. An existing training job cannot be edited.

  3. Create a new training job with the same settings, either without the VPC configuration or with the previous subnets and security groups.

  4. Confirm that the new training job starts and runs as expected.

Using AWS CLI:

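  1. VPC settings cannot be removed from an existing training job. If the job is still running, stop it (requires sagemaker:StopTrainingJob):

    aws sagemaker stop-training-job --training-job-name <training-job-name> --region <REGION>

  2. Create a replacement job under a new name with create-training-job, either omitting --vpc-config or supplying the previous subnets and security groups.

  3. Verify the VPC settings of the replacement job. A null result means the job is not using a VPC:

    aws sagemaker describe-training-job --training-job-name <new-training-job-name> --query 'VpcConfig' --region <REGION>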
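
Before launching a replacement job during backout, it helps to make sure the original job is no longer running. The following is a minimal sketch under stated assumptions: stop_job_if_running is a hypothetical helper name, the caller is assumed to hold sagemaker:StopTrainingJob, and the region comes from the CLI's default configuration.

```shell
# Sketch: stop a training job only if it is still in progress, then wait.
# stop_job_if_running is a hypothetical helper name, not part of the AWS CLI.
stop_job_if_running() {
  job_name="$1"
  status=$(aws sagemaker describe-training-job \
    --training-job-name "$job_name" \
    --query 'TrainingJobStatus' \
    --output text) || return 2
  if [ "$status" = "InProgress" ]; then
    aws sagemaker stop-training-job --training-job-name "$job_name" || return 2
    # Block until the job reaches a terminal state.
    aws sagemaker wait training-job-completed-or-stopped \
      --training-job-name "$job_name" || return 2
    echo "$job_name stopped"
  else
    echo "$job_name not running (status: $status)"
  fi
}
```

Checking the status first avoids calling stop-training-job on a job that has already completed, failed, or stopped.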

References:

CIS Controls Mapping:

Version | Control ID | Control Description | IG1 | IG2 | IG3
--------|------------|---------------------|-----|-----|----
v8 | 3.4 | Encrypt Data on End-User Devices – Ensure data encryption during file system access. | | |
v8 | 6.7 | Implement Application Layer Filtering and Content Control – Ensure appropriate content filtering is applied to sensitive files. | | |
v8 | 6.8 | Define and Maintain Role-Based Access Control – Implement and manage role-based access for file systems. | | |
v8 | 14.6 | Protect Information Through Access Control Lists – Apply strict access control to file systems. | | |