Profile Applicability:
Level 2

Description:
Amazon SageMaker training jobs can be configured with network isolation, which prevents the training container from making any outbound network calls, including calls to the internet and to other AWS services such as Amazon S3. SageMaker itself still downloads the input data and uploads the model artifacts on the job’s behalf, outside the container. Enabling network isolation helps prevent data exfiltration and reduces the attack surface during training.

Rationale:
Network isolation protects sensitive data and intellectual property during model training by preventing the training container from communicating with external endpoints. This reduces the risk of data leakage and unauthorized access, and helps meet compliance and regulatory requirements.

Impact:
Pros:

  • Enhances security by isolating training jobs from external networks

  • Mitigates risk of data exfiltration and unauthorized resource access

  • Supports compliance with data protection regulations

Cons:

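  • Blocks the training container from reaching external data sources at run time, even through VPC endpoints or proxies; training data and dependencies must be supplied through SageMaker input channels or built into the container image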

  • Requires careful network and IAM configuration to enable necessary data access

Default Value:
Network isolation is disabled by default and must be explicitly enabled for each training job.

Pre-requisites:

  • IAM permissions to create and describe training jobs (sagemaker:CreateTrainingJob, sagemaker:DescribeTrainingJob)

  • If the job also runs in a VPC, subnet and security group configuration that allows SageMaker to move input data and model artifacts on the job’s behalf
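The IAM prerequisite above can be checked before starting the audit. The following is a minimal sketch, assuming the caller is allowed to run `aws iam simulate-principal-policy`; the function name `check_prereq_permissions` is a hypothetical helper introduced here for illustration, and the result is a policy simulation, so it may differ from the outcome of a real API call.

```shell
# Sketch: report whether a principal is allowed the two prerequisite actions.
# check_prereq_permissions is a hypothetical helper name, not an AWS CLI command.
check_prereq_permissions() {
  # $1 = ARN of the IAM user or role to evaluate
  local principal_arn="$1" action decision missing=0
  # Each output line is "<action> <decision>", where decision is
  # allowed, implicitDeny, or explicitDeny.
  while read -r action decision; do
    [ -z "$action" ] && continue
    echo "$action: $decision"
    [ "$decision" = "allowed" ] || missing=1
  done <<EOF
$(aws iam simulate-principal-policy \
    --policy-source-arn "$principal_arn" \
    --action-names sagemaker:CreateTrainingJob sagemaker:DescribeTrainingJob \
    --query 'EvaluationResults[].[EvalActionName,EvalDecision]' \
    --output text)
EOF
  # Returns 0 when every action is allowed, 1 otherwise.
  return "$missing"
}
```

Example call: `check_prereq_permissions <IAM_ROLE_ARN>`. A non-zero return status indicates that at least one of the two actions was not reported as allowed.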

Test Plan:

Using AWS Console:

  1. Sign in to AWS Management Console.

  2. Navigate to Amazon SageMaker > Training jobs.

  3. Select a training job to inspect.

  4. Review the job’s Network isolation setting under the job details.

  5. Confirm that network isolation is enabled (checked).

Using AWS CLI:

1. List SageMaker training jobs:

aws sagemaker list-training-jobs

2. For each training job, describe the job and check the network isolation status:

aws sagemaker describe-training-job --training-job-name <JOB_NAME> --query EnableNetworkIsolation

3. Confirm the output returns true.
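The three CLI steps above can be combined into one pass over every training job in the current region. This is a minimal sketch, not a definitive audit tool: `check_isolation` and `audit_training_jobs` are hypothetical helper names, and the PASS/FAIL labels are an assumed reporting format.

```shell
# Sketch: report training jobs whose EnableNetworkIsolation is not true.
# check_isolation and audit_training_jobs are hypothetical helper names.
check_isolation() {
  # $1 = training job name; prints "<name> PASS" or "<name> FAIL"
  local job="$1" status
  status=$(aws sagemaker describe-training-job \
    --training-job-name "$job" \
    --query EnableNetworkIsolation \
    --output text)
  # Accept either capitalization of the boolean in the CLI output.
  case "$status" in
    true|True) echo "$job PASS" ;;
    *)         echo "$job FAIL" ;;
  esac
}

audit_training_jobs() {
  # Lists job names in the configured region and checks each one.
  local job
  for job in $(aws sagemaker list-training-jobs \
      --query 'TrainingJobSummaries[].TrainingJobName' \
      --output text); do
    check_isolation "$job"
  done
}
```

Running `audit_training_jobs` prints one line per job. Any line ending in FAIL identifies a job that was created without network isolation.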

Implementation Plan:

Using AWS Console:

  1. When creating a new training job, navigate to the Network section.

  2. Enable the checkbox for Enable network isolation.

  3. Configure VPC, subnets, and security groups as needed.

  4. Complete the remaining training job configurations and start the job.

Using AWS CLI:

  1. Create a training job with network isolation enabled using the following command:

    aws sagemaker create-training-job \
      --training-job-name <JOB_NAME> \
      --role-arn <IAM_ROLE_ARN> \
      --algorithm-specification TrainingImage=<TRAINING_IMAGE>,TrainingInputMode=File \
      --input-data-config <INPUT_DATA_CONFIG> \
      --output-data-config S3OutputPath=<S3_OUTPUT_PATH> \
      --resource-config InstanceType=<INSTANCE_TYPE>,InstanceCount=<INSTANCE_COUNT>,VolumeSizeInGB=<VOLUME_SIZE> \
      --enable-network-isolation \
      --stopping-condition MaxRuntimeInSeconds=<MAX_RUNTIME>
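With network isolation enabled, the container cannot fetch data itself, so the `<INPUT_DATA_CONFIG>` value is what brings training data into the job. The sketch below shows one possible shape for that value: a JSON list with a single S3 channel. The helper name `build_input_data_config` and the channel name `train` are assumptions for illustration; real jobs may need more channels or different settings.

```shell
# Sketch: build a one-channel JSON value for --input-data-config.
# build_input_data_config is a hypothetical helper; "train" is an assumed channel name.
build_input_data_config() {
  # $1 = S3 URI prefix that holds the training data
  local s3_uri="$1"
  printf '[{"ChannelName":"train","DataSource":{"S3DataSource":{"S3DataType":"S3Prefix","S3Uri":"%s","S3DataDistributionType":"FullyReplicated"}}}]\n' "$s3_uri"
}
```

The output can be passed to the command as `--input-data-config "$(build_input_data_config <S3_INPUT_URI>)"`, where `<S3_INPUT_URI>` is a placeholder for the S3 prefix of the training data.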

Backout Plan:

Using AWS Console:

  1. To disable network isolation, create a new training job without selecting Enable network isolation.

  2. Ensure that the new job has appropriate network access as needed.

Using AWS CLI:

  1. Create a new training job without the --enable-network-isolation flag:

    aws sagemaker create-training-job \
      --training-job-name <JOB_NAME> \
      --role-arn <IAM_ROLE_ARN> \
      --algorithm-specification TrainingImage=<TRAINING_IMAGE>,TrainingInputMode=File \
      --input-data-config <INPUT_DATA_CONFIG> \
      --output-data-config S3OutputPath=<S3_OUTPUT_PATH> \
      --resource-config InstanceType=<INSTANCE_TYPE>,InstanceCount=<INSTANCE_COUNT>,VolumeSizeInGB=<VOLUME_SIZE> \
      --stopping-condition MaxRuntimeInSeconds=<MAX_RUNTIME>

Note: Network isolation is a per-job setting and cannot be modified after the job is created. To change it, you must create a new training job.

References:

CIS Controls Mapping:

Version    Control ID    Control Description
7.1        14.4          Use network segmentation and isolation controls.