Profile Applicability:
- Level 1
Description:
Amazon SageMaker provides fully managed endpoints for deploying machine learning models to production environments. A production variant is a component of an endpoint that serves model predictions. To ensure high availability and fault tolerance, it is recommended that each SageMaker Endpoint Production Variant has at least two initial instances running. This ensures that if one instance becomes unavailable, the second instance can handle the load, maintaining uptime and service reliability.
Rationale:
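High Availability: Running at least two instances for each production variant keeps the endpoint available if an instance fails, minimizing downtime. When a variant has multiple instances, SageMaker attempts to distribute them across Availability Zones, which also limits the impact of a zone outage.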
Scalability: Having multiple instances allows the system to handle more traffic, providing better performance during periods of high load or demand.
Reliability: Multiple instances provide redundancy, ensuring that the endpoint can continue serving requests even if one instance fails.
Impact:
Pros:
Improved Availability: Two instances ensure the service remains operational if one fails.
Increased Redundancy: Protects against potential failures of a single instance.
Scalability: Increases capacity to handle high traffic or large model requests.
Resilience: Helps meet service-level agreements (SLAs) by reducing the chance that a single instance failure interrupts model predictions.
Cons:
Cost: Running two instances increases the cost of deployment, as more compute resources are utilized.
Configuration Complexity: Requires monitoring and managing the scaling of resources to ensure that the correct number of instances is always running.
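As a rough illustration of the cost trade-off above, the sketch below estimates the monthly instance cost of a variant for a given instance count. The hourly price and the 730 hours per month are placeholder assumptions, not AWS prices; substitute the rate for your instance type and region.

```python
def monthly_instance_cost(hourly_price_usd, instance_count, hours_per_month=730.0):
    """Estimate the monthly cost of always-on instances for one production variant."""
    if instance_count < 1:
        raise ValueError("instance_count must be at least 1")
    return hourly_price_usd * instance_count * hours_per_month

# Hypothetical hourly rate, used only to show the arithmetic.
HYPOTHETICAL_HOURLY_PRICE_USD = 0.25

single = monthly_instance_cost(HYPOTHETICAL_HOURLY_PRICE_USD, 1)
double = monthly_instance_cost(HYPOTHETICAL_HOURLY_PRICE_USD, 2)
print(f"1 instance: ${single:.2f}/month, 2 instances: ${double:.2f}/month")
```

Instance cost scales linearly with instance count, so moving a variant from one instance to two doubles its instance charge.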
Default Value:
SageMaker does not enforce a minimum instance count, and production variants are often created with InitialInstanceCount set to 1. Running multiple instances requires setting InitialInstanceCount to 2 or more explicitly in the endpoint configuration.
Pre-requisite:
AWS IAM Permissions:
sagemaker:DescribeEndpoint
sagemaker:DescribeEndpointConfig
sagemaker:CreateEndpointConfig
sagemaker:CreateEndpoint
sagemaker:UpdateEndpoint
AWS CLI installed and configured.
SageMaker Endpoint is created and operational.
Desired instance type and count for production variants should be known.
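The permissions above can be granted through an IAM policy. The sketch below builds one possible policy document covering the listed actions plus the two endpoint-configuration actions that the CLI commands in this document call (describe-endpoint-config and create-endpoint-config). The `build_policy` helper and the wildcard resource are assumptions made for brevity; in practice, scope `Resource` to the relevant endpoint and endpoint-configuration ARNs.

```python
import json

# Actions used by the CLI commands in this document.
REQUIRED_ACTIONS = [
    "sagemaker:DescribeEndpoint",
    "sagemaker:DescribeEndpointConfig",
    "sagemaker:CreateEndpointConfig",
    "sagemaker:CreateEndpoint",
    "sagemaker:UpdateEndpoint",
]

def build_policy(actions, resource="*"):
    """Return an IAM policy document that allows the given actions.

    resource="*" is an illustrative default, not a recommendation.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": list(actions), "Resource": resource}
        ],
    }

policy = build_policy(REQUIRED_ACTIONS)
print(json.dumps(policy, indent=2))
```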
Test Plan:
Using AWS Console:
Sign in to the AWS Management Console.
Navigate to Amazon SageMaker under Services.
In the SageMaker Dashboard, go to Endpoints.
Select the Endpoint that you want to check.
Review the Production Variant settings:
Ensure that the InitialInstanceCount is set to 2 or more for each production variant.
If the number of instances is less than two, proceed with updating the endpoint as described in the Implementation Steps.
Using AWS CLI:
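To find the endpoint configuration attached to the SageMaker Endpoint, run:
aws sagemaker describe-endpoint --endpoint-name <endpoint-name> --query 'EndpointConfigName'
Then list the instance count configured for each production variant in that configuration:
aws sagemaker describe-endpoint-config --endpoint-config-name <endpoint-config-name> --query 'ProductionVariants[].[VariantName,InitialInstanceCount]'
Review the output to ensure that each production variant has an InitialInstanceCount of 2 or more. Note that the describe-endpoint output reports CurrentInstanceCount and DesiredInstanceCount per variant, not InitialInstanceCount. If any value is less than two, follow the Implementation Steps to update it.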
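The CLI check above can also be scripted. The following is a minimal sketch, assuming boto3 is available for the live lookup; the function names and the sample data are hypothetical. The check itself is a plain function over a DescribeEndpointConfig-shaped dictionary, so it can be exercised without AWS access. Variants that do not report InitialInstanceCount are skipped, which is a simplification of this sketch.

```python
MIN_INSTANCES = 2

def variants_below_minimum(endpoint_config, minimum=MIN_INSTANCES):
    """Return names of production variants whose InitialInstanceCount is below `minimum`.

    `endpoint_config` is expected to look like a DescribeEndpointConfig response.
    """
    failing = []
    for variant in endpoint_config.get("ProductionVariants", []):
        count = variant.get("InitialInstanceCount")
        if count is not None and count < minimum:
            failing.append(variant["VariantName"])
    return failing

def audit_endpoint(endpoint_name):
    """Look up the endpoint's configuration and audit it (needs AWS credentials)."""
    import boto3  # imported lazily so the pure check above runs without boto3
    client = boto3.client("sagemaker")
    config_name = client.describe_endpoint(EndpointName=endpoint_name)["EndpointConfigName"]
    config = client.describe_endpoint_config(EndpointConfigName=config_name)
    return variants_below_minimum(config)

# Hand-written sample shaped like a DescribeEndpointConfig response (not real output).
sample_config = {
    "EndpointConfigName": "demo-config",
    "ProductionVariants": [
        {"VariantName": "primary", "InitialInstanceCount": 1},
        {"VariantName": "canary", "InitialInstanceCount": 2},
    ],
}
print(variants_below_minimum(sample_config))
```

An empty list means every variant in the configuration meets the two-instance minimum.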
Implementation Steps:
Using AWS Console:
Sign in to the AWS Management Console and navigate to Amazon SageMaker.
In the SageMaker Dashboard, go to Endpoints and choose the Endpoint to modify.
Under Production variants, find the variant you want to modify.
Edit the InitialInstanceCount field to 2 or higher.
Save the changes and update the endpoint.
Using AWS CLI:
Endpoint configurations cannot be modified after creation, so first create a new endpoint configuration with InitialInstanceCount set to 2 or more for each production variant:
aws sagemaker create-endpoint-config \
  --endpoint-config-name <new-endpoint-config-name> \
  --production-variants VariantName=<variant-name>,ModelName=<model-name>,InstanceType=<instance-type>,InitialInstanceCount=2
To create a new SageMaker Endpoint from that configuration, run the following command:
aws sagemaker create-endpoint \
  --endpoint-name <endpoint-name> \
  --endpoint-config-name <new-endpoint-config-name>
If the endpoint already exists, update it to use the new configuration instead:
aws sagemaker update-endpoint \
  --endpoint-name <endpoint-name> \
  --endpoint-config-name <new-endpoint-config-name>
Once the endpoint status returns to InService, verify the changes by running:
aws sagemaker describe-endpoint --endpoint-name <endpoint-name> --query 'ProductionVariants[].[VariantName,CurrentInstanceCount,DesiredInstanceCount]'
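The remediation can be sketched in Python as well. This is an illustrative outline, not a definitive implementation: the helper names and sample variants are hypothetical, and boto3 is assumed for the live calls. It copies the production variants from the current configuration, raises each InitialInstanceCount to at least two, creates a new configuration, and points the endpoint at it.

```python
import copy

MIN_INSTANCES = 2

def with_minimum_instances(variants, minimum=MIN_INSTANCES):
    """Return a copy of `variants` with each InitialInstanceCount raised to at least `minimum`."""
    updated = copy.deepcopy(variants)
    for variant in updated:
        if "InitialInstanceCount" in variant:
            variant["InitialInstanceCount"] = max(variant["InitialInstanceCount"], minimum)
    return updated

def remediate(endpoint_name, new_config_name):
    """Create a new endpoint configuration and switch the endpoint to it (needs AWS credentials).

    Only ProductionVariants are copied; other settings on the old configuration
    (for example data capture or encryption settings) are NOT carried over here.
    """
    import boto3  # imported lazily so the pure helper above runs without boto3
    client = boto3.client("sagemaker")
    old_name = client.describe_endpoint(EndpointName=endpoint_name)["EndpointConfigName"]
    old_config = client.describe_endpoint_config(EndpointConfigName=old_name)
    client.create_endpoint_config(
        EndpointConfigName=new_config_name,
        ProductionVariants=with_minimum_instances(old_config["ProductionVariants"]),
    )
    client.update_endpoint(EndpointName=endpoint_name, EndpointConfigName=new_config_name)

# Hand-written sample variants (not real output).
current_variants = [
    {"VariantName": "primary", "ModelName": "demo-model",
     "InstanceType": "ml.m5.large", "InitialInstanceCount": 1},
    {"VariantName": "canary", "ModelName": "demo-model",
     "InstanceType": "ml.m5.large", "InitialInstanceCount": 3},
]
proposed = with_minimum_instances(current_variants)
print([(v["VariantName"], v["InitialInstanceCount"]) for v in proposed])
```

Using `max` rather than a fixed value leaves variants that already run more than two instances unchanged.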
Backout Plan:
Using AWS Console:
If configuring multiple instances causes issues or increased costs, sign in to the AWS Management Console.
Navigate to Amazon SageMaker, select the endpoint, and go to Edit.
Reduce the initial instance count to one (or any other appropriate number of instances).
Save the changes and ensure that the endpoint is updated with the reduced number of instances.
Using AWS CLI:
To revert, point the endpoint back at an endpoint configuration whose InitialInstanceCount is one (for example, the configuration it used before the change):
aws sagemaker update-endpoint --endpoint-name <ENDPOINT_NAME> --endpoint-config-name <ENDPOINT_CONFIG_NAME> --region <REGION>
Verify the change by describing the endpoint:
aws sagemaker describe-endpoint --endpoint-name <ENDPOINT_NAME> --region <REGION>
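An endpoint update takes time to roll out, so a single describe call may still show the endpoint as updating. The sketch below, with hypothetical helper names and sample data, checks a DescribeEndpoint-shaped response for three conditions: the endpoint is InService, it uses the expected configuration, and each variant's current instance count matches its desired count. The live variant assumes boto3 and its `endpoint_in_service` waiter.

```python
def update_applied(endpoint, expected_config_name):
    """Return True if a DescribeEndpoint-shaped response shows the update fully rolled out."""
    if endpoint.get("EndpointStatus") != "InService":
        return False
    if endpoint.get("EndpointConfigName") != expected_config_name:
        return False
    return all(
        v.get("CurrentInstanceCount") == v.get("DesiredInstanceCount")
        for v in endpoint.get("ProductionVariants", [])
    )

def wait_and_check(endpoint_name, expected_config_name):
    """Wait for the endpoint to be InService, then check it (needs AWS credentials)."""
    import boto3  # imported lazily so the pure check above runs without boto3
    client = boto3.client("sagemaker")
    client.get_waiter("endpoint_in_service").wait(EndpointName=endpoint_name)
    return update_applied(client.describe_endpoint(EndpointName=endpoint_name),
                          expected_config_name)

# Hand-written samples shaped like DescribeEndpoint responses (not real output).
still_updating = {
    "EndpointStatus": "Updating",
    "EndpointConfigName": "config-one-instance",
    "ProductionVariants": [
        {"VariantName": "primary", "CurrentInstanceCount": 2, "DesiredInstanceCount": 1},
    ],
}
rolled_back = {
    "EndpointStatus": "InService",
    "EndpointConfigName": "config-one-instance",
    "ProductionVariants": [
        {"VariantName": "primary", "CurrentInstanceCount": 1, "DesiredInstanceCount": 1},
    ],
}
print(update_applied(still_updating, "config-one-instance"))
print(update_applied(rolled_back, "config-one-instance"))
```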