Profile Applicability:

  • Level 1

Description:

AWS Elastic Disaster Recovery (DRS) is a managed service that continuously replicates source servers to AWS and automates their recovery, supporting business continuity through quick recovery of applications, databases, and workloads. Once DRS is enabled and source servers are replicating, you can launch recovery jobs, or drills that test the same process, in the event of a failure. Recovery jobs automate the recovery process, minimizing downtime and restoring applications quickly. This SOP ensures not only that DRS is enabled but also that recovery jobs are set up to automate disaster recovery.

Rationale:

Enabling Disaster Recovery Service (DRS) with recovery jobs ensures that:

  • Business Continuity: Automatically recovers workloads in case of failure, ensuring minimal downtime and avoiding manual recovery steps.

  • Automated Recovery: Recovery jobs are predefined and can be triggered automatically during a disaster, speeding up recovery processes.

  • Compliance: DRS is often a requirement for disaster recovery and business continuity plans (BCPs), particularly in regulated industries.

  • Efficiency: Automating the recovery process reduces human error and ensures that recovery steps are executed consistently.

Impact:

Pros:

  • Faster Recovery: Automated recovery reduces downtime and speeds up the recovery process during a disaster.

  • Reduced Manual Intervention: Recovery jobs automate the launch of recovery resources, so recovery proceeds with minimal human intervention.

  • Improved Compliance: Helps ensure that disaster recovery processes are in place and meet regulatory compliance requirements.

  • Business Continuity: Ensures critical applications are available with minimal disruption during failures or disasters.

Cons:

  • Initial Configuration Effort: Setting up disaster recovery plans and defining recovery jobs requires some initial configuration effort.

  • Costs: There may be additional costs associated with running DRS jobs, including data replication and storage costs.

Default Value:

By default, DRS is not enabled. It must be explicitly initialized in each AWS Region where it is used, and recovery jobs must be launched against replicating source servers to automate disaster recovery tasks.

Pre-requisites:

  • IAM Permissions:

    • drs:DescribeJobs

    • drs:DescribeSourceServers

    • drs:StartRecovery

    • drs:DeleteJob

  • Disaster Recovery Setup: Ensure that the disaster recovery service has been configured for your account and is operational.

  • Backup and Replication: Set up and configure the backup and replication services for the resources to be included in the disaster recovery plan.
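As an illustration of the IAM prerequisite, the sketch below writes a small policy document that grants read and delete access to DRS jobs. The function name, the default file name, and the exact action list are assumptions made for this example. Launching recovery typically needs further permissions (for example on EC2), so many teams start from the AWS managed policies for DRS instead of a hand-written one.

```shell
#!/usr/bin/env bash
# Sketch only: writes an example IAM policy for reviewing and deleting DRS jobs.
# The action list is an assumption; extend it to match what your operators do.

write_drs_job_policy() {
  # Hypothetical default file name; pass a path to override it.
  local out_file="${1:-drs-job-policy.json}"
  cat > "$out_file" <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DrsJobReview",
      "Effect": "Allow",
      "Action": [
        "drs:DescribeJobs",
        "drs:DescribeSourceServers",
        "drs:DeleteJob"
      ],
      "Resource": "*"
    }
  ]
}
EOF
  # Print the path so callers can pass it on to the AWS CLI.
  echo "$out_file"
}

# Example usage (not run here; the policy name is a placeholder):
#   policy_file="$(write_drs_job_policy)"
#   aws iam create-policy --policy-name DrsJobReview \
#     --policy-document "file://$policy_file"
```

The function only writes a local file, so the document can be reviewed before anything is created in IAM.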

Remediation:

Test plan:

Using AWS Console:

  1. Sign in to the AWS Management Console.

  2. Navigate to Elastic Disaster Recovery (DRS) under Services.

  3. In the DRS console, select Recovery job history.

  4. Review the list to check whether any recovery or drill jobs have already been run.

  5. If no jobs are listed, open Source servers and select the resources (e.g., EC2 instances, database servers) to include.

  6. Choose Initiate recovery job, then Initiate drill for a test or Initiate recovery for an actual failover.

  7. Select the point in time to recover from, such as the most recent data.

  8. Confirm the launch, then return to Recovery job history and verify that the job completes.

Using AWS CLI:

To list current disaster recovery jobs, run:

aws drs describe-jobs

A recovery job is created when a recovery or drill is launched for one or more source servers. To launch a drill, run:

aws drs start-recovery --source-servers sourceServerID=<source-server-id> --is-drill

To verify the job's status, run:

aws drs describe-jobs --filters jobIDs=<job-id>
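The audit check can be scripted. The following is a minimal sketch, assuming the `describe-jobs` response lists jobs under an `items` key and that the AWS CLI is configured with credentials for the target account. The function names and the PASS/FAIL wording are illustrative, not part of any AWS tooling.

```shell
#!/usr/bin/env bash
# Sketch only: reports whether any DRS jobs exist in a Region.

# Prints the number of DRS jobs returned for the given Region.
count_drs_jobs() {
  local region="$1"
  aws drs describe-jobs --region "$region" \
    --query 'length(items)' --output text
}

# Returns 0 when at least one job exists, 1 when none exist,
# and 2 when the AWS CLI call itself fails.
check_drs_jobs_exist() {
  local region="$1" count
  count="$(count_drs_jobs "$region")" || return 2
  if [ "${count:-0}" -gt 0 ] 2>/dev/null; then
    echo "PASS: $count DRS job(s) found in $region"
  else
    echo "FAIL: no DRS jobs found in $region"
    return 1
  fi
}

# Example usage (not run here):
#   check_drs_jobs_exist us-east-1
```

Because DRS is a Regional service, the check would need to be repeated for each Region in scope.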

Implementation Steps:

Using AWS Console:

  1. Sign in to the AWS Management Console and navigate to Elastic Disaster Recovery (DRS).

  2. Go to Source servers and select the resources to include in the recovery process.

  3. Choose Initiate recovery job and select the type of job: Initiate drill (test) or Initiate recovery (actual failover).

  4. Select the point in time to recover from.

  5. Confirm the launch and track the job under Recovery job history.

Using AWS CLI:

To create a disaster recovery job, launch a recovery for the relevant source servers (add --is-drill to run it as a drill):

aws drs start-recovery --source-servers sourceServerID=<source-server-id>

To verify the status of the job, use:

aws drs describe-jobs --filters jobIDs=<job-id>
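Launching a job and waiting for it to finish can be combined in one script. This is a sketch under stated assumptions: it uses `aws drs start-recovery` with `--is-drill` so that it launches a drill rather than a real failover, it assumes the job ID is returned under `job.jobID`, and the function names, attempt count, and polling delay are arbitrary choices for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: start a DRS drill for one source server and wait for the job.

# Starts a drill and prints the ID of the job that was created.
start_drs_drill() {
  local server_id="$1"
  aws drs start-recovery \
    --source-servers "sourceServerID=$server_id" \
    --is-drill \
    --query 'job.jobID' --output text
}

# Polls the job until it reports COMPLETED.
# Arguments: job ID, max attempts (default 60), delay in seconds (default 30).
# Returns 0 on COMPLETED, 1 if attempts run out, 2 if the CLI call fails.
wait_for_drs_job() {
  local job_id="$1" max_attempts="${2:-60}" delay="${3:-30}" status i=0
  while [ "$i" -lt "$max_attempts" ]; do
    status="$(aws drs describe-jobs --filters "jobIDs=$job_id" \
      --query 'items[0].status' --output text)" || return 2
    echo "job $job_id status: $status"
    [ "$status" = "COMPLETED" ] && return 0
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Example usage (not run here; the server ID is a placeholder):
#   job_id="$(start_drs_drill <source-server-id>)"
#   wait_for_drs_job "$job_id"
```

A COMPLETED job status only indicates that the job finished. Whether each server launched successfully should still be confirmed from the job details or the job log.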

Backout Plan:

Using AWS Console:

  1. If the disaster recovery job configuration causes issues (e.g., resource contention, incorrect configurations), navigate to Disaster Recovery Jobs.

  2. Select the job you want to remove and click Delete to remove the job from the environment.

Using AWS CLI:

To delete a recovery job, run:

aws drs delete-job --job-id <job-id>

Verify that the job is deleted by running:

aws drs describe-jobs
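The verification step can be reduced to a yes/no check. The sketch below assumes jobs are returned under `items` with a `jobID` field and filters on the client side with a JMESPath query; the function name is hypothetical. Note that deleting a job record is assumed here not to remove any recovery instances the job launched, so those may need to be cleaned up separately.

```shell
#!/usr/bin/env bash
# Sketch only: confirm that a DRS job no longer appears in describe-jobs.

# Returns 0 when no job with the given ID is listed, 1 when it still exists,
# and 2 when the AWS CLI call itself fails.
drs_job_is_deleted() {
  local job_id="$1" found
  found="$(aws drs describe-jobs \
    --query "length(items[?jobID=='$job_id'])" --output text)" || return 2
  [ "$found" = "0" ]
}

# Example usage (not run here; the job ID is a placeholder):
#   aws drs delete-job --job-id <job-id>
#   drs_job_is_deleted <job-id> && echo "job removed"
```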

Note:

  • Monitoring: Once DRS and recovery jobs are enabled, consider setting up CloudWatch alarms to monitor the status of recovery jobs and ensure they are executed correctly.

  • Job Frequency: Review and adjust how often recovery drills are run based on the criticality of the resources, and confirm that replication stays within the required recovery point objective (RPO).

  • Testing: Periodically test disaster recovery procedures to ensure they function correctly in the event of an actual disaster.

References:

CIS Controls Mapping:

Version | Control ID | Control Description | IG1 | IG2 | IG3

v8 | 11.4 | Implement and test disaster recovery plans to ensure business continuity in the event of an outage. | | |

v8 | 11.5 | Use automation to ensure that recovery operations can be performed efficiently in a disaster. | | |

v8 | 11.6 | Ensure disaster recovery procedures are consistent, tested, and updated regularly. | | |