Profile Applicability:

  • Level 1

Description:

Amazon Macie is a fully managed data security and data privacy service that uses machine learning and pattern matching to discover, classify, and help protect sensitive data stored in Amazon S3, such as personally identifiable information (PII). With automated sensitive data discovery enabled, Macie continually evaluates your S3 bucket inventory and analyzes representative sample objects for sensitive data, such as credit card numbers and Social Security numbers, without manual intervention.

Enabling automated sensitive data discovery in Macie allows organizations to continuously monitor their data for sensitive information and supports compliance with privacy regulations such as GDPR, CCPA, and HIPAA.

Rationale:

Enabling automated sensitive data discovery with Macie is vital for:

  • Compliance: Ensures adherence to data protection laws and industry regulations by identifying and securing sensitive information.

  • Data Protection: Protects against the unauthorized exposure or misuse of sensitive data stored in Amazon S3.

  • Security: Detects and reports sensitive data on an ongoing basis, allowing organizations to implement the necessary security controls.

  • Operational Efficiency: Automates the discovery of sensitive data, reducing the manual effort required to monitor and manage sensitive data within the organization.

Impact:

Pros:

  • Enhanced Data Security: Automates the detection and classification of sensitive data in Amazon S3, providing ongoing protection for PII and other critical data.

  • Regulatory Compliance: Helps organizations meet data privacy and compliance regulations, such as GDPR, CCPA, and HIPAA, by ensuring sensitive data is continuously discovered and classified.

  • Timely Findings: Macie generates findings when it detects sensitive data. These can be routed to alerting workflows (for example, through Amazon EventBridge) so that teams can act quickly to protect the data.

  • Automation: Reduces the need for manual data classification efforts, saving time and resources.

Cons:

  • Cost: Enabling Macie incurs additional costs. Charges depend on factors such as the number of S3 buckets and objects monitored and the amount of data inspected.

  • Complexity: While automation improves efficiency, it may also create challenges in tuning the service to avoid false positives or unnecessary alerts.
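
To keep the cost concern above measurable, the following is a minimal sketch that totals the month-to-date estimated cost reported by `aws macie2 get-usage-totals`. The function name, the default Region, and the assumption that the relevant usage types start with `AUTOMATED_` are illustrative; check them against the usage types your account actually reports.

```shell
# Sketch only: sum month-to-date estimated Macie cost for automated discovery.
# Assumption: automated discovery usage types are prefixed with AUTOMATED_.
macie_automated_discovery_cost() {
  region="${1:-us-east-1}"   # hypothetical default Region

  # Each output row is: <usage type> <estimated cost> <currency>
  aws macie2 get-usage-totals --region "$region" --time-range MONTH_TO_DATE \
    --query 'usageTotals[].[type,estimatedCost,currency]' --output text |
    awk '$1 ~ /^AUTOMATED_/ { total += $2; cur = $3 }
         END { printf "%.2f %s\n", total, cur }'
}
```

Calling `macie_automated_discovery_cost eu-west-1` prints a single line with the summed estimate and its currency.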

Default Value:

By default, automated sensitive data discovery is disabled in Amazon Macie. You must explicitly enable this feature for continuous data discovery.

Prerequisites:

  • AWS IAM Permissions:

    • macie2:EnableMacie

    • macie2:GetMacieSession

    • macie2:GetAutomatedDiscoveryConfiguration

    • macie2:UpdateAutomatedDiscoveryConfiguration

  • AWS CLI installed and configured.

  • Amazon Macie must be enabled in the AWS account.

  • Amazon S3 buckets should be set up and contain data to be scanned.
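
The prerequisites above can be checked before any change is made. The following is a minimal preflight sketch, assuming the `get-macie-session` operation is used to read the Macie status; the function name and default Region are hypothetical.

```shell
# Sketch only: confirm the AWS CLI is available and Macie is enabled in a Region.
macie_preflight() {
  region="${1:-us-east-1}"   # hypothetical default Region

  # The AWS CLI must be installed and on PATH.
  if ! command -v aws >/dev/null 2>&1; then
    echo "FAIL: AWS CLI not found"
    return 1
  fi

  # get-macie-session reports the Macie status for the account;
  # a failed call is treated here as "Macie not enabled".
  status=$(aws macie2 get-macie-session --region "$region" \
             --query 'status' --output text 2>/dev/null) || status="NOT_ENABLED"

  if [ "$status" = "ENABLED" ]; then
    echo "OK: Macie is enabled in $region"
  else
    echo "FAIL: Macie status in $region is $status"
    return 1
  fi
}
```

Run `macie_preflight <region>` once per Region in scope; a non-zero return code means a prerequisite is missing.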

Remediation:

Test Plan:

Using AWS Console:

  1. Sign in to the AWS Management Console.

  2. Navigate to Amazon Macie under Services and select the Region to check.

  3. In the navigation pane, under Settings, choose Automated sensitive data discovery.

  4. Verify if automated sensitive data discovery is enabled:

    • Review the Status section. If it shows that automated discovery is disabled, the control is not met.

    • If the account is part of an organization in Macie, only the Macie administrator account can change this setting.

  5. If discovery is not enabled, follow the Implementation Steps to enable it.

  6. Review the related settings on the same page:

    • S3 buckets: Confirm which buckets, if any, are excluded from analysis.

    • Managed and custom data identifiers: Confirm which types of sensitive data Macie is configured to detect.

Using AWS CLI:

To confirm that Macie is enabled in the account and Region, run:

aws macie2 get-macie-session --region <region>

To check if automated sensitive data discovery is enabled, run:

aws macie2 get-automated-discovery-configuration --region <region>

The status field in the output is ENABLED when automated discovery is active and DISABLED otherwise.

To enable automated sensitive data discovery, use:

aws macie2 update-automated-discovery-configuration --region <region> --status ENABLED

To verify that automated discovery is running, run the get-automated-discovery-configuration command again and confirm that status is ENABLED.
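
For repeatable audits, the status check can be wrapped in a small function that returns a pass or fail result. This is a minimal sketch, assuming the `status` field returned by `aws macie2 get-automated-discovery-configuration`; the function name, default Region, and output wording are illustrative.

```shell
# Sketch only: report PASS/FAIL for automated sensitive data discovery in a Region.
check_automated_discovery() {
  region="${1:-us-east-1}"   # hypothetical default Region

  # Read only the status field; treat a failed call as UNKNOWN.
  status=$(aws macie2 get-automated-discovery-configuration \
             --region "$region" --query 'status' --output text 2>/dev/null) \
    || status="UNKNOWN"

  case "$status" in
    ENABLED)
      echo "PASS: automated discovery is ENABLED in $region" ;;
    DISABLED)
      echo "FAIL: automated discovery is DISABLED in $region"
      return 1 ;;
    *)
      echo "ERROR: could not read configuration in $region ($status)"
      return 2 ;;
  esac
}
```

The distinct return codes (0, 1, 2) let a calling script separate a non-compliant account from a permissions or connectivity problem.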

Implementation Steps:

Using AWS Console:

  1. Log in to the AWS Management Console and navigate to Amazon Macie.

  2. In the navigation pane, under Settings, choose Automated sensitive data discovery.

  3. In the Status section, choose Enable.

  4. Optionally, exclude S3 buckets that you do not want Macie to analyze.

  5. Optionally, adjust the managed data identifiers, custom data identifiers, and allow lists used for analysis. No schedule needs to be configured, because Macie runs automated discovery on an ongoing basis.

  6. Confirm that the Status section shows automated discovery as enabled.

  7. After Macie has had time to analyze data, review the results to ensure that discovery is configured correctly.

Using AWS CLI:

Enable automated sensitive data discovery for the account:

aws macie2 update-automated-discovery-configuration \
  --region <region> \
  --status ENABLED

Confirm that automated discovery is enabled:

aws macie2 get-automated-discovery-configuration --region <region>
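
When the change is rolled out across several Regions, it can help to make the enable step idempotent. The following is a minimal sketch under that assumption: it reads the current status first and only calls `update-automated-discovery-configuration` when needed. The function name and default Region are hypothetical.

```shell
# Sketch only: enable automated discovery if it is not already enabled.
enable_automated_discovery() {
  region="${1:-us-east-1}"   # hypothetical default Region

  # Read the current status; stop if the configuration cannot be read.
  current=$(aws macie2 get-automated-discovery-configuration \
              --region "$region" --query 'status' --output text) || return 1

  if [ "$current" = "ENABLED" ]; then
    echo "No change: already ENABLED in $region"
    return 0
  fi

  # Turn the feature on for this account and Region.
  aws macie2 update-automated-discovery-configuration \
    --region "$region" --status ENABLED || return 1
  echo "Enabled automated discovery in $region"
}
```

A loop such as `for r in us-east-1 eu-west-1; do enable_automated_discovery "$r"; done` applies it to each Region in scope; the Region names here are placeholders.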


Backout Plan:

If enabling automated sensitive data discovery causes issues, such as excessive costs or incorrect classifications:

  1. Identify the issue by reviewing the generated findings and usage costs.

  2. Disable automated sensitive data discovery using:

aws macie2 update-automated-discovery-configuration --region <region> --status DISABLED

  3. If required, re-enable automated discovery with adjusted configurations (e.g., different data identifiers or fewer buckets).

  4. Ensure that the Macie service is functioning as expected after reverting the changes.
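
As a less drastic alternative to disabling the feature, a noisy or costly bucket can be excluded from analysis. The following is a minimal sketch, assuming the classification scope ID is read from `get-automated-discovery-configuration` and passed to `update-classification-scope`; the function name, default Region, and bucket name in the usage note are hypothetical.

```shell
# Sketch only: add one bucket to the automated discovery exclusion list.
exclude_bucket_from_discovery() {
  bucket="${1:-}"
  region="${2:-us-east-1}"   # hypothetical default Region

  if [ -z "$bucket" ]; then
    echo "usage: exclude_bucket_from_discovery <bucket-name> [region]"
    return 2
  fi

  # The classification scope holds the list of excluded buckets.
  scope_id=$(aws macie2 get-automated-discovery-configuration \
               --region "$region" --query 'classificationScopeId' \
               --output text) || return 1

  # operation ADD appends to the existing exclusions instead of replacing them.
  aws macie2 update-classification-scope --region "$region" --id "$scope_id" \
    --s3 "{\"excludes\":{\"bucketNames\":[\"$bucket\"],\"operation\":\"ADD\"}}" \
    || return 1
  echo "Excluded $bucket from automated discovery in $region"
}
```

For example, `exclude_bucket_from_discovery my-log-archive eu-west-1` leaves automated discovery enabled while removing that one bucket from analysis.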

Note:

  • Data Sampling: Automated discovery analyzes representative sample objects rather than every object. Consider starting with a limited scope, such as excluding non-essential buckets, to confirm that the right data is being classified before widening coverage.

  • Review & Audit: Periodically review the discovered sensitive data and adjust data criteria as necessary to ensure the accuracy of the discovery process.
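
A periodic review can start from a count of sensitive data findings by type. This is a minimal sketch, assuming `aws macie2 get-finding-statistics` grouped by `type` and filtered to the `CLASSIFICATION` finding category; the function name and default Region are hypothetical.

```shell
# Sketch only: list sensitive data finding types with their counts.
summarize_sensitive_findings() {
  region="${1:-us-east-1}"   # hypothetical default Region

  # Each output row is: <finding type> <count>
  aws macie2 get-finding-statistics --region "$region" \
    --group-by type \
    --finding-criteria '{"criterion":{"category":{"eq":["CLASSIFICATION"]}}}' \
    --query 'countsByGroup[].[groupKey,count]' --output text
}
```

Comparing this summary between review cycles can show whether a change to data identifiers or bucket exclusions had the intended effect.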

References:

CIS Controls Mapping:

| Version | Control ID | Control Description | IG1 | IG2 | IG3 |
| v8 | 3.4 | Encrypt Data on End-User Devices – Ensure data encryption during file system access. | | | |
| v8 | 6.7 | Implement Application Layer Filtering and Content Control – Ensure appropriate content filtering is applied to sensitive files. | | | |
| v8 | 6.8 | Define and Maintain Role-Based Access Control – Implement and manage role-based access for file systems. | | | |
| v8 | 14.6 | Protect Information Through Access Control Lists – Apply strict access control to file systems. | | | |