Profile Applicability:

  • Level 1

Description:

AWS Glue is a fully managed ETL (Extract, Transform, Load) service that helps you prepare and load data for analytics. The AWS Glue Data Catalog serves as a centralized repository for all your data assets, including metadata for datasets in Amazon S3 and other services.

It is important to ensure that AWS Glue Data Catalogs are not publicly accessible: only authorized users and services within your AWS account (or in explicitly authorized accounts) should have access. By default, the Data Catalog is private, but an overly permissive Data Catalog resource policy or IAM policy can inadvertently expose it.

This SOP ensures that the Glue Data Catalog is securely accessible and not exposed to the public internet or unauthorized users.

Rationale:

Ensuring that AWS Glue Data Catalogs are not publicly accessible is critical for:

  • Data Security: Prevents unauthorized access to metadata, which may reveal sensitive information about your data (e.g., schemas, table names, and storage locations).

  • Compliance: Helps meet regulatory compliance requirements such as PCI-DSS, HIPAA, and SOC 2, which require data access to be controlled and restricted.

  • Best Practices: Aligns with security best practices for data management by ensuring access is restricted to authorized resources only.

Impact:

Pros:

  • Enhanced Security: Prevents unauthorized users or services from accessing your Glue Data Catalog, protecting sensitive metadata.

  • Compliance: Supports compliance with data protection laws and standards.

  • Access Control: Allows fine-grained control over who can access the catalog and its metadata.

Cons:

  • Operational Overhead: Requires continuous monitoring and management of IAM roles and policies to ensure correct access configurations.

  • Complexity: Misconfigured IAM policies or Data Catalog resource policies could lead to unintended access restrictions or service disruptions.

Default Value:

By default, AWS Glue Data Catalogs are private, and access is controlled by IAM roles and policies. However, an overly permissive IAM policy or Data Catalog resource policy (for example, one that allows "Principal": "*") can lead to unintended public access.

Pre-requisite:

  • AWS IAM Permissions:

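    • glue:GetResourcePolicy

    • glue:PutResourcePolicy

    • glue:GetDataCatalogEncryptionSettings

    • glue:PutDataCatalogEncryptionSettings

    • iam:ListPolicies

    • iam:GetPolicyVersion

    • iam:AttachRolePolicy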

  • AWS CLI installed and configured.

  • Basic knowledge of AWS Glue, IAM permissions, and resource access controls.

Remediation:

Test Plan:

Using AWS Console:

  1. Sign in to the AWS Management Console.

  2. Navigate to AWS Glue under Services.

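  3. In the AWS Glue console, open Catalog settings under Data Catalog and review the Permissions section, which holds the Data Catalog resource policy.

  4. Ensure that Data Catalog access is restricted by checking the resource policy and the IAM policies that grant Glue permissions.

    • The Data Catalog resource policy should not allow a wildcard principal ("Principal": "*"), and IAM roles should not carry overly permissive Glue permissions.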

    • Ensure only authorized IAM users/roles within your account (or cross-account) have access.
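The review in step 4 can also be done offline on the policy text. The following is a minimal sketch, assuming the policy JSON has been copied from the console or retrieved with `aws glue get-resource-policy`. The function names and the example policy (including its account ID and role name) are hypothetical. Statements that carry a Condition are deliberately not flagged, because whether a condition is restrictive enough needs a manual review.

```python
import json


def _as_list(value):
    """Wrap a single value in a list; policy elements may be scalar or list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def is_wildcard_principal(principal):
    """True if a Principal element matches every AWS principal."""
    if principal == "*":
        return True
    if isinstance(principal, dict):
        return "*" in _as_list(principal.get("AWS"))
    return False


def find_public_statements(policy_json):
    """Return Allow statements with a wildcard principal and no Condition."""
    policy = json.loads(policy_json)
    public = []
    for stmt in _as_list(policy.get("Statement")):
        if (
            stmt.get("Effect") == "Allow"
            and is_wildcard_principal(stmt.get("Principal"))
            and not stmt.get("Condition")
        ):
            public.append(stmt)
    return public


# Hypothetical policy for illustration; account ID and role are placeholders.
EXAMPLE_POLICY = json.dumps({
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "PublicRead",
            "Effect": "Allow",
            "Principal": "*",
            "Action": "glue:GetTable",
            "Resource": "arn:aws:glue:us-east-1:111122223333:*",
        },
        {
            "Sid": "TeamRead",
            "Effect": "Allow",
            "Principal": {"AWS": "arn:aws:iam::111122223333:role/etl"},
            "Action": "glue:GetTable",
            "Resource": "arn:aws:glue:us-east-1:111122223333:*",
        },
    ],
})

if __name__ == "__main__":
    for stmt in find_public_statements(EXAMPLE_POLICY):
        print("Public statement:", stmt.get("Sid", "<no Sid>"))
```

Any statement this sketch reports should be treated as a finding; an empty result does not by itself prove the catalog is private, since conditioned wildcard statements still need to be read by a person.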

Using AWS CLI:

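To retrieve the resource policy attached to the Glue Data Catalog, which determines whether principals outside your account can reach it, run:

aws glue get-resource-policy --query 'PolicyInJson' --output text

  1. Verify that no Allow statement uses a wildcard principal ("Principal": "*" or "AWS": "*") without a restrictive condition. If the command reports that no policy was found, no resource policy is attached and access is governed by IAM identity policies alone.

To check the encryption settings of the Data Catalog, run:

aws glue get-data-catalog-encryption-settings --query 'DataCatalogEncryptionSettings'

  1. Verify that EncryptionAtRest.CatalogEncryptionMode is set to SSE-KMS rather than DISABLED. Encryption protects metadata at rest; it does not by itself restrict who can access the catalog.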
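The encryption check can be scripted against the JSON printed by `aws glue get-data-catalog-encryption-settings`. This is a small sketch under the assumption that the output follows the Glue API shape (`EncryptionAtRest.CatalogEncryptionMode`); the function names and the sample output are illustrative, and a missing mode is treated as DISABLED as a conservative assumption.

```python
import json


def catalog_encryption_mode(cli_output):
    """Extract the at-rest encryption mode from the CLI's JSON output."""
    settings = json.loads(cli_output)
    # Accept the full response as well as output already narrowed by --query.
    settings = settings.get("DataCatalogEncryptionSettings", settings)
    at_rest = settings.get("EncryptionAtRest") or {}
    # Assumption: an absent mode is reported as DISABLED.
    return at_rest.get("CatalogEncryptionMode", "DISABLED")


def is_encrypted_at_rest(cli_output):
    """True when the catalog uses any mode other than DISABLED."""
    return catalog_encryption_mode(cli_output) != "DISABLED"


# Illustrative output; the key ARN is a placeholder.
SAMPLE_OUTPUT = json.dumps({
    "DataCatalogEncryptionSettings": {
        "EncryptionAtRest": {
            "CatalogEncryptionMode": "SSE-KMS",
            "SseAwsKmsKeyId": "arn:aws:kms:us-east-1:111122223333:key/example",
        }
    }
})

if __name__ == "__main__":
    print("Mode:", catalog_encryption_mode(SAMPLE_OUTPUT))
    print("Encrypted at rest:", is_encrypted_at_rest(SAMPLE_OUTPUT))
```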

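To list the customer managed IAM policies that are attached to identities in your account, run:

aws iam list-policies --scope Local --only-attached --query 'Policies[*].{PolicyName:PolicyName, Arn:Arn}'

  1. Review the policies that grant glue: actions and ensure that they enforce the principle of least privilege. IAM policies are attached to users, groups, and roles rather than to the Data Catalog itself, so public access can only be granted through the Data Catalog resource policy.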
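To make the least-privilege review more concrete, the sketch below flags statements in an IAM policy document that allow every Glue action on every resource. It assumes the document has been fetched separately (for example, the `PolicyVersion.Document` field returned by `aws iam get-policy-version`) and loaded as a dictionary. The function name and example document are hypothetical, and the check is intentionally narrow: it ignores `NotAction`, partial wildcards such as `glue:Get*`, and conditions.

```python
def _as_list(value):
    """Wrap a single value in a list; policy elements may be scalar or list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def find_broad_glue_statements(policy_document):
    """Return Allow statements granting '*' or 'glue:*' on Resource '*'."""
    broad = []
    for stmt in _as_list(policy_document.get("Statement")):
        if stmt.get("Effect") != "Allow":
            continue
        # IAM action names are case-insensitive, so compare in lower case.
        actions = {action.lower() for action in _as_list(stmt.get("Action"))}
        resources = _as_list(stmt.get("Resource"))
        if actions & {"*", "glue:*"} and "*" in resources:
            broad.append(stmt)
    return broad


# Hypothetical policy document for illustration.
EXAMPLE_DOCUMENT = {
    "Version": "2012-10-17",
    "Statement": [
        {"Sid": "TooBroad", "Effect": "Allow", "Action": "glue:*", "Resource": "*"},
        {
            "Sid": "Scoped",
            "Effect": "Allow",
            "Action": ["glue:GetTable", "glue:GetDatabase"],
            "Resource": "arn:aws:glue:us-east-1:111122223333:catalog",
        },
    ],
}

if __name__ == "__main__":
    for stmt in find_broad_glue_statements(EXAMPLE_DOCUMENT):
        print("Overly broad statement:", stmt.get("Sid", "<no Sid>"))
```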

Implementation Steps:

Using AWS Console:

  1. Sign in to the AWS Management Console and navigate to AWS Glue.

  2. In the AWS Glue console, open Catalog settings under Data Catalog and review the Permissions section, which holds the Data Catalog resource policy.

  3. Review the resource policy and the IAM roles and policies that grant access to the Data Catalog.

    • Ensure that only necessary users or services have access to the catalog and that no statement grants access to everyone.

    • If necessary, tighten the policies (e.g., by replacing a wildcard principal ("Principal": "*") in the resource policy with specific account or role ARNs, and by removing wildcard actions such as glue:* from roles that do not need them).

  4. Ensure that encryption (e.g., SSE-KMS) is enabled for the Glue Data Catalog so that metadata is protected at rest.

Using AWS CLI:

To enable encryption at rest for the Data Catalog (this protects stored metadata but does not by itself restrict access), run:

aws glue put-data-catalog-encryption-settings \
  --catalog-id <catalog-id> \
  --data-catalog-encryption-settings '{"EncryptionAtRest": {"CatalogEncryptionMode": "SSE-KMS", "SseAwsKmsKeyId": "<kms-key-id>"}}'
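Quoting nested JSON on the command line is error-prone, so the settings payload can be generated instead. The sketch below assumes the field names of the Glue PutDataCatalogEncryptionSettings API (`EncryptionAtRest`, `CatalogEncryptionMode`, `SseAwsKmsKeyId`); the helper name and the key ARN are placeholders for illustration.

```python
import json


def build_encryption_settings(kms_key_id):
    """Return the JSON payload for --data-catalog-encryption-settings."""
    if not kms_key_id:
        raise ValueError("a KMS key ID or ARN is required for SSE-KMS")
    settings = {
        "EncryptionAtRest": {
            "CatalogEncryptionMode": "SSE-KMS",
            "SseAwsKmsKeyId": kms_key_id,
        }
    }
    return json.dumps(settings)


if __name__ == "__main__":
    # Placeholder key ARN; substitute the key that protects your catalog.
    print(build_encryption_settings(
        "arn:aws:kms:us-east-1:111122223333:key/example"
    ))
```

The printed string can be passed as the value of `--data-catalog-encryption-settings`, or written to a file and referenced with the CLI's `file://` prefix.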

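To replace an overly permissive Data Catalog resource policy with a restricted one saved in a local file, run:

aws glue put-resource-policy --policy-in-json file://<policy-file>

To attach a least-privilege IAM policy to a role that needs Data Catalog access, run:

aws iam attach-role-policy --role-name <role-name> --policy-arn <policy-arn>

  1. Confirm that no policy allows public access (e.g., through a wildcard principal such as "Principal": "*" without a restrictive condition).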
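One possible way to prepare a restricted resource policy is to drop the public statements from the existing one. The sketch below does this on the policy text only and makes no AWS calls; the function names are hypothetical, and "public" is assumed to mean an Allow statement with a wildcard principal and no Condition. Review the result before applying it, since removing a statement may break a consumer that relied on it.

```python
import json
from typing import Optional


def _as_list(value):
    """Wrap a single value in a list; policy elements may be scalar or list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def _is_wildcard_principal(principal):
    """True if a Principal element matches every AWS principal."""
    if principal == "*":
        return True
    if isinstance(principal, dict):
        return "*" in _as_list(principal.get("AWS"))
    return False


def remove_public_statements(policy_json: str) -> Optional[str]:
    """Return the policy without public statements, or None if none remain."""
    policy = json.loads(policy_json)
    kept = [
        stmt
        for stmt in _as_list(policy.get("Statement"))
        if not (
            stmt.get("Effect") == "Allow"
            and _is_wildcard_principal(stmt.get("Principal"))
            and not stmt.get("Condition")
        )
    ]
    if not kept:
        # Nothing legitimate is left, so the policy should be removed instead.
        return None
    policy["Statement"] = kept
    return json.dumps(policy, indent=2)
```

If the function returns a policy, it can be saved to a file and applied with `aws glue put-resource-policy --policy-in-json file://<policy-file>`. If it returns None, removing the policy with `aws glue delete-resource-policy` is the corresponding step.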

Backout Plan:

If restricting access to the Glue Data Catalog causes issues:

  1. Identify the affected IAM roles and policies.

  2. Revert any changes made to restrict access by:

    • Updating the IAM policy or Data Catalog resource policy to restore the specific access that was removed.

    • Restoring the previous resource policy or encryption settings from a saved copy, if necessary.

  3. Ensure that proper access control is re-established without exposing the Data Catalog publicly.
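A backout is only possible if the previous policy was saved before the change. The sketch below shows one way to keep a timestamped local copy of the resource policy text and read it back later; the file naming scheme and function names are assumptions for illustration, and two backups taken within the same second would overwrite each other.

```python
import json
import tempfile
import time
from pathlib import Path


def save_policy_backup(policy_json, backup_dir):
    """Validate the policy text and write it to a timestamped file."""
    json.loads(policy_json)  # fail early if the text is not valid JSON
    stamp = time.strftime("%Y%m%dT%H%M%S")
    path = Path(backup_dir) / f"glue-resource-policy-{stamp}.json"
    path.write_text(policy_json)
    return path


def load_policy_backup(path):
    """Read a saved policy back and confirm it is still valid JSON."""
    text = Path(path).read_text()
    json.loads(text)
    return text


# Round trip with a placeholder policy in a temporary directory.
ORIGINAL_POLICY = json.dumps({"Version": "2012-10-17", "Statement": []})
with tempfile.TemporaryDirectory() as tmp:
    saved_path = save_policy_backup(ORIGINAL_POLICY, tmp)
    restored = load_policy_backup(saved_path)
```

The restored text can be written to a file and re-applied with `aws glue put-resource-policy --policy-in-json file://<policy-file>`, after confirming that it does not reintroduce a public statement.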

Note:

  • Cross-Account Access: If cross-account access to the Glue Data Catalog is required, ensure that the Data Catalog resource policy and the IAM policies limit permissions to the required accounts and resources, avoiding public access.

  • KMS Key Management: If using SSE-KMS, ensure the KMS key used is securely managed, rotated, and only accessible by authorized users.
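For the cross-account case, a resource policy can be compared against a list of approved account IDs. The sketch below is one way to do that, assuming principals are given as account IDs or IAM ARNs. The function names and the example account IDs are placeholders, and any principal value the sketch cannot parse is reported as-is so that it is not silently ignored.

```python
import re

# Matches IAM ARNs such as arn:aws:iam::111122223333:role/name.
_IAM_ARN = re.compile(r"^arn:aws[a-z-]*:iam::(\d{12}):")


def _as_list(value):
    """Wrap a single value in a list; policy elements may be scalar or list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def principal_accounts(principal):
    """Return account IDs (or raw values) named by a Principal element."""
    if principal == "*":
        return {"*"}
    if not isinstance(principal, dict):
        return set()
    accounts = set()
    for value in _as_list(principal.get("AWS")):
        match = _IAM_ARN.match(value)
        if match:
            accounts.add(match.group(1))
        else:
            # Bare account IDs, '*', and unparseable values are kept verbatim.
            accounts.add(value)
    return accounts


def unauthorized_accounts(policy, allowed_accounts):
    """Return principals granted access that are not on the allowlist."""
    found = set()
    for stmt in _as_list(policy.get("Statement")):
        if stmt.get("Effect") == "Allow":
            found |= principal_accounts(stmt.get("Principal"))
    return found - set(allowed_accounts)


# Hypothetical policy; both account IDs are placeholders.
EXAMPLE_POLICY = {
    "Statement": [
        {"Effect": "Allow", "Principal": {"AWS": "arn:aws:iam::111122223333:root"},
         "Action": "glue:GetTable", "Resource": "*"},
        {"Effect": "Allow", "Principal": {"AWS": ["444455556666"]},
         "Action": "glue:GetTable", "Resource": "*"},
    ]
}

if __name__ == "__main__":
    print(unauthorized_accounts(EXAMPLE_POLICY, {"111122223333"}))
```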

References:

CIS Controls Mapping:

| Version | Control ID | Control Description | IG1 | IG2 | IG3 |
|---------|------------|---------------------|-----|-----|-----|
| v8 | 3.4 | Encrypt Data on End-User Devices – Ensure data encryption during file system access. | | | |
| v8 | 6.7 | Implement Application Layer Filtering and Content Control – Ensure appropriate content filtering is applied to sensitive files. | | | |
| v8 | 6.8 | Define and Maintain Role-Based Access Control – Implement and manage role-based access for file systems. | | | |
| v8 | 14.6 | Protect Information Through Access Control Lists – Apply strict access control to file systems. | | | |