Profile Applicability:
- Level 2
Description:
BigQuery tables may store sensitive data that requires classification for security and compliance purposes. Google's Sensitive Data Protection tools can be used to automatically discover, classify, and protect data within BigQuery across an organization.
Rationale:
Classifying data in BigQuery is crucial for managing and protecting sensitive information. Leveraging tools like Google Cloud's Sensitive Data Protection, which employs machine learning and pattern matching, automates the discovery and classification of sensitive data, ensuring robust data governance and reducing the risk of accidental exposure.
Impact:
Cost: Implementing Google Cloud's Sensitive Data Protection or third-party tools incurs additional costs.
Resource Management: Continuous monitoring and classification require regular configuration and oversight.
Default Value:
By default, BigQuery data is not classified unless manually configured.
Audit Steps:
Using Google Cloud Console
Navigate to Cloud DLP.
Confirm the presence of a discovery scan configuration for either the organization or project.
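Where the Console is unavailable, the same check can be sketched against the DLP REST API (a hedged example; `PROJECT_ID` is a placeholder, and the API must be enabled with appropriate IAM permissions):

```shell
# List discovery scan configurations for a project (DLP API v2).
# A non-empty "discoveryConfigs" array indicates profiling is configured.
curl -s -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  "https://dlp.googleapis.com/v2/projects/PROJECT_ID/locations/global/discoveryConfigs"
```

For organization-level configurations, the parent path would be `organizations/ORG_ID/locations/global` instead.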
Remediation Steps:
Enable Data Profiling
Access Cloud DLP Configurations.
Click Create Configuration.
Set up data profiling:
For projects, refer to Profiling Projects.
For organizations or folders, refer to Profiling Organizations or Folders.
Review Findings:
Identify columns or tables with high data risk that contain sensitive data without proper protections.
Mitigation options:
Apply BigQuery policy tags to restrict access to specific roles.
Use de-identification techniques, such as masking or tokenization, to protect sensitive data.
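As a minimal sketch of the policy-tag option, assuming a Data Catalog taxonomy and policy tag already exist (all IDs below are placeholders):

```shell
# 1. Dump the table's current schema.
bq show --schema --format=prettyjson PROJECT_ID:DATASET_NAME.TABLE_NAME > schema.json

# 2. Edit schema.json, adding to the sensitive column's definition:
#    "policyTags": {"names": ["projects/PROJECT_ID/locations/us/taxonomies/TAXONOMY_ID/policyTags/TAG_ID"]}

# 3. Apply the updated schema; access to the tagged column is then
#    limited to principals granted the Fine-Grained Reader role.
bq update PROJECT_ID:DATASET_NAME.TABLE_NAME schema.json
```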
Integrate Findings Into Security Operations:
Publish data profiles to other Google Cloud services. For example, send profile updates to Pub/Sub to automate remediation or to alert on new or changed data risks.
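A hedged sketch of the Pub/Sub wiring (the topic and subscription names are hypothetical):

```shell
# Create a topic and subscription to receive data profile updates.
gcloud pubsub topics create dlp-profile-updates
gcloud pubsub subscriptions create dlp-profile-alerts --topic=dlp-profile-updates

# After pointing the discovery configuration's Pub/Sub action at this
# topic, pull messages to drive alerting or automated remediation:
gcloud pubsub subscriptions pull dlp-profile-alerts --auto-ack --limit=5
```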
Backout Plan:
Step 1: Disable Data Profiling & Classification Scans
If classification scans cause disruptions, delete the running Cloud DLP job via the DLP REST API:
curl -X DELETE -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  "https://dlp.googleapis.com/v2/projects/PROJECT_ID/dlpJobs/JOB_ID"
Replace PROJECT_ID and JOB_ID with the actual project and data profiling job IDs. To stop future scans, pause or delete the discovery scan configuration in the Cloud DLP console.
Step 2: Remove Data Classification Labels
If classification labels were applied incorrectly, remove them from the dataset (one --clear_label flag per label key, with a trailing colon):
bq update --clear_label LABEL_KEY: PROJECT_ID:DATASET_NAME
Replace LABEL_KEY, PROJECT_ID, and DATASET_NAME accordingly.
Step 3: Notify Stakeholders
- Inform data security teams before making classification changes.