Description:

Syncing users and groups from Microsoft Entra ID (formerly known as Azure Active Directory) to Azure Databricks ensures that access control in Databricks is consistent with the identities and roles defined in the organization's Entra ID environment. This process allows Databricks to authenticate users and apply appropriate access control based on their group membership and role.

Rationale:

Syncing users and groups from Microsoft Entra ID to Azure Databricks provides centralized identity management and simplifies user access and group management. It also ensures that access policies and permissions in Databricks are automatically updated when changes are made in Microsoft Entra ID, reducing administrative overhead and improving security by leveraging existing identity and access management (IAM) systems.

Impact:

Enabling synchronization of users and groups from Entra ID to Azure Databricks ensures that Databricks inherits the same user roles, permissions, and access control mechanisms used in Entra ID. This helps to streamline user management processes but requires proper configuration of both Entra ID and Databricks to ensure seamless synchronization. Any issues with the configuration may result in delayed or failed user synchronization.

Default Value:

By default, users and groups are not synced from Microsoft Entra ID to Azure Databricks. Synchronization must be configured manually.

Pre-requisites:

  • Azure account with access to both Microsoft Entra ID and Azure Databricks.

  • Admin access to both Microsoft Entra ID and Azure Databricks.

  • Microsoft Entra ID (formerly Azure Active Directory, AAD) sync is configured for your Databricks workspace.

  • The user must have appropriate permissions to configure user synchronization (e.g., Databricks Admin, Global Admin).

Audit:

  1. Sign in to the Azure portal as a Databricks Admin or Global Admin.

  2. Navigate to the Azure Databricks workspace.

  3. Review the User Management section in the Azure Databricks Admin Console to confirm that users and groups are synced from Microsoft Entra ID.

  4. Verify that users and groups are correctly populated in Databricks and match the ones in Microsoft Entra ID.
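
The comparison in step 4 can be scripted once the user or group names have been exported from both systems. The following is a minimal sketch, not a definitive implementation: it assumes the names are already available as plain lists (for example, from a Microsoft Entra ID export and from the Databricks user list), and the names `SyncDiff` and `compare_identities` are hypothetical helpers introduced only for illustration. Names are compared case-insensitively, on the assumption that sign-in names differ only in case between the two systems.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class SyncDiff:
    """Result of comparing identities in Entra ID against Databricks."""
    missing_in_databricks: List[str]      # present in Entra ID only
    unexpected_in_databricks: List[str]   # present in Databricks only

    @property
    def in_sync(self) -> bool:
        return not self.missing_in_databricks and not self.unexpected_in_databricks


def compare_identities(entra_names: Iterable[str],
                       databricks_names: Iterable[str]) -> SyncDiff:
    """Compare two collections of user or group names (hypothetical helper).

    Names are trimmed and lower-cased before comparison, which is an
    assumption about how the two systems format sign-in names.
    """
    entra = {name.strip().lower() for name in entra_names}
    databricks = {name.strip().lower() for name in databricks_names}
    return SyncDiff(
        missing_in_databricks=sorted(entra - databricks),
        unexpected_in_databricks=sorted(databricks - entra),
    )


# Illustrative data only; real lists would come from exports of each system.
entra_users = ["Alice@contoso.com", "bob@contoso.com", "carol@contoso.com"]
databricks_users = ["alice@contoso.com", "bob@contoso.com", "dave@contoso.com"]

diff = compare_identities(entra_users, databricks_users)
print(diff.in_sync)                   # False
print(diff.missing_in_databricks)     # ['carol@contoso.com']
print(diff.unexpected_in_databricks)  # ['dave@contoso.com']
```

An empty result in both lists indicates that the two exports match. Any name reported as missing or unexpected should be investigated before the audit is considered passed.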

Implementation Steps:

  1. Sign in to the Azure portal with Databricks Admin or Global Admin credentials.

  2. Enable Microsoft Entra ID (formerly Azure Active Directory, AAD) integration for the Databricks workspace:

    • In the Azure portal, navigate to your Azure Databricks workspace.

    • Under Workspace Settings, locate the Identity and Access Management section.

    • Select Azure Active Directory to configure Entra ID synchronization.

  3. Enable the synchronization of users and groups:

    • In the Azure Databricks workspace, go to the Admin Console.

    • Navigate to the User Management section and select Sync with Azure AD.

    • Choose the option to sync users and groups from Microsoft Entra ID to Databricks.

    • Specify which groups and users from Entra ID should have access to the Databricks workspace. You may also specify which groups should be assigned certain roles within Databricks.

  4. Test the synchronization:

    • After completing the setup, verify that users and groups from Entra ID are correctly synced in the Databricks workspace.

    • Ensure that users can log in to Databricks using their Entra ID credentials.

    • Verify that group memberships and roles are correctly reflected in Databricks.
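
To support the verification in step 4, the users present in the workspace can be listed programmatically rather than read from the console. The sketch below assumes the workspace exposes the Databricks SCIM Users endpoint at `/api/2.0/preview/scim/v2/Users` and accepts a bearer token. The host, the token, and the helper names `make_http_fetcher` and `iter_user_names` are illustrative assumptions and not part of this control. Paging follows the standard SCIM `startIndex` and `count` parameters, and the page fetcher is passed in as a function so the paging logic can be exercised without network access.

```python
import json
import urllib.parse
import urllib.request
from typing import Callable, Dict, Iterator


def make_http_fetcher(host: str, token: str) -> Callable[[int, int], Dict]:
    """Build a page fetcher for the workspace SCIM Users endpoint.

    `host` (e.g. the workspace URL) and `token` are placeholders supplied
    by the caller; the endpoint path is an assumption about the workspace.
    """
    def fetch_page(start_index: int, count: int) -> Dict:
        query = urllib.parse.urlencode({"startIndex": start_index, "count": count})
        request = urllib.request.Request(
            f"{host}/api/2.0/preview/scim/v2/Users?{query}",
            headers={"Authorization": f"Bearer {token}"},
        )
        with urllib.request.urlopen(request) as response:
            return json.load(response)
    return fetch_page


def iter_user_names(fetch_page: Callable[[int, int], Dict],
                    page_size: int = 100) -> Iterator[str]:
    """Yield every userName, following SCIM pagination (1-based startIndex)."""
    start_index = 1
    while True:
        page = fetch_page(start_index, page_size)
        resources = page.get("Resources", [])
        for user in resources:
            yield user["userName"]
        start_index += len(resources)
        total = page.get("totalResults")
        # Stop on an empty page, or once every reported result has been read.
        if not resources or (total is not None and start_index > total):
            break


# In-memory stand-in for the HTTP fetcher, used here to show the paging logic.
_fake_users = [{"userName": f"user{i}@contoso.com"} for i in range(5)]


def fake_fetch_page(start_index: int, count: int) -> Dict:
    chunk = _fake_users[start_index - 1:start_index - 1 + count]
    return {"totalResults": len(_fake_users), "Resources": chunk}


names = list(iter_user_names(fake_fetch_page, page_size=2))
print(len(names))  # 5
```

Against a real workspace, `iter_user_names(make_http_fetcher(host, token))` would replace the in-memory fetcher, with `host` and `token` supplied by the administrator. The resulting names can then be compared with the users expected from Microsoft Entra ID.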

Backout Plan:

  1. Sign in to the Azure portal as a Databricks Admin or Global Admin.

  2. Navigate to the Azure Databricks workspace and go to the Admin Console.

  3. Disable Microsoft Entra ID (formerly Azure Active Directory, AAD) integration or disconnect user synchronization:

    • In the User Management section, stop the synchronization process by selecting Stop syncing users and groups.

  4. Revert any group or user roles that were assigned during synchronization if necessary.

  5. Test sign-in to confirm that users can still access Databricks and that synchronization is disabled.

References: