Manage Foreign GDC Catalog¶
Prerequisite¶
This guide walks you through the steps to set up HMS Federation with AWS Glue in Unity Catalog. By completing this setup, you can query metadata from an external AWS Glue Data Catalog as if it were natively registered within Unity Catalog.
DBX Doc reference: Databricks Glue HMS Federation Documentation
Limitations¶
- The connector supports only the JDBC connection method for managing permissions in Unity Catalog.
- Column masking and row-level filter (RLF) policies are supported only through native masking and native row-level filtering.
- Databricks Unity Catalog provides read-only access to the federated Glue Data Catalog (GDC).
- The name of a foreign catalog must not exceed 236 characters.
Setup¶
Note
- The connector automatically detects whether a catalog is foreign or regular when loading resources. No additional configuration is required to specify the catalog type.
Warning
- Values are case-sensitive.
- Use fully qualified names for schemas, tables/views, and functions (e.g., catalog1.schema1.*).
- Replace all example values with your actual resource names.
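The naming rule in the warning above can be checked before values are entered in the configuration. The following is a minimal sketch, not part of the connector: the helper `is_fully_qualified` and the part counts per resource type are assumptions made for illustration, and the check treats `*` as an ordinary name part.

```python
# Assumed number of dot-separated parts in a fully qualified name.
# This mapping is an illustration, not the connector's own validation logic.
EXPECTED_PARTS = {"schema": 2, "table": 3, "view": 3, "function": 3}


def is_fully_qualified(name: str, resource_type: str) -> bool:
    """Return True if name has the expected number of non-empty parts.

    Case is left untouched because the values are case-sensitive.
    """
    parts = name.split(".")
    return len(parts) == EXPECTED_PARTS[resource_type] and all(parts)


print(is_fully_qualified("catalog1.schema1.*", "table"))  # catalog.schema.<wildcard>
print(is_fully_qualified("schema1.*", "table"))           # catalog part is missing
```

The sketch does not handle names that contain dots inside backtick-quoted identifiers.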
- SSH to the instance where Privacera Manager is installed.
- Run the following command to open the .yml file for editing. If you have multiple connectors, replace instance1 with the appropriate connector instance name.
- Set the following properties to enable the connector to manage permissions for foreign catalogs, schemas, and tables in Databricks Unity Catalog:
- Once the properties are configured, run the following commands to update your Privacera Manager platform instance:
  - Step 1 - Setup, which generates the Helm charts. This step usually takes a few minutes.
  - Step 2 - Apply the Privacera Manager Helm charts.
  - Step 3 - Post-installation, which generates the plugin tarball, updates Route 53 DNS, and so on.
Note
- GRANT statements for write operations (e.g., CREATE TABLE, CREATE SCHEMA) can be applied to foreign catalog resources; however, they will have no effect, as Databricks does not permit write operations on foreign catalogs.
Native Masking and RLF¶
The connector supports only native masking and row-level filtering (RLF) for enforcing column masking and RLF policies on foreign catalog tables.
- To enforce these policies:
- Privacera creates functions and attaches them to the relevant columns (for masking) and tables (for row filters).
- These functions are stored in a new schema within a catalog that is local to the Databricks workspace.
- Since GDC foreign catalogs do not support the creation of new schemas, the connector automatically creates a local catalog in the Databricks workspace.
- A corresponding schema, matching the name of the original schema in the foreign catalog, will be created in the local catalog to store these functions.
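To make the layout above concrete, the sketch below builds the kind of Databricks SQL statements involved in attaching a mask function and a row filter function that live in a local catalog. It is an illustration under stated assumptions, not the statements the connector actually issues: the helper `native_policy_sql` and the function names `mask_<table>_<column>` and `rlf_<table>` are hypothetical, and the `CREATE FUNCTION` statements are omitted because their bodies depend on the policy.

```python
from typing import List


def native_policy_sql(foreign_table: str, local_catalog: str,
                      mask_column: str, filter_column: str) -> List[str]:
    """Build illustrative SQL for a mask and a row filter on a foreign table."""
    parts = foreign_table.split(".")
    if len(parts) != 3 or not all(parts):
        raise ValueError("expected a name of the form catalog.schema.table")
    _foreign_catalog, schema, table = parts

    # The local schema reuses the name of the schema in the foreign catalog.
    local_schema = f"{local_catalog}.{schema}"
    # Hypothetical function names; the connector's real names are not documented here.
    mask_fn = f"{local_schema}.mask_{table}_{mask_column}"
    filter_fn = f"{local_schema}.rlf_{table}"

    return [
        f"CREATE CATALOG IF NOT EXISTS {local_catalog}",
        f"CREATE SCHEMA IF NOT EXISTS {local_schema}",
        # The functions themselves would be created here (omitted).
        f"ALTER TABLE {foreign_table} ALTER COLUMN {mask_column} SET MASK {mask_fn}",
        f"ALTER TABLE {foreign_table} SET ROW FILTER {filter_fn} ON ({filter_column})",
    ]


for statement in native_policy_sql(
    "test_glue_catalog.sales.orders",        # example foreign table
    "privacera_security_test_glue_catalog",  # local catalog for that foreign catalog
    "email",
    "region",
):
    print(statement)
```

The table, column, and schema names in the usage example are placeholders. The point of the sketch is that the functions are stored in the local catalog while the mask and row filter are attached to the table in the foreign catalog.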
Local Catalog Naming Convention¶
- The local catalog name is automatically generated using a predefined prefix followed by the name of the foreign catalog.
- By default, this prefix is set to privacera_security_ to ensure the local catalog is easily identifiable.
- Example: If the foreign catalog name is test_glue_catalog, the corresponding local catalog will be named privacera_security_test_glue_catalog.
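The naming convention can be expressed as a short helper. This is a sketch for illustration only: the function `local_catalog_name` is hypothetical, and the 236-character check simply restates the limit from the Limitations section. With the default 19-character prefix, a 236-character foreign catalog name produces a 255-character local catalog name, which likely explains the limit, assuming Unity Catalog's 255-character cap on object names.

```python
# Default prefix as stated in this guide.
DEFAULT_PREFIX = "privacera_security_"
# Maximum foreign catalog name length from the Limitations section.
MAX_FOREIGN_NAME_LEN = 236


def local_catalog_name(foreign_catalog: str, prefix: str = DEFAULT_PREFIX) -> str:
    """Derive the local catalog name as prefix + foreign catalog name."""
    if not foreign_catalog:
        raise ValueError("foreign catalog name must not be empty")
    if len(foreign_catalog) > MAX_FOREIGN_NAME_LEN:
        raise ValueError(
            f"foreign catalog name exceeds {MAX_FOREIGN_NAME_LEN} characters"
        )
    # Names are case-sensitive, so the input is used unchanged.
    return prefix + foreign_catalog


print(local_catalog_name("test_glue_catalog"))
# privacera_security_test_glue_catalog
```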