Prerequisites for Collibra
Before setting up the Collibra connector, ensure that the following prerequisites are met.
Prerequisites
| Prerequisite | Details |
|---|---|
| Collibra URL | The base URL of your Collibra instance, for example https://your-org.collibra.com. The connector calls the Collibra REST API 2.0 (/rest/2.0/...). |
| Collibra Service Account | A Collibra user account with read access to the assets, relations, and tags you want to sync. The connector authenticates to Collibra using HTTP Basic authentication (username and password). A read-only account is sufficient because the connector never writes back to Collibra. |
| Network Connectivity | The Collibra instance must be reachable over HTTPS from the Privacera Platform hosts where the connector runs. |
| Connection(s) to sync | Identify the Collibra connection name(s) under which your data assets are cataloged (for example sbtConnection). The connector walks the asset hierarchy starting from each connection. These are configured in collibra.connection.to.service.mapping. |
| Engine for each connection | Decide the default engine (hive, trino, or snowflake) for each connection. This determines the resource hierarchy used when mapping a Collibra asset to a Ranger resource. |
| Ranger tag services | The target Ranger tag service for each engine must already exist (for example privacera_hive, privacera_trino, privacera_snowflake). These are configured in collibra.ranger.service.mapping. |
| Access connectors for policies | The Collibra connector only syncs tags. To enforce access using those tags, the corresponding access connectors (Hive, Trino, Snowflake, etc.) must be configured separately, and tag-based policies must be created in Privacera. |
| OMNI (optional) | To also sync tags to OMNI Metadata Service (MDS), enable and configure OMNI for PolicySync connectors. See Configuring OMNI for PolicySync Connectors. OMNI is not required for the Ranger tag sync path. |
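Before configuring the connector, it can help to confirm that the service account can reach the REST API over HTTPS with Basic authentication. The following is a minimal sketch and not part of the connector: the helper names `build_probe_request` and `probe` are hypothetical, and using `/rest/2.0/assets?limit=1` as the probe endpoint is an assumption chosen for illustration.

```python
import base64
import urllib.request


def build_probe_request(base_url: str, username: str, password: str) -> urllib.request.Request:
    """Build a read-only GET request against the Collibra REST API 2.0."""
    # Assumption: listing a single asset is a cheap way to confirm read access.
    url = base_url.rstrip("/") + "/rest/2.0/assets?limit=1"
    # HTTP Basic authentication: base64 of "username:password".
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(url, method="GET")
    request.add_header("Authorization", f"Basic {token}")
    request.add_header("Accept", "application/json")
    return request


def probe(base_url: str, username: str, password: str, timeout: float = 10.0) -> int:
    """Send the probe and return the HTTP status code.

    urlopen raises urllib.error.HTTPError for 4xx/5xx responses and
    urllib.error.URLError when the host cannot be reached.
    """
    request = build_probe_request(base_url, username, password)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Calling `probe` from a host where the connector runs exercises both network reachability and credentials. An `HTTPError` with status 401 or 403 would point to an account or permission problem, while a `URLError` would point to a network or TLS problem.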
Collibra asset model expectations
The connector discovers assets by walking Collibra relations and asset names. For best results:
- Each connection's assets should follow the standard catalog hierarchy for the engine:
    - Hive: database → table → column
    - Trino: catalog → schema → table → column
    - Snowflake: database → schema → table → column
- Tags (Collibra tags/labels) are read for every asset in the hierarchy. Assets without tags are simply skipped.
- The connection asset type defaults to Technology Asset. If your Collibra model uses a different asset type for connections, set collibra.connection.asset.type.name accordingly (see Advanced Configuration).
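The per-engine hierarchies listed above can be written down as a small lookup that names each level of an asset path. This is an illustrative sketch only: `ENGINE_HIERARCHY` and `to_resource` are hypothetical names, and the code does not reflect how the connector is implemented internally.

```python
# Resource levels per engine, in order from the top of the hierarchy down.
ENGINE_HIERARCHY = {
    "hive": ["database", "table", "column"],
    "trino": ["catalog", "schema", "table", "column"],
    "snowflake": ["database", "schema", "table", "column"],
}


def to_resource(engine, path):
    """Label each element of an asset path with its level for the given engine.

    `path` is a list of asset names ordered from the top level down. It may
    stop early (for example at the table level) but may not be empty or
    deeper than the engine's hierarchy.
    """
    if engine not in ENGINE_HIERARCHY:
        raise ValueError(f"unsupported engine: {engine!r}")
    levels = ENGINE_HIERARCHY[engine]
    if not 1 <= len(path) <= len(levels):
        raise ValueError(
            f"{engine} paths have 1 to {len(levels)} levels, got {len(path)}"
        )
    # zip stops at the shorter sequence, so partial paths map cleanly.
    return dict(zip(levels, path))
```

For example, `to_resource("trino", ["lake", "sales", "orders"])` labels the three names as catalog, schema, and table, while the same three names under `hive` would be read as database, table, and column. This is why the engine chosen for each connection matters when assets are mapped to Ranger resources.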