Skip to content

Tag-Based Row-Level Filtering

Tag-based Row Level Filter (RLF) policies in Apache Ranger allow you to create dynamic data filtering rules based on tags assigned to resources. This approach provides a flexible and scalable way to implement data access controls without having to create individual policies for each resource.

OMNI Requirement

Tag-based row-level filtering is currently only available for environments with OMNI enabled.

Overview

Tag-based RLF policies enable you to:

  • Define row filter rules once at the tag level instead of creating policies for each table.
  • Automatically apply filtering to all resources tagged with a specific tag.
  • Maintain centralized governance for row-level access controls.
  • Reduce policy management overhead by leveraging tag-based abstractions.

Pre-requisites

Before creating tag-based row-level filter policies, ensure the following requirements are met:

  1. Tag Service Association: The tag service (typically privacera_tag) must be associated with the appropriate resource service.

  2. Resource Tagging: The tag that will be configured in the policy must be associated with appropriate table resources.

  3. OMNI Enabled: Your Privacera environment must have OMNI enabled.

Supported Services

Tag-based row-level filtering is currently supported for the following data platforms:

Service Support Level OMNI Required
Snowflake 🟢 Yes 🟢 Yes
Databricks Unity Catalog 🟢 Yes 🟢 Yes

How It Works

sequenceDiagram
    participant Data Catalog
    participant Admin
    participant TagService
    participant Resource
    participant User
    participant DataPlatform

    Admin->>Data Catalog: Create tag (e.g., CUSTOMER_DATA)
    Admin->>Data Catalog: Tag table with CUSTOMER_DATA
    Admin->>TagService: Create tag-based RLF policy
    User->>DataPlatform: Query tagged table
    DataPlatform->>TagService: Evaluate tag-based RLF policy
    TagService->>User: Return filtered rows based on policy

Row Level Filter (RLF) Merging

Row Level Filter (RLF) merging automatically combines multiple data access rules for the same user and resource into a single rule. The merging logic selects OR or AND operators based on the columns referenced in each filter expression. This consolidation simplifies permission management and improves query performance.

Merging Logic

The RLF merging follows these rules:

  • Same Column → OR: If multiple RLF expressions reference the same column, they are merged using the OR operator.
  • Different Columns → AND: If RLF expressions reference different columns, they are merged using the AND operator.
  • Multiple Columns or Subquery → AND: If an RLF expression references multiple columns (for example, contains an internal AND) or uses a subquery or other complex expression on the left-hand side (LHS), it is merged using the AND operator.

Why RLF Merging Matters

Imagine you're a data analyst named "Sarah" who needs access to data in different ways:

Scenario 1: Same Column (OR Merging)

  • You can see data from the IT department.
  • You can also see data from the Finance department.

Without RLF merging, the system would have to check two separate rules every time you access data. With RLF merging, these rules are combined into one efficient rule that says "Sarah can see data from the IT department OR from the Finance department."

Scenario 2: Different Columns (AND Merging)

  • You can see data from the IT department.
  • You can also see data from the US region.

When the columns are different, RLF merging combines these rules with AND: "Sarah can see data from the IT department AND from the US region."

How RLF Merging Works

Example 1: Same Column Merging (OR)

Before Merging (Multiple Rules):

  • Rule 1: Sarah can access data where department = 'IT'
  • Rule 2: Sarah can access data where department = 'Finance'

After Merging (Single Combined Rule):

  • Combined Rule: Sarah can access data where (department = 'IT') OR (department = 'Finance')

Example 2: Different Columns Merging (AND)

Before Merging (Multiple Rules):

  • Rule 1: Sarah can access data where department = 'IT'
  • Rule 2: Sarah can access data where region = 'US'

After Merging (Single Combined Rule):

  • Combined Rule: Sarah can access data where (department = 'IT') AND (region = 'US')

Example 3: Multiple Columns or Subquery (AND)

Before Merging (Multiple Rules):

  • Rule 1: Sarah can access data where department = 'IT'
  • Rule 2: Sarah can access data where (region = 'US' AND status = 'active')

After Merging (Single Combined Rule):

  • Combined Rule: Sarah can access data where (department = 'IT') AND ((region = 'US' AND status = 'active'))

Example: Complex Merging Scenario

Let's say you're a financial analyst who needs to access customer data based on different criteria:

Scenario: You need to see:

  • Customers from your assigned region (US) - region = 'US'.
  • Customers from your department (Finance) - department = 'Finance'.
  • High-value customers (regardless of location) - customer_value = 'high'.

Before Merging:

  • System checks 3 separate rules every time you query data.
  • Slower performance due to multiple rule evaluations.
  • More complex to manage and troubleshoot.

After Merging:

Since all three expressions reference different columns (region, department, customer_value), they are merged with AND: * System checks 1 combined rule: (region = 'US') AND (department = 'Finance') AND (customer_value = 'high'). * Faster performance with single rule evaluation. * Easier to understand and manage.

Note: If multiple rules reference the same column (for example, region = 'US' and region = 'CA'), they are merged using the OR operator.

Benefits of RLF Merging

  1. Performance: Single rule evaluation instead of multiple checks
  2. Efficiency: Reduced computational overhead
  3. Manageability: Easier to understand combined access logic
  4. Scalability: Better handling of complex access scenarios

Components of a Tag-Based RLF Policy

Field Description
Tag The tag name (e.g., CUSTOMER_DATA, REGIONAL_DATA)
Row Filter Expression SQL condition using column names, constants, or macros (e.g., ABAC macros)
Policy Scope Users, groups, roles to which the policy applies
Conditions Optional policy conditions using macros like IS_IN_GROUP(...) or ABAC

Creating Tag-Based RLF Policies

Step 1: Create Tag-Based Row Filter Policy

  1. Go to Access Management > Tag Service
  2. Open the Row Filter Policies tab
  3. Click + Add Policy
  4. Select the tag (e.g., CUSTOMER_DATA)
  5. Define the Row Filter Expression:
    SQL
    region = 'US'
    
  6. Assign applicable users, groups, or roles
  7. Save the policy

Step 2: Tag the Resources

  1. Navigate to the Data Catalog section in Governance Hub
  2. Select the appropriate connector application
  3. Select the table you want to protect
  4. Associate the tag (e.g., CUSTOMER_DATA) with the table
  5. The tag-based RLF policy will now automatically apply to this table

Examples

Example 1: Filter by Static Region

Use Case: Users in the us_analyst group should only see rows where the region is 'US' for all tables tagged with REGIONAL_DATA.

Component Value
Tag REGIONAL_DATA
Row Filter Expression region = 'US'
Groups us_analyst

Result: Any table tagged with REGIONAL_DATA will automatically have this filter applied for members of the us_analyst group.


Example 2: Dynamic Filtering Using User Attributes

Use Case: Users should only see rows matching their own region based on their user attribute.

Component Value
Tag CUSTOMER_DATA
Row Filter Expression region = '${{USER.region}}'
Groups data_analysts

Result: Each user will see only the rows where the region column matches their USER.region attribute.


Example 3: Dynamic Filtering Using Role Attributes

Use Case: Users should only see rows matching their role's assigned region.

Component Value
Tag SALES_DATA
Row Filter Expression region = GET_UR_ATTR('region')
Groups sales_team

Result: Users will see rows based on the region attribute defined in their assigned role.


Example 4: Multi-Region Access with Group Membership

Use Case: Users can see rows from multiple regions based on their group memberships.

Component Value
Tag CUSTOMER_DATA
Row Filter Expression region in ('${{UG_NAMES_CSV.replace(",", "','")}}')
Groups public

Result: Users will see rows matching any of their group names (assuming groups correspond to regions).

Dynamic Tag Macros for RLF Policies

Tag-based RLF policies support powerful macros that help you create dynamic policies using tag information from child resources (like columns in a table). These macros make it easy to write policies that reference columns by their tags instead of hardcoding column names.

OMNI and MetadataTagEnricher Required

The TAG_GET_CHILD_RES and GET_CHILD_TAG_ATTR macros are specific to OMNI and are not supported by Apache Ranger itself. These macros can only be used in environments where OMNI is enabled with MetadataTagEnricher configured in the appropriate connectors.

TAG_GET_CHILD_RES Macro

The TAG_GET_CHILD_RES macro finds a column name by looking for a specific tag. Instead of writing the column name directly in your policy, you can ask "which column has this tag?" and use that column in your filter.

Multiple Tagged Columns

If multiple columns in a table have the same tag, the column will be picked in alphabetically ascending order.

Syntax:

Text Only
${{TAG_GET_CHILD_RES('tagName')}}

When to Use It:

  • You want to create reusable policies that work across multiple tables.
  • Column names might change, but their semantic meaning (tag) stays the same.
  • You need to reference a column based on its classification (PII, sensitive, geographic, etc.).

Example 1: Geographic Data Filtering

Business Need: Restrict data access based on the state column, but different tables might call it state, user_state, or location_state.

Setup:

YAML
Table: customer_data
Columns:
  - name
  - state (tagged with "STATE")
  - email
  - phone

Tag Assignment:
{
  "resourceName": "customer_data.state",
  "tagName": "STATE",
  "attributes": {
    "type": "geographic"
  }
}

Policy Expression:

SQL
${{TAG_GET_CHILD_RES('STATE')}} != 'TX'

What Happens:

  1. System looks for a column tagged with "STATE".
  2. Finds the state column.
  3. Replaces the macro with state.
  4. Final SQL filter: state != 'TX'.

Result: Only rows where the state is NOT Texas are visible to users covered by this policy.

Portability Across Tables

If the column was user_state or location_state in other tables, the expression would automatically become user_state != 'TX' or location_state != 'TX' respectively.


Example 2: Multi-Region Data Access

Business Need: Filter data based on the user's region using a region column.

Setup:

YAML
1
2
3
4
5
6
Table: transactions
Columns:
  - transaction_id
  - amount
  - region_code (tagged with "REGION")
  - timestamp

Policy Expression:

SQL
${{TAG_GET_CHILD_RES('REGION')}} IN ('US-WEST', 'US-EAST')

What Happens:

  1. Finds column tagged with "REGION" → region_code.
  2. Final filter: region_code IN ('US-WEST', 'US-EAST').

Result: Only transactions from specified regions are visible.


Example 3: Combining with User Attributes

Business Need: Show data where the customer's country matches the user's assigned country.

Setup:

YAML
1
2
3
4
5
Table: orders
Columns:
  - order_id
  - customer_country (tagged with "COUNTRY")
  - order_date

Policy RLF Expression:

SQL
${{TAG_GET_CHILD_RES('COUNTRY')}} = ${{GET_UG_ATTR('country')}}

What Happens:

  1. TAG_GET_CHILD_RES('COUNTRY')customer_country.
  2. GET_UG_ATTR('country') → Gets user's country attribute, e.g., 'US'.
  3. Final filter: customer_country = 'US'.

Result: Users only see orders from their assigned country.


GET_CHILD_TAG_ATTR Macro

The GET_CHILD_TAG_ATTR macro retrieves an attribute value from a child resource's tag. This lets you use tag configuration data (like consent queries, region codes, or threshold values) in your policy expressions.

Syntax:

Text Only
${{GET_CHILD_TAG_ATTR('tagName', 'attributeName')}}

When to Use It:

  • You need to access metadata stored in tag attributes.
  • You want to centralize configuration in tags rather than hardcoding in policies.
  • Different columns with the same tag type need different attribute values.

Business Need: Show PII data only for customers who have given consent, where the consent query is stored in the tag.

Setup:

YAML
Table: customer_profiles
Columns:
  - customer_id
  - ssn (tagged with "PII")

Tag Configuration:
{
  "tagName": "PII",
  "resource": "customer_profiles.ssn",
  "attributes": {
    "consent_query": "marketing_consent = true",
    "sensitivity": "high",
    "retention_days": "2555"
  }
}

Policy Expression:

SQL
(${{GET_CHILD_TAG_ATTR('PII', 'consent_query')}})

What Happens:

  1. Macro looks for tag "PII" in child resources.
  2. Finds the tag on the ssn column.
  3. Extracts the consent_query attribute value.
  4. Final filter: marketing_consent = true.

Result: Only rows where customers have given marketing consent are visible.


Example 2: Region-Based Filtering with Tag Attributes

Business Need: Filter data based on region code stored in the tag, allowing easy updates without policy changes.

Setup:

YAML
Table: financial_transactions
Columns:
  - transaction_id
  - amount
  - location (tagged with "REGION")

Tag Configuration:
{
  "tagName": "REGION",
  "resource": "financial_transactions.location",
  "attributes": {
    "allowed_regions": "US,CA,MX",
    "restricted_regions": "CN,RU",
    "compliance_level": "high"
  }
}

Policy Expression:

SQL
location IN (${{GET_CHILD_TAG_ATTR('REGION','allowed_regions').replace(/,/g, "','")}})

What Happens:

  1. Macro finds "REGION" tag on location column.
  2. Retrieves allowed_regions attribute → "US,CA,MX".
  3. Final filter: location IN ('US','CA','MX').

Result: Only transactions from allowed regions are visible.


Example 3: Department-Based Data Filtering

Business Need: Filter data based on department access rules stored in tag attributes.

Setup:

YAML
Table: employee_salaries
Columns:
  - employee_id
  - department_code (tagged with "DEPARTMENT")
  - salary

Tag Configuration:
{
  "tagName": "DEPARTMENT",
  "resource": "employee_salaries.department_code",
  "attributes": {
    "allowed_departments": "HR,FINANCE,EXEC",
    "restriction_level": "high",
    "approval_required": "true"
  }
}

Policy Expression:

SQL
department_code IN (${{GET_CHILD_TAG_ATTR('DEPARTMENT', 'allowed_departments').replace(/,/g, "','")}})

What Happens:

  1. Finds "DEPARTMENT" tag on department_code column.
  2. Retrieves allowed_departments"HR,FINANCE,EXEC".
  3. Final filter: department_code IN ('HR','FINANCE','EXEC').

Result: Only salary records for allowed departments are visible.


Example 4: Combining Multiple Tag Attributes

Business Need: Create complex filters using multiple attributes from the same tag.

Setup:

YAML
Table: medical_records
Columns:
  - patient_id
  - diagnosis_code (tagged with "MEDICAL_INFO")
  - treatment_date

Tag Configuration:
{
  "tagName": "MEDICAL_INFO",
  "resource": "medical_records.diagnosis_code",
  "attributes": {
    "min_clearance": "5",
    "allowed_departments": "CARDIOLOGY,NEUROLOGY",
    "consent_required": "true"
  }
}

Policy Expression:

SQL
${{GET_UR_ATTR('clearance_level')}} >= ${{GET_CHILD_TAG_ATTR('MEDICAL_INFO', 'min_clearance')}} 
AND department IN (${{GET_CHILD_TAG_ATTR('MEDICAL_INFO', 'allowed_departments').replace(/,/g, "','")}})

What Happens:

  1. First part: User's clearance level must be >= 5.
  2. Second part: User must be in CARDIOLOGY or NEUROLOGY department.
  3. Both conditions must be true for data access.

Result: Only users with sufficient clearance in allowed departments see the data.


Combining Both Macros

You can use both macros together for even more powerful policies!

Example: Dynamic Column with Dynamic Filtering

Business Need: Filter based on a column identified by tag, using filter criteria stored in another tag's attributes.

Setup:

YAML
Table: customer_data
Columns:
  - customer_id
  - membership_tier (tagged with "MEMBER_TIER")
  - consent_status (tagged with "CONSENT")

Tag Configuration:
{
  "tagName": "MEMBER_TIER",
  "resource": "customer_data.membership_tier",
  "attributes": {
    "allowed_tiers": "GOLD,PLATINUM,DIAMOND"
  }
}

Policy Expression:

SQL
${{TAG_GET_CHILD_RES('MEMBER_TIER')}} IN (${{GET_CHILD_TAG_ATTR('MEMBER_TIER', 'allowed_tiers').replace(/,/g, "','")}})

What Happens:

  1. TAG_GET_CHILD_RES('MEMBER_TIER')membership_tier.
  2. GET_CHILD_TAG_ATTR('MEMBER_TIER', 'allowed_tiers')"GOLD,PLATINUM,DIAMOND".
  3. Final filter: membership_tier IN ('GOLD','PLATINUM','DIAMOND').

Result: Fully dynamic policy that doesn't hardcode column names or filter values!


Key Benefits of Dynamic Tag Macros

TAG_GET_CHILD_RES Benefits:

  • Portability: Policies work across tables with different column names.
  • Maintainability: Change column names without updating policies.
  • Semantic Clarity: Policies express intent (e.g., "STATE") rather than implementation (e.g., "user_state_code").
  • Consistency: Same tag classification works everywhere.

GET_CHILD_TAG_ATTR Benefits:

  • Centralized Configuration: Store filter rules in tags, not policies.
  • Easy Updates: Change tag attributes without touching policies.
  • Reduced Duplication: Reuse tag attributes across multiple policies.
  • Dynamic Behavior: Policies adapt based on tag configuration.

Benefits of Tag-Based RLF

  1. Centralized Management: Define row filter rules once at the tag level instead of for each table.
  2. Automatic Enforcement: New tables tagged with the same tag automatically inherit the filtering rules.
  3. Reduced Policy Count: One tag-based policy can replace dozens or hundreds of resource-specific policies.
  4. Consistent Governance: Ensures consistent filtering rules across all resources with the same tag.

Combining with ABAC

Tag-based RLF policies work seamlessly with Attribute-Based Access Control (ABAC). You can use:

  • User Attributes: ${{USER.<attribute>}} - Filter based on user properties.
  • Role Attributes: GET_UR_ATTR('<attribute>') - Filter based on role properties.
  • Group Membership: UG_NAMES_CSV - Filter based on group memberships.
  • Tag Attributes: TAG.<attribute> - Filter based on tag metadata.

For more information on ABAC macros, see ABAC Macros.

Limitations

  • OMNI Required: Tag-based RLF is only available when OMNI is enabled in your environment.
  • Service Support: Currently limited to Snowflake and Databricks Unity Catalog.
  • Dynamic Tag Macros: The TAG_GET_CHILD_RES and GET_CHILD_TAG_ATTR macros are specific to OMNI and are not supported by Apache Ranger itself. These macros require MetadataTagEnricher to be configured in the appropriate connectors.

Troubleshooting

Policy Not Applied

  • Verify the tag service is associated with the resource service.
  • Confirm the table is correctly tagged.
  • Check that OMNI is enabled in your environment.
  • Ensure the user/group is within the policy scope.

Dynamic Macro Issues

TAG_GET_CHILD_RES not resolving:

  • Verify the tag is correctly associated with a column in the table.
  • Check that MetadataTagEnricher is configured in the connector.
  • If multiple columns have the same tag, remember the alphabetically first column will be selected.
  • Ensure the tag name in the macro matches exactly (case-sensitive).

GET_CHILD_TAG_ATTR returning empty:

  • Confirm the tag has the specified attribute configured.
  • Verify the attribute name spelling and case.
  • Check that the tag is associated with child resources (columns).
  • Ensure attribute values are properly formatted.