Rules in Privacera Discovery¶

Rules in Privacera Discovery define how tags are assigned to data elements by combining logic from dictionaries and patterns. They provide the intelligence to determine what constitutes sensitive data and ensure consistent tagging across scans.

What Are Rules?¶

A rule specifies the conditions under which a tag is applied. It links one or more classification techniques, such as dictionaries (based on content or column names) and models (heuristic and logic-based), and defines how their results should be interpreted.

Tags assigned via rules can then be used for classification, monitoring, and access control.

Rule Structure¶

Each rule can include the following:

Name: Unique identifier for the rule.
Description (optional): Explains the purpose of the rule.
Must Have: Required dictionary or pattern keys (e.g., m_EMAIL_KEYWORD, c_STATE_LOOKUP). The rule applies only if all these keys are present.
Must Not Have: Exclusion criteria. The rule is skipped if all these keys are present.
Score Type: Defines how the result is handled. Options include:
- REVIEW SCORE: Tag requires manual review.
- AUTO YES SCORE: Tag is auto-assigned without review.
- ACTUAL SCORE: Tag is determined based on calculated score.
Key for Samples: Feature key from which to retrieve sample values.
Output Tags: The tag to apply when the rule matches.
Enabled: Flag to activate or deactivate the rule.
Order: Execution priority (earlier rules take precedence).
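
To make the fields above concrete, the following is a minimal sketch of a rule as a Python data structure. It is only an illustration: the class names, attribute names, and the sample rule `EMAIL_RULE` are hypothetical and do not reflect how Discovery stores rules internally.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class ScoreType(Enum):
    """The three score types listed in the rule structure."""
    REVIEW_SCORE = "REVIEW SCORE"      # tag requires manual review
    AUTO_YES_SCORE = "AUTO YES SCORE"  # tag is auto-assigned without review
    ACTUAL_SCORE = "ACTUAL SCORE"      # tag is determined by the calculated score


@dataclass
class Rule:
    """Hypothetical in-memory representation of a Discovery rule."""
    name: str
    output_tag: str
    must_have: List[str] = field(default_factory=list)
    must_not_have: List[str] = field(default_factory=list)
    score_type: ScoreType = ScoreType.REVIEW_SCORE
    key_for_samples: Optional[str] = None
    description: str = ""
    enabled: bool = True
    order: int = 0


# Hypothetical rule using the example keys from the Must Have description.
email_rule = Rule(
    name="EMAIL_RULE",
    output_tag="EMAIL",
    must_have=["m_EMAIL_KEYWORD"],
    score_type=ScoreType.AUTO_YES_SCORE,
    key_for_samples="m_EMAIL_KEYWORD",
    order=1,
)
```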

Feature Key Naming Conventions¶

When a dictionary or model is created in Discovery, it is configured to operate either on metadata (e.g., column names) or on content (e.g., data values). Based on this selection, Discovery automatically generates a corresponding feature key:

m_ prefix: Indicates the feature is based on metadata and applies to column names.
c_ prefix: Indicates the feature is based on content and applies to actual data values.

For dictionaries, the key also includes a suffix that specifies the match type:
- KEYWORD: Used when matching column names (metadata).
- LOOKUP: Used when matching actual data values (content).

📌 Note: Models do not use suffixes in their generated feature keys. Only the m_ or c_ prefix is added based on their application mode.

Understanding these conventions is essential when building rules. To construct a rule, you must use the appropriate key based on the type of detection you intend to perform.

Example:¶

A dictionary named PERSON_NAME, when applied to metadata, results in a key: m_PERSON_NAME_KEYWORD.
A model named CC_MODEL, when applied to content, results in a key: c_CC_MODEL.

These keys are then referenced in the rule's Must Have or Must Not Have fields to achieve your classification objective. The keys specified in a field are combined with a logical AND to determine whether the rule applies. To achieve an OR condition, create multiple rules that output the same tag.
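
The AND and OR behavior described here can be sketched in a few lines of Python. This is an illustrative model only, not Discovery's implementation: the function names are hypothetical, the key `c_PERSON_NAME_LOOKUP` is assumed by applying the naming convention to the PERSON_NAME dictionary, and the exclusion check follows the wording of the Must Not Have field (the rule is skipped if all exclusion keys are present).

```python
from typing import List, Set, Tuple

# (output tag, Must Have keys, Must Not Have keys)
RuleSpec = Tuple[str, List[str], List[str]]


def rule_matches(must_have: List[str], must_not_have: List[str],
                 detected: Set[str]) -> bool:
    """AND semantics: every Must Have key must be among the detected keys."""
    required_present = all(key in detected for key in must_have)
    # Assumption: the rule is skipped only when all Must Not Have keys are present.
    excluded = bool(must_not_have) and all(key in detected for key in must_not_have)
    return required_present and not excluded


def matching_tags(rules: List[RuleSpec], detected: Set[str]) -> Set[str]:
    """OR semantics come from several rules that output the same tag."""
    return {tag for tag, must_have, must_not_have in rules
            if rule_matches(must_have, must_not_have, detected)}


# Two rules with the same tag: the tag applies if either key is detected.
person_rules: List[RuleSpec] = [
    ("PERSON_NAME", ["m_PERSON_NAME_KEYWORD"], []),
    ("PERSON_NAME", ["c_PERSON_NAME_LOOKUP"], []),
]

print(matching_tags(person_rules, {"c_PERSON_NAME_LOOKUP"}))
```

Listing both keys in a single rule's Must Have field would instead require the column name and the content to match at the same time.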

Key Naming Examples¶

Type	Mode	Prefix	Suffix	Example Key
Dictionary	Metadata	`m_`	`KEYWORD`	`m_PERSON_NAME_KEYWORD`
Dictionary	Content	`c_`	`LOOKUP`	`c_STATE_LOOKUP`
Model	Metadata	`m_`	(none)	`m_CREDIT_CARD_MODEL`
Model	Content	`c_`	(none)	`c_CREDIT_CARD_MODEL`
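
The naming convention in the table can be expressed as a small helper. Discovery generates these keys automatically, so the function below is a hypothetical sketch for reasoning about which key to expect, not part of any Privacera API.

```python
def feature_key(name: str, kind: str, mode: str) -> str:
    """Build the feature key expected for a dictionary or model.

    kind: "dictionary" or "model"
    mode: "metadata" (column names) or "content" (data values)
    """
    prefixes = {"metadata": "m_", "content": "c_"}
    if mode not in prefixes:
        raise ValueError(f"unknown mode: {mode!r}")

    if kind == "dictionary":
        # Dictionaries get a suffix that reflects the match type.
        suffix = "_KEYWORD" if mode == "metadata" else "_LOOKUP"
    elif kind == "model":
        # Models use only the prefix.
        suffix = ""
    else:
        raise ValueError(f"unknown kind: {kind!r}")

    return f"{prefixes[mode]}{name}{suffix}"


print(feature_key("PERSON_NAME", "dictionary", "metadata"))  # m_PERSON_NAME_KEYWORD
print(feature_key("STATE", "dictionary", "content"))         # c_STATE_LOOKUP
print(feature_key("CREDIT_CARD_MODEL", "model", "content"))  # c_CREDIT_CARD_MODEL
```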

Models in Rules¶

Models are heuristic and logic-based techniques used to identify structured data (e.g., credit cards, CVVs). To use a model in a rule:

Define the model under Discovery → Models.
Create a rule with the model key in the Must Have field.

Example:¶

Model: CREDIT_CARD_MODEL
Rule: Match if model CREDIT_CARD_MODEL identifies a credit card number in a column or file
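
This page does not describe how CREDIT_CARD_MODEL works internally. As an illustration of what a heuristic, logic-based check for card numbers can look like, the sketch below uses the Luhn checksum, a widely used validity test for payment card numbers. Whether Discovery's model uses this check is an assumption, and the function name is hypothetical.

```python
def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum.

    Illustrative only: not necessarily what CREDIT_CARD_MODEL does.
    """
    digits = [int(ch) for ch in number if ch.isdigit()]
    # Payment card numbers are typically 13 to 19 digits long.
    if not 13 <= len(digits) <= 19:
        return False

    total = 0
    # Walk from the rightmost digit, doubling every second digit.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


print(luhn_valid("4111 1111 1111 1111"))  # True (a common test number)
print(luhn_valid("4111 1111 1111 1112"))  # False
```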

Creating a Rule – Example Workflow¶

Step 1: Add a Dictionary¶

Go to: Discovery -> Dictionaries -> Add
Example: BANK_DICT for detecting terms like CVV
Type: LOOKUP for content (data values), KEYWORD for column names (headers)

Step 2: Add a Pattern (if needed)¶

Go to: Discovery -> Patterns -> Add
Example: Pattern BANK_CVV with regex \b(\d{3})\b
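
To see what this regex matches before adding it as a pattern, you can try it locally. The sketch below uses Python's `re` module; Discovery's own regex engine may differ in details, and the sample strings are invented for illustration.

```python
import re

# Same expression as the BANK_CVV pattern: a standalone run of exactly 3 digits.
BANK_CVV = re.compile(r"\b(\d{3})\b")

# Longer digit runs (4567, 90210) are not matched because of the word boundaries.
print(BANK_CVV.findall("CVV 123, card ending 4567, zip 90210"))  # ['123']

# Any standalone 3-digit number matches, even when it is not a CVV.
print(BANK_CVV.findall("route 101"))  # ['101']
```

Because the expression matches any isolated three-digit number, it is usually worth pairing it in a rule with a metadata key such as a keyword dictionary on the column name.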

Step 3: Add a Rule¶

Go to: Discovery -> Rules -> Add
Define the name and output tag, then add Must Have keys such as m_BANK_DICT_KEYWORD or BANK_CVV.
Set the order to determine rule priority.

Step 4: Run Scan¶

Upload data and run a scan.
Review the results under Data Inventory to see if the rule was applied correctly.

Rule Execution Priority¶

Rules are evaluated based on their defined order:

Rules at the top of the list are applied first.
When a rule matches all conditions, it applies its tag and stops further evaluation.
Use stricter, high-confidence rules earlier, followed by more general rules.
Use the Reorder Rules option to adjust execution order.
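
The first-match behavior described above can be sketched as follows. This is a simplified model under stated assumptions, not Discovery's implementation: rules are plain dictionaries, only Must Have keys are checked, and the tag `POSSIBLE_CREDIT_CARD` is hypothetical.

```python
from typing import Dict, List, Optional, Set


def first_matching_tag(rules: List[Dict], detected: Set[str]) -> Optional[str]:
    """Evaluate enabled rules in ascending order and stop at the first match."""
    for rule in sorted(rules, key=lambda r: r["order"]):
        if not rule["enabled"]:
            continue
        if all(key in detected for key in rule["must_have"]):
            return rule["tag"]  # later rules are not evaluated
    return None


rules = [
    # General rule, placed later: content match alone.
    {"order": 2, "enabled": True, "tag": "POSSIBLE_CREDIT_CARD",
     "must_have": ["c_CREDIT_CARD_MODEL"]},
    # Strict, high-confidence rule, placed first: column name and content agree.
    {"order": 1, "enabled": True, "tag": "CREDIT_CARD",
     "must_have": ["m_CREDIT_CARD_MODEL", "c_CREDIT_CARD_MODEL"]},
]

print(first_matching_tag(rules, {"m_CREDIT_CARD_MODEL", "c_CREDIT_CARD_MODEL"}))  # CREDIT_CARD
print(first_matching_tag(rules, {"c_CREDIT_CARD_MODEL"}))  # POSSIBLE_CREDIT_CARD
```

If the general rule were ordered first, it would also match columns where both keys are present, and the stricter rule would never be reached.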

Use Cases for Rules¶

Tagging emails using both pattern match and keyword in header (EMAIL, EMAIL_ADDR)
Combining metadata-based and content-based lookups for multi-level detection
Custom rules for industry-specific tags (e.g., ACCOUNT_ID, DOB, NATIONAL_ID)

Conclusion¶

Rules are the decision-making layer in Privacera Discovery. They combine dictionaries, patterns, and logic to accurately tag data elements with relevant classifications. By managing rules effectively, organizations can ensure consistent, precise, and policy-ready data tagging.
