Enabling Realtime Discovery¶
Privacera Discovery supports Realtime discovery, which monitors and scans data in real time. To enable Realtime discovery, you need to set up a few prerequisites and configurations.
Prerequisites¶
Prerequisite | Description |
---|---|
Setting up PKafka Service | This service listens to the messaging queue for audit events. The configuration for each cloud differs slightly and is outlined in the Setup section. |
Although the service is named PKafka, it supports multiple messaging services, such as AWS SQS, Azure Event Hub, and GCP Pub/Sub.
Each cloud provider requires additional prerequisites and configurations. Follow the steps based on the cloud provider.
To configure PKafka with AWS, you need to set up an Amazon SQS queue and an IAM role. These steps are covered below in the section for installing the base Privacera Discovery service. Refer to the Prerequisites -> AWS section.
Prerequisite | Description |
---|---|
AWS SQS Queue | Name of the AWS SQS queue from which change events for AWS S3 and DynamoDB are fetched |
AWS IAM Role | ARN of the AWS IAM Role which has permissions to the SQS Queue. E.g. privacera-discovery-role-privacera-prod |
To configure PKafka with Azure, you need to set up an Event Hub. These steps are covered in the section for installing the base Privacera Discovery service. Refer to the Prerequisites -> Azure section.
Prerequisite | Description |
---|---|
Create an Event Hub namespace, Event Hub, and Consumer Group | Used for real-time scanning, capturing change events for resources, and parallel processing of events. The Event Hub connection string is used to connect the resource to Azure Event Hub. |
Create an Event Subscription | Defines how events are routed from a source to a target. |
To configure PKafka with GCP, you need to set up a Google Logging Sink. These steps are covered in the section for installing the base Privacera Discovery service. Refer to the Prerequisites -> GCP section.
Prerequisite | Description |
---|---|
Create Google Logging Sink | Create a Google Logging Sink to receive the logs from the GCP resources. |
Create Pub/Sub topic | Create a Pub/Sub topic to receive the logs from the Google Logging Sink. |
Setup¶
Copy the vars.pkafka.aws.yml file from config/sample-vars to config/custom-vars and edit the file.
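The copy step can be sketched as a small shell helper. This is a minimal sketch, not the official procedure: the function name `copy_pkafka_vars` is hypothetical, and it assumes you run it from the directory that contains the `config/` folder. Because it takes the cloud name as an argument, the same sketch applies to the Azure and GCP files named in this section.

```bash
#!/usr/bin/env bash
# Sketch: copy the sample PKafka vars file for one cloud into custom-vars.
# Assumption: the current directory contains config/sample-vars.
copy_pkafka_vars() {
  local cloud="$1"                        # aws, azure, or gcp
  local file="vars.pkafka.${cloud}.yml"
  local src="config/sample-vars/${file}"
  local dst="config/custom-vars/${file}"

  if [ ! -f "$src" ]; then
    echo "sample file not found: $src" >&2
    return 1
  fi
  if [ -e "$dst" ]; then
    # Leave an existing custom file untouched so earlier edits are not lost.
    echo "already exists, not overwriting: $dst" >&2
    return 0
  fi
  mkdir -p config/custom-vars
  cp "$src" "$dst"
  echo "copied $src -> $dst"
}
```

For example, `copy_pkafka_vars aws` copies the AWS file, after which you can open config/custom-vars/vars.pkafka.aws.yml in an editor.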
Replace the following placeholders:
- PKAFKA_SQS_ENDPOINT: URL of the Amazon SQS queue. It has the following format, where DEPLOYMENT_ENV_NAME is the name of the deployment environment (e.g. privacera-prod): https://sqs.<AWS_REGION>.amazonaws.com/<ACCOUNT_ID>/privacera_bucket_sqs_<DEPLOYMENT_ENV_NAME>
- PKAFKA_IAM_ROLE_ARN: ARN of the IAM role created for the Privacera Discovery service. E.g. arn:aws:iam::<ACCOUNT_ID>:role/privacera-discovery-role-privacera-prod
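To make the two formats concrete, the following minimal sketch assembles both values from their parts. The function names are hypothetical, and the region, account ID, and names in the usage note are illustrative values only.

```bash
#!/usr/bin/env bash
# Sketch: build the SQS queue URL in the format described above.
build_sqs_endpoint() {
  local region="$1" account_id="$2" env_name="$3"
  printf 'https://sqs.%s.amazonaws.com/%s/privacera_bucket_sqs_%s\n' \
    "$region" "$account_id" "$env_name"
}

# Sketch: build an IAM role ARN from the account ID and the role name.
build_role_arn() {
  local account_id="$1" role_name="$2"
  printf 'arn:aws:iam::%s:role/%s\n' "$account_id" "$role_name"
}
```

For example, `build_sqs_endpoint us-east-1 123456789012 privacera-prod` prints the queue URL for a deployment environment named privacera-prod, and `build_role_arn 123456789012 privacera-discovery-role-privacera-prod` prints the matching role ARN.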
Add or edit the following variables:
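As a minimal sketch of adding or editing a variable, the helper below sets one `KEY: "value"` line in a vars file. It assumes the file is flat YAML with one variable per line and that the variables to set are the two placeholders named in this section; the helper name `set_var` is hypothetical, and values containing double quotes are not handled.

```bash
#!/usr/bin/env bash
# Sketch: add or update one top-level `KEY: "value"` line in a vars file.
# Assumption: flat YAML, one variable per line, no nested keys.
set_var() {
  local file="$1" key="$2" value="$3"
  local tmp
  tmp="$(mktemp)"
  touch "$file"
  # Keep every line except an existing definition of this key.
  grep -v "^${key}:" "$file" > "$tmp" || true
  # Append the new definition at the end of the file.
  printf '%s: "%s"\n' "$key" "$value" >> "$tmp"
  # Write back in place so the original file permissions are kept.
  cat "$tmp" > "$file"
  rm -f "$tmp"
}
```

For example, `set_var config/custom-vars/vars.pkafka.aws.yml PKAFKA_IAM_ROLE_ARN "arn:aws:iam::<ACCOUNT_ID>:role/privacera-discovery-role-privacera-prod"` replaces any existing PKAFKA_IAM_ROLE_ARN line with the new value.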
Follow these steps to configure the AWS SQS queue for real-time scanning¶
Copy the vars.pkafka.azure.yml file from config/sample-vars to config/custom-vars and edit the file.
Add or edit the following variables:
Replace the following placeholders. You can find these values in the Prerequisites -> Azure section.
- PKAFKA_EVENT_HUB: Name of the Event Hub created for realtime scanning. It receives change events (such as object creation, deletion, or modification) from ADLS via Event Grid. Example: discovery-eventhub
- PKAFKA_EVENT_HUB_NAMESPACE: Event Hub namespace created for realtime scanning. Example: discovery-eventhub-namespace
- PKAFKA_EVENT_HUB_CONSUMER_GROUP: Use $Default if you plan to use the default consumer group created by Event Hub; otherwise, provide the unique name of the newly created consumer group.
- PKAFKA_EVENT_HUB_CONNECTION_STRING: Primary connection string of the RootManageSharedAccessKey policy, used to access the Event Hub.
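Before moving on, it can help to confirm that none of the four Azure values was left empty. The following is a minimal sketch under the assumption that the vars file is flat YAML with one `KEY: value` line per variable; the function name `check_event_hub_vars` is hypothetical and is not part of Privacera Manager.

```bash
#!/usr/bin/env bash
# Sketch: report which Azure Event Hub variables are missing or empty.
# Assumption: flat YAML, one `KEY: value` line per variable.
check_event_hub_vars() {
  local file="$1" key missing=0
  for key in PKAFKA_EVENT_HUB \
             PKAFKA_EVENT_HUB_NAMESPACE \
             PKAFKA_EVENT_HUB_CONSUMER_GROUP \
             PKAFKA_EVENT_HUB_CONNECTION_STRING; do
    # A key counts as set when at least one non-space, non-quote
    # character follows the colon (an optional opening quote is allowed).
    if ! grep -Eq "^${key}:[[:space:]]*[\"']?[^[:space:]\"']" "$file"; then
      echo "missing or empty: $key" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

Running `check_event_hub_vars config/custom-vars/vars.pkafka.azure.yml` returns a non-zero status and lists each variable that still needs a value. When setting the consumer group to $Default from a shell, quote it with single quotes so the shell does not expand it.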
Copy the vars.pkafka.gcp.yml file from config/sample-vars to config/custom-vars and edit the file.
Add or edit the following variables:
Enable Realtime Discovery¶
- Log in to Privacera:
  - For Self-Managed, log in to the Privacera Portal.
  - For Data Plane, log in to the Privacera Discovery Admin Console.
- Navigate to Settings > Data Source Registration.
- Edit the application for which you want to enable Realtime discovery.
  Note: Realtime discovery is supported only for AWS S3, Google BigQuery, Google Cloud Storage, and Azure Data Lake Storage.
- Select Application Properties.
- Turn on the Enable Real-Time toggle.
- Click Save.
Restart Privacera Services¶