
Privacera Documentation

Google Sink to Pub/Sub

This topic describes how to use a Sink-based approach, instead of the Cloud Logging API, to read real-time audit logs for real-time scanning in Pkafka for Discovery. The Sink-based approach has the following key advantages:

  • All logs are synchronized to a Sink.

  • The Sink exports logs to a destination Pub/Sub topic.

  • Pkafka subscribes to the Pub/Sub topic, reads the audit data from it, and passes it on to the Privacera topic, which triggers a real-time scan.
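The consumption step above can be sketched as follows. This is a minimal illustration, not Pkafka's actual implementation: the parsing follows the standard Cloud Audit Logs LogEntry JSON shape, but the sample message is hand-written for the example.

```python
import json

def extract_gcs_event(message_data: bytes) -> dict:
    """Parse an audit log entry delivered via the Pub/Sub sink and pull out
    the fields a scanner would care about: method, bucket, and object."""
    entry = json.loads(message_data.decode("utf-8"))
    proto = entry.get("protoPayload", {})
    labels = entry.get("resource", {}).get("labels", {})
    return {
        "method": proto.get("methodName"),    # e.g. storage.objects.create
        "bucket": labels.get("bucket_name"),
        "object": proto.get("resourceName"),
    }

# Hand-written sample resembling a sink-exported Cloud Audit Logs entry
sample = json.dumps({
    "protoPayload": {
        "methodName": "storage.objects.create",
        "resourceName": "projects/_/buckets/bucket-to-be-scanned/objects/data.csv",
    },
    "resource": {"type": "gcs_bucket",
                 "labels": {"bucket_name": "bucket-to-be-scanned"}},
}).encode("utf-8")

event = extract_gcs_event(sample)
print(event["method"], event["bucket"])
```

In the real pipeline, `message_data` would be the payload of a message pulled from the Pub/Sub subscription created below.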

You need to create the following resources on the Google Cloud Console:

  1. Destination to write logs from the Sink. The following destinations are available:

    a. Cloud Storage

    b. Pub/Sub Topic

    c. BigQuery

    In this document, a Pub/Sub topic is used as the Sink destination.

  2. Create a Sink

Create Pub/Sub topic

  1. Log on to Google Cloud Console and navigate to Pub/Sub topics page.

  2. Click + CREATE TOPIC.

  3. In the Create a topic dialog, enter the following details:

    • Enter a unique topic name in the Topic ID field. For example, DiscoverySinkTopic.

    • Select the Add a default subscription checkbox.

  4. Click CREATE TOPIC.


    If required, you can create a subscription in a later stage, after creating the topic, by navigating to Topic > Create Subscription > Create a simple subscription.

    Note down the subscription name as it will be used inside a property in Discovery.

  5. If you created a default subscription or a new subscription, change the following properties:

    • Acknowledgement deadline: Set to 600 seconds.

    • Retry policy: Select Retry after exponential backoff delay and enter the following values:

      • Minimum backoff (seconds): 10

      • Maximum backoff (seconds): 600

  6. Click Update.


    You can configure GCS lineage time using custom properties that are not readily apparent by default. See Set custom Discovery properties on Privacera Platform.
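With the retry policy above, redelivery delays grow exponentially between the two bounds. The following sketch shows the resulting delay sequence; the doubling growth factor is an assumption for illustration, as Pub/Sub does not document the exact curve.

```python
def backoff_delays(minimum: int = 10, maximum: int = 600, attempts: int = 8):
    """Exponential backoff delays in seconds, doubling from `minimum`
    and capped at `maximum` (the values configured above)."""
    delays = []
    delay = minimum
    for _ in range(attempts):
        delays.append(delay)
        delay = min(delay * 2, maximum)
    return delays

print(backoff_delays())  # [10, 20, 40, 80, 160, 320, 600, 600]
```

The cap at 600 seconds matches the acknowledgement deadline, so an unacknowledged message is retried at most every 10 minutes.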

Create a Sink

  1. Log in to the Google Cloud Console and navigate to the Logs Router page. You can also perform this action from the Logs Explorer page by navigating to Actions > Create Sink.

  2. Click CREATE SINK.

  3. Enter Sink details:

    a. Sink name (Required): Enter an identifier for the Sink.

    b. Sink description (Optional): Describe the purpose, or use case for the Sink.

    c. Click NEXT.

  4. Enter the Sink destination:

    a. In Select Sink service, choose the service where you want your logs routed. The following services and destinations are available:

    • Cloud Logging logs bucket: Select or create a Logs Bucket.

    • BigQuery: Select or create the particular dataset to receive the exported logs. You also have the option to use partitioned tables.

    • Cloud Storage: Select or create the particular Cloud Storage bucket to receive the exported logs.

    • Pub/Sub: Select or create the particular topic to receive the exported logs.

    • Splunk: Select the Pub/Sub topic for your Splunk service.

    • Other project: Enter the Google Cloud service and destination in the following format:

      For example, if your export destination is a Pub/Sub topic, the Sink destination will be as follows:
  5. Choose which logs to include in the Sink:

    Build an inclusion filter: Enter a filter to select the logs that you want routed to the Sink's destination. For example:

    (resource.type="gcs_bucket" AND
    resource.labels.bucket_name="bucket-to-be-scanned" AND
    (protoPayload.methodName="storage.objects.create" OR
    protoPayload.methodName="storage.objects.delete" OR
    protoPayload.methodName="storage.objects.get"))

    Add all of the bucket names you want to scan in the above filter as resources in Discovery.

    In case of multiple buckets, specify them as an "OR" condition, for example:

    (resource.type="gcs_bucket" AND
    (resource.labels.bucket_name="bucket_1" OR
    resource.labels.bucket_name="bucket_2" OR
    resource.labels.bucket_name="bucket_3"))

    In the above example, three buckets are identified to be scanned: bucket_1, bucket_2, and bucket_3.

  6. Click DONE.
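When many buckets are in scope, the inclusion filter from step 5 can be generated rather than hand-edited. The filter syntax follows the Cloud Logging query language shown above; the helper itself is an illustrative sketch, not part of the product.

```python
def build_inclusion_filter(buckets,
                           methods=("storage.objects.create",
                                    "storage.objects.delete",
                                    "storage.objects.get")) -> str:
    """Build a Cloud Logging inclusion filter that matches the given
    GCS buckets and object-level audit log methods."""
    bucket_clause = " OR ".join(
        f'resource.labels.bucket_name="{b}"' for b in buckets)
    method_clause = " OR ".join(
        f'protoPayload.methodName="{m}"' for m in methods)
    return (f'(resource.type="gcs_bucket" AND ({bucket_clause}) '
            f'AND ({method_clause}))')

print(build_inclusion_filter(["bucket_1", "bucket_2", "bucket_3"]))
```

Paste the resulting string into the Build an inclusion filter field; remember that each bucket listed here must also be added as a resource in Discovery.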

Cross-project scanning

  • For cross-project scanning of GCS & GBQ resources, you need to create a Sink in the other project and set its destination to a Pub/Sub topic in project one.

  • You can follow the same steps as above for creating the Sink, setting the destination by navigating to Destination > Select as Other project and entering the Pub/Sub topic name in the following format:


  • To access the Sink created in another project, you need to add the Sink writer identity service account in the IAM administration page of the project where you have the Pub/Sub topic and the VM instance present.

  • To get the Sink Writer Identity, perform the following steps:

    • Go to the Logs Router page > select the Sink > select the dots icon > select Edit Sink Details > Writer Identity section, copy the service account.

    • Go to the IAM Administration page of the project where you have the Pub/Sub Topic and the VM instance > select Add member > Add the service account of the Writer Identity of the Sink created above.

    • Choose the Owner and Editor roles.

    • Click Save. Verify whether the service account which you added is present as a member on the IAM Administration page.
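For reference, Cloud Logging sink destinations that point at Pub/Sub use the standard path pattern `pubsub.googleapis.com/projects/PROJECT_ID/topics/TOPIC_ID`. A tiny helper to assemble it, using example project and topic names from this document (the pattern is standard Cloud Logging behavior; the helper is illustrative):

```python
def pubsub_sink_destination(project_id: str, topic_id: str) -> str:
    """Format a Pub/Sub topic as a Cloud Logging sink destination path."""
    return f"pubsub.googleapis.com/projects/{project_id}/topics/{topic_id}"

# Cross-project case: the Sink lives in another project, but the
# destination references the topic in the project running Discovery.
print(pubsub_sink_destination("project-one", "DiscoverySinkTopic"))
# pubsub.googleapis.com/projects/project-one/topics/DiscoverySinkTopic
```

This is the value to enter under Destination > Select as Other project when creating the cross-project Sink.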

Google Sink configuration properties

  • Add the following properties to the file: vars.pkafka.gcp.yml

  • For the above property, set the value to the name of the subscription created for the Pub/Sub topic.

Note that the Subscription ID can be used as the value of the above property.