Prerequisites for Discovery on AWS¶
Note
The prerequisites for Privacera Discovery are the same for both Self-Managed and PrivaceraCloud Data Plane deployments.
The Privacera Discovery module uses AWS services such as S3, DynamoDB, and SQS to scan data. To enable this, you need to create the necessary AWS resources, along with IAM roles that let the Discovery and Portal pods access them. Privacera Manager can create these resources for you, or you can create them manually and provide their ARNs during the installation of the Discovery module.
Here are the prerequisites for setting up Privacera Discovery on AWS:
Prerequisites | Description |
---|---|
S3 bucket and path | The S3 bucket and path where the configurations and temporary files for Discovery are stored. |
DynamoDB tables | Used to store metadata and tags. |
SQS | Used when real-time scanning is enabled. The change events for the S3 objects are retrieved from the SQS queue. |
IAM Role for Privacera Manager (optional) | IAM role that allows Privacera Manager to create the AWS resources required by Privacera Discovery. |
IAM Role for Discovery and Portal pods | IAM Role for Service Account (IRSA) for the Discovery driver, executor, consumer, and portal pods to access AWS resources. |
Add trust relationship to the IAM role | The IAM role for the Discovery driver, executor, consumer, and portal pods must be updated with a trust relationship condition to grant access. |
AWS S3 bucket and path¶
An AWS S3 bucket and path are required to store the configuration for Privacera Discovery. It is recommended to create a dedicated bucket for Privacera Discovery. You can use a sub-folder in the bucket to store the configurations and temporary files for Discovery. Make sure that the IAM roles for the Discovery and Portal pods have read/write access to this bucket or sub-folder.
You will need to provide the bucket name and path to Privacera Manager during the installation configuration. You can create this bucket manually or let Privacera Manager create it for you.
Examples (replace acme or acme-prod with your bucket name):
- Separate bucket for each environment:
DISCOVERY_BUCKET_NAME: s3://acme-prod/privacera-discovery-config/privacera-prod
- Common bucket, but different folders for each environment:
DISCOVERY_BUCKET_NAME: s3://acme/privacera-discovery-config/privacera-prod
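The DISCOVERY_BUCKET_NAME values above combine a bucket name and a path in one s3:// location. As a minimal sketch for checking such a value before handing it to Privacera Manager, the hypothetical helper below (not part of Privacera Manager) splits it into its bucket and path parts:

```python
from urllib.parse import urlparse


def split_discovery_location(value: str) -> tuple:
    """Split an s3://bucket/path value into (bucket, path).

    Hypothetical helper for sanity-checking a DISCOVERY_BUCKET_NAME value.
    """
    parsed = urlparse(value)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"expected an s3://bucket/path value, got {value!r}")
    # netloc is the bucket; the rest is the folder inside the bucket.
    return parsed.netloc, parsed.path.strip("/")


if __name__ == "__main__":
    bucket, path = split_discovery_location(
        "s3://acme/privacera-discovery-config/privacera-prod"
    )
    print(bucket, path)
```

The bucket part is what the IAM policies need to reference, while the path part identifies the per-environment folder.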
AWS DynamoDB tables¶
AWS DynamoDB tables are required to store the metadata for Privacera Discovery. During installation, Privacera Discovery will create the DynamoDB tables. If you are creating the tables manually, you can use the following naming convention and schema, but replace <DEPLOYMENT_ENV_NAME> with your actual deployment environment name.
Tip
It is recommended to suffix table names with the DEPLOYMENT_ENV_NAME (e.g., privacera-prod) to avoid conflicts with other deployments.
The table names and their corresponding hash keys and range keys are as follows:
Table Name | Hash Key | Hash Key Type | Range Key | Range Key Type |
---|---|---|---|---|
privacera_scan_requests_<DEPLOYMENT_ENV_NAME> | scan_id | S | id | S |
privacera_resource_v2_<DEPLOYMENT_ENV_NAME> | appCode | S | id | S |
privacera_alert_<DEPLOYMENT_ENV_NAME> | id | S | id | S |
privacera_audit_summary_<DEPLOYMENT_ENV_NAME> | appCode | S | id | S |
privacera_active_scans_<DEPLOYMENT_ENV_NAME> | topicName | S | id | S |
privacera_state_<DEPLOYMENT_ENV_NAME> | id | S | | |
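If you create the tables manually, each row of the table above maps onto the key schema of a DynamoDB CreateTable request. The sketch below builds that part of the request as a plain dictionary for a few of the tables. The helper name is hypothetical, and capacity or billing settings are deliberately left out because this page does not specify them.

```python
from typing import Optional


def table_request(base_name: str, env: str, hash_key: str,
                  range_key: Optional[str] = None) -> dict:
    """Build the naming and key-schema part of a CreateTable request.

    All keys in the table above are strings (type S). Billing and
    capacity settings are intentionally omitted from this sketch.
    """
    key_schema = [{"AttributeName": hash_key, "KeyType": "HASH"}]
    attributes = [{"AttributeName": hash_key, "AttributeType": "S"}]
    if range_key is not None:
        key_schema.append({"AttributeName": range_key, "KeyType": "RANGE"})
        attributes.append({"AttributeName": range_key, "AttributeType": "S"})
    return {
        "TableName": f"{base_name}_{env}",
        "KeySchema": key_schema,
        "AttributeDefinitions": attributes,
    }


if __name__ == "__main__":
    env = "privacera-prod"  # replace with your DEPLOYMENT_ENV_NAME
    for request in (
        table_request("privacera_scan_requests", env, "scan_id", "id"),
        table_request("privacera_resource_v2", env, "appCode", "id"),
        table_request("privacera_state", env, "id"),  # hash key only
    ):
        print(request["TableName"], request["KeySchema"])
```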
AWS SQS¶
An AWS SQS queue is used for real-time scanning. The change events for the S3 objects are retrieved from the SQS queue and processed by the Discovery pods. Privacera Manager can create the queue for you, or you can create it manually. The recommended naming convention for the SQS queue is privacera_bucket_sqs_DEPLOYMENT_ENV_NAME.
Example (replace privacera-prod with your DEPLOYMENT_ENV_NAME):
privacera_bucket_sqs_privacera-prod
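When you create the queue manually, it can help to derive the name from the environment name and check it against the SQS naming rules (up to 80 characters: letters, digits, hyphens, and underscores). A minimal sketch, with a hypothetical helper name:

```python
import re

# SQS standard queue names: 1 to 80 characters, alphanumerics, hyphens, underscores.
SQS_NAME_PATTERN = re.compile(r"[A-Za-z0-9_-]{1,80}")


def sqs_queue_name(env: str) -> str:
    """Return the recommended queue name for a deployment environment."""
    name = f"privacera_bucket_sqs_{env}"
    if not SQS_NAME_PATTERN.fullmatch(name):
        raise ValueError(f"{name!r} is not a valid SQS queue name")
    return name


if __name__ == "__main__":
    print(sqs_queue_name("privacera-prod"))
```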
IAM Policies For Discovery¶
There are two sets of IAM policies required for Privacera Discovery.
- For Privacera Manager: Permissions that let Privacera Manager create the AWS resources required for Privacera Discovery. This is optional. You can have Privacera Manager create the resources during installation, or you can create them manually and provide their ARNs during installation of the Discovery module. These IAM policies should be attached to the EC2 instance where Privacera Manager is running.
- For Discovery Services: IAM roles for the Discovery and Portal pods to access the AWS resources. This is mandatory for scanning AWS services such as S3 and DynamoDB. These roles need to be created manually and configured in Privacera Manager during installation. You can limit the access to only the resources that will be scanned by Discovery.
Step 1: IAM Role for Privacera Manager¶
You can skip this step if you do not want Privacera Manager to create these resources. However, you will need to create the resources manually and provide their ARNs to Privacera Manager during the configuration steps.
The following additional IAM policy must be attached to the Privacera Manager EC2 instance to enable the creation of AWS resources for Discovery, such as DynamoDB tables, S3 buckets, and SQS queues.
The IAM policies below grant create and update permissions for the following AWS resources:
Summary of IAM policies for Creating AWS resources for Discovery by Privacera Manager | |
---|---|
After you have created the IAM policies, you can attach them to the role used by your Privacera Manager EC2 instance (e.g. privacera-manager-role-privacera-prod).
IAM Policies for Creating AWS resources for Discovery by Privacera Manager¶
Replace the following placeholders:
- AWS_REGION: The AWS region where the resources are created.
- ACCOUNT_ID: The AWS account ID where the resources are created.
- DISCOVERY_BUCKET_NAME: The S3 bucket name where the Privacera metadata is stored.
The table names and the SQS queue name follow the format privacera_*_DEPLOYMENT_ENV_NAME.
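To show how these placeholders fit together, the sketch below expands them into resource ARNs using the standard AWS ARN formats for DynamoDB, SQS, and S3. The helper is hypothetical, it assumes the standard aws partition, and whether the Privacera policies scope their resources exactly this way is an assumption. Use the policy JSON from the Privacera documentation as the source of truth.

```python
def discovery_resource_arns(region: str, account_id: str,
                            bucket: str, env: str) -> dict:
    """Expand the policy placeholders into ARN patterns.

    Assumes the standard 'aws' partition and the privacera_*_<env>
    naming format for DynamoDB tables and the SQS queue.
    """
    name_pattern = f"privacera_*_{env}"
    return {
        "dynamodb_tables": f"arn:aws:dynamodb:{region}:{account_id}:table/{name_pattern}",
        "sqs_queues": f"arn:aws:sqs:{region}:{account_id}:{name_pattern}",
        "s3_bucket": f"arn:aws:s3:::{bucket}",       # bucket-level actions
        "s3_objects": f"arn:aws:s3:::{bucket}/*",    # object-level actions
    }


if __name__ == "__main__":
    arns = discovery_resource_arns("us-east-1", "123456789012",
                                   "acme-prod", "privacera-prod")
    for label, arn in arns.items():
        print(f"{label}: {arn}")
```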
Step 2: IAM Role for Discovery Services¶
Pod-level IAM roles are supported as of Privacera Platform version 9.0.0.1. In earlier versions, these IAM policies had to be granted to the nodes of the Kubernetes cluster.
Privacera Discovery runs on Apache Spark, and its pods require access to AWS resources to scan data. The IAM roles for the Discovery and Portal pods must be created manually and configured in Privacera Manager during installation.
The Discovery and Portal pods require the following IAM policies to access the AWS resources. The recommendation is to create these policies and attach them to the IAM roles for the Discovery and Portal pods.
Here are the recommended IAM Role names and the policies to be attached to them:
- Role Name: privacera-discovery-role-DEPLOYMENT_ENV_NAME (e.g. privacera-discovery-role-privacera-prod)
- Discovery Service Policies: privacera-discovery-service-policies-DEPLOYMENT_ENV_NAME (e.g. privacera-discovery-service-policies-privacera-prod)
- Discovery Scan Policies: privacera-discovery-scan-policies-DEPLOYMENT_ENV_NAME (e.g. privacera-discovery-scan-policies-privacera-prod)
The above role will be attached to the following pods (DEPLOYMENT_ENV_NAME will be replaced with your actual deployment environment name):
- Privacera Portal (In Self-Managed Deployment) or Privacera Discovery Admin Console (In PrivaceraCloud Data Plane Deployment)
- Discovery Driver and Executor pods
- Discovery pKafka pod (if real-time scanning is enabled)
Privacera Manager assigns the IAM role (privacera-discovery-role-privacera-prod) to the EKS service accounts for the relevant pods during installation.
The following Kubernetes service accounts are created by Privacera Manager during the installation process.
Service Accounts for Discovery, Discovery Consumer, pKafka, and Portal pods | |
---|---|
graph TD
subgraph RolesAndPolicies
A[privacera-discovery-role]
B[privacera-discovery-service-policies]
C[privacera-discovery-scan-policies]
end
subgraph Pods
D[Portal or\nDiscovery Admin Console]
E[Discovery Spark]
F[pKafka]
end
B --> A
C --> A
A --> D
A --> E
A --> F
a. Discovery Service Policies¶
Summary of IAM policies for reading and writing to S3, DynamoDB and SQS | |
---|---|
After you have created the IAM policies, you can attach them to the role used by your Discovery and Portal pods (e.g. privacera-discovery-role-privacera-prod).
IAM Policy for Discovery Service
Replace the following placeholders:
- AWS_REGION: The AWS region where the resources are created.
- ACCOUNT_ID: The AWS account ID where the resources are created.
- DEPLOYMENT_ENV_NAME: The Privacera deployment environment name.
- DISCOVERY_BUCKET_NAME: The S3 bucket name where the Privacera configuration is stored.
b. Discovery Scan Policies¶
It is highly recommended to set this up during the installation phase, but if you won't be scanning data in S3, then you can skip this step.
Summary of read only IAM policies for scanning S3 buckets. | |
---|---|
After you have created the IAM policies, you can attach them to the role used by your Discovery and Portal pods (e.g. privacera-discovery-role-privacera-prod).
IAM Policy for Discovery Scan
Replace the following placeholders:
- DISCOVERY_SCAN_BUCKET_NAME1: The name of the first S3 bucket that contains data to be scanned.
- DISCOVERY_SCAN_BUCKET_NAME2: The name of an additional S3 bucket that contains data to be scanned.
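As an illustration of what a read-only scan policy for these buckets can look like, the sketch below generates a generic IAM policy document from a list of bucket names. The helper name, the statement IDs, and the chosen actions (s3:ListBucket, s3:GetBucketLocation, s3:GetObject) are assumptions for a typical read-only S3 policy, not the exact policy Privacera ships. Prefer the policy JSON from the Privacera documentation when it is available.

```python
import json


def read_only_scan_policy(bucket_names: list) -> dict:
    """Build a generic read-only S3 policy for the buckets to be scanned.

    The action list is an assumption for a typical read-only policy.
    """
    bucket_arns = [f"arn:aws:s3:::{name}" for name in bucket_names]
    object_arns = [f"{arn}/*" for arn in bucket_arns]
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ListScanBuckets",
                "Effect": "Allow",
                "Action": ["s3:ListBucket", "s3:GetBucketLocation"],
                "Resource": bucket_arns,  # bucket-level actions
            },
            {
                "Sid": "ReadScanObjects",
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": object_arns,  # object-level actions
            },
        ],
    }


if __name__ == "__main__":
    # Replace with your DISCOVERY_SCAN_BUCKET_NAME1 / NAME2 values.
    policy = read_only_scan_policy(["acme-data-1", "acme-data-2"])
    print(json.dumps(policy, indent=2))
```

Listing only the buckets that Discovery actually scans keeps the role limited to the required resources.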
c. Add trust relationship to the IAM role¶
The IAM role (privacera-discovery-role-privacera-prod) must be updated with a trust relationship condition to allow access.
To get the cluster's OpenID Connect number for the trust relationship, follow these steps:
- Go to AWS Console > Amazon Elastic Kubernetes Service.
- Click on your cluster name.
- Copy the OpenID Connect Provider URL (e.g. https://oidc.eks.us-east-1.amazonaws.com/id/D3B53XXXXXXXXXXXXXXXXXXXXD4357F).
- Replace the ID in the Federated section with the OpenID Connect number (e.g. arn:aws:iam::<ACCOUNT_ID>:oidc-provider/oidc.eks.<AWS_REGION>.amazonaws.com/id/<OPEN_ID_CONNECT_NUMBER>).
For more information on OIDC, refer to the OpenID Connect documentation.
Trust Relationship Policy for Discovery IAM Role
Replace the following placeholders:
- AWS_REGION: The AWS region where the resources are created.
- ACCOUNT_ID: The AWS account ID where the resources are created.
- DEPLOYMENT_ENV_NAME: The Privacera deployment environment name.
- DISCOVERY_ROLE_NAME: The IAM role name for the Discovery, Portal, and pKafka pods.
- OPEN_ID_CONNECT_NUMBER: The OpenID Connect number from the OpenID Connect provider URL.
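The sketch below shows how these placeholders combine into a trust policy of the general shape AWS uses for IAM Roles for Service Accounts: a federated OIDC principal, the sts:AssumeRoleWithWebIdentity action, and a condition on the service account subject. The helper names are hypothetical, and the namespace and the wildcard service account match are assumptions. Check them against the service accounts that Privacera Manager actually creates.

```python
import json
from urllib.parse import urlparse


def oidc_number_from_url(provider_url: str) -> str:
    """Extract the OpenID Connect number from the EKS provider URL."""
    parts = urlparse(provider_url).path.strip("/").split("/")
    if len(parts) != 2 or parts[0] != "id" or not parts[1]:
        raise ValueError(f"unexpected OIDC provider URL: {provider_url!r}")
    return parts[1]


def discovery_trust_policy(region: str, account_id: str, oidc_number: str,
                           namespace: str, service_account: str = "*") -> dict:
    """Build an IRSA-style trust policy.

    namespace and service_account are assumptions: this sketch trusts
    every service account in the given namespace by default.
    """
    provider = f"oidc.eks.{region}.amazonaws.com/id/{oidc_number}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{provider}"
                },
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    "StringLike": {
                        f"{provider}:sub": f"system:serviceaccount:{namespace}:{service_account}"
                    }
                },
            }
        ],
    }


if __name__ == "__main__":
    number = oidc_number_from_url(
        "https://oidc.eks.us-east-1.amazonaws.com/id/D3B53XXXXXXXXXXXXXXXXXXXXD4357F"
    )
    policy = discovery_trust_policy("us-east-1", "123456789012", number, "privacera-prod")
    print(json.dumps(policy, indent=2))
```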
Final Checklist¶
Ensure that all the prerequisites are met before proceeding with the installation of Privacera Discovery.
- Create IAM policies and roles for Privacera Manager to create AWS resources required for Privacera Discovery (optional).
- Create an S3 bucket and path for storing the configurations and temporary files for Privacera Discovery or let Privacera Manager create it for you.
- Create DynamoDB tables to store metadata and tags or let Privacera Manager create them for you.
- Create an SQS queue for real-time scanning or let Privacera Manager create it for you.
- Create IAM policies and roles for the Discovery and Portal pods to access the AWS resources.
- Create IAM policies for the Discovery and Portal pods to scan the S3 bucket (optional).
- Add the trust relationship to the IAM role for the Discovery and Portal pods.
Also confirm that you have the values for the following placeholders:
- DISCOVERY_BUCKET_NAME: Discovery configuration bucket name and path.
- DynamoDB table names for storing metadata and tags (only if you have created the tables manually):
  - SCAN_REQUEST_TABLE: DynamoDB table name for storing scan requests
  - RESOURCE_TABLE: DynamoDB table name for storing resource metadata
  - ALERT_TABLE: DynamoDB table name for storing alerts
  - AUDIT_SUMMARY_TABLE: DynamoDB table name for storing audit summary
  - ACTIVE_SCANS_TABLE: DynamoDB table name for storing active scans
  - STATE_TABLE: DynamoDB table name for storing state
- DISCOVERY_BUCKET_SQS_NAME: Amazon SQS Queue name (only if you have created the queue manually).
- IAM Role For Discovery and Portal Pods: ARN of the IAM role created for the Discovery driver, executor, and Portal pods. You can use the role created in the prerequisites (e.g. privacera-discovery-role-privacera-prod).