Prerequisites for Discovery on AWS¶
Note
The prerequisites for Privacera Discovery are the same for both Self-Managed and PrivaceraCloud Data Plane deployments.
The Privacera Discovery module uses AWS services such as S3, DynamoDB, and SQS for scanning data. To enable this, you need to create the necessary AWS resources, along with IAM roles that let the Discovery and Portal pods access them. Privacera Manager can create these resources for you, or you can create them manually and provide their ARNs during the installation of the Discovery module.
Here are the prerequisites for setting up Privacera Discovery on AWS:
Prerequisites | Description |
---|---|
S3 bucket and path | The S3 bucket and path where the configurations and temporary files for Discovery are stored. |
DynamoDB tables | Used to store metadata and tags. |
SQS | Used when real-time scanning is enabled. The change events for the S3 objects are retrieved from the SQS queue. |
IAM Role for Privacera Manager (optional) | IAM role that lets Privacera Manager create the AWS resources required by Privacera Discovery. |
IAM Role for Discovery and Portal pods | IAM Role for Service Account (IRSA) for the Discovery driver, executor, consumer, and portal pods to access AWS resources. |
Assign IAM roles to EKS Service Accounts | Assign the IAM roles to the EKS Service Accounts for the Discovery and Portal pods. |
AWS S3 bucket and path¶
An AWS S3 bucket and path are required to store the configuration for Privacera Discovery. A dedicated bucket for Privacera Discovery is recommended. You can use a sub-folder in the bucket to store the configurations and temporary files for Discovery. Make sure that the IAM roles for the Discovery and Portal pods have read/write access to this bucket or to the sub-folder you use.
You must provide the bucket name and path to Privacera Manager during the installation configuration. You can create this bucket manually or let Privacera Manager create it for you.
Examples (replace `acme` or `acme-prod` with your bucket name):
- Separate bucket for each environment: `s3://acme-prod/privacera-discovery-config/privacera-prod`
- Common bucket, but different folders for each environment: `s3://acme/privacera-discovery-config/privacera-prod`
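The two layouts above differ only in the bucket name, so the path can be derived from the bucket and the deployment environment name. The following is a minimal sketch; the function name `discovery_config_uri` and the default `prefix` argument are illustrative assumptions, not part of Privacera Manager.

```python
def discovery_config_uri(bucket: str, env_name: str,
                         prefix: str = "privacera-discovery-config") -> str:
    """Return the S3 URI for one environment's Discovery configuration."""
    parts = [bucket.strip("/"), prefix.strip("/"), env_name.strip("/")]
    if not all(parts):
        raise ValueError("bucket, prefix, and env_name must be non-empty")
    return "s3://" + "/".join(parts)


# Separate bucket for each environment.
print(discovery_config_uri("acme-prod", "privacera-prod"))
# Common bucket, one folder per environment.
print(discovery_config_uri("acme", "privacera-prod"))
```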
AWS DynamoDB tables¶
AWS DynamoDB tables are required to store the metadata for Privacera Discovery. The recommended naming convention for these tables is `privacera_*_DEPLOYMENT_ENV_NAME`. You can create these tables manually or let Privacera Manager create them for you.
Table Naming Convention
Suffix each table name with the DEPLOYMENT_ENV_NAME (e.g. `privacera-prod`) to avoid conflicts with other deployments.
The table below assumes a DEPLOYMENT_ENV_NAME of `privacera-prod`. If you create the tables manually, use the naming convention and schema shown below, replacing `privacera-prod` with your actual deployment environment name.
The table names and their corresponding hash key, and range key are as follows:
Table Name | Hash Key | Hash Key Type | Range Key | Range Key Type |
---|---|---|---|---|
privacera_scan_requests_privacera-prod | scan_id | S | id | S |
privacera_resource_v2_privacera-prod | appCode | S | id | S |
privacera_alert_privacera-prod | id | S | id | S |
privacera_audit_summary_privacera-prod | appCode | S | id | S |
privacera_active_scans_privacera-prod | topicName | S | id | S |
privacera_state_privacera-prod | id | S | | |
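If you create the tables manually, the schema above can be turned into table definitions programmatically. The sketch below only builds the keyword arguments in the shape that boto3's DynamoDB `create_table` call accepts; it does not contact AWS. The `BillingMode` value is an assumption that does not come from this page. Note that the alert row lists `id` for both keys, and DynamoDB does not accept a key schema whose hash and range keys share a name, so the sketch flags that row for review instead of guessing the intended key.

```python
# (table prefix, hash key, range key or None); every key type above is "S".
TABLES = [
    ("privacera_scan_requests", "scan_id", "id"),
    ("privacera_resource_v2", "appCode", "id"),
    ("privacera_alert", "id", "id"),
    ("privacera_audit_summary", "appCode", "id"),
    ("privacera_active_scans", "topicName", "id"),
    ("privacera_state", "id", None),
]


def create_table_kwargs(prefix, hash_key, range_key, env_name):
    """Build keyword arguments in the shape boto3's create_table expects."""
    if range_key == hash_key:
        # DynamoDB rejects a key schema whose hash and range keys share a name.
        raise ValueError(f"{prefix}: hash and range key are both '{hash_key}'")
    keys = [(hash_key, "HASH")]
    if range_key:
        keys.append((range_key, "RANGE"))
    return {
        "TableName": f"{prefix}_{env_name}",
        "KeySchema": [{"AttributeName": n, "KeyType": t} for n, t in keys],
        "AttributeDefinitions": [
            {"AttributeName": n, "AttributeType": "S"} for n, _ in keys
        ],
        "BillingMode": "PAY_PER_REQUEST",  # assumption, not from this page
    }


for prefix, hash_key, range_key in TABLES:
    try:
        print(create_table_kwargs(prefix, hash_key, range_key, "privacera-prod"))
    except ValueError as err:
        print("needs review:", err)
```

With boto3 installed and AWS credentials configured, each dictionary could be passed as `boto3.client("dynamodb").create_table(**kwargs)`.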
AWS SQS¶
An AWS SQS queue is used for real-time scanning. The change events for the S3 objects are retrieved from the SQS queue and processed by the Discovery pods. Privacera Manager can create the queue for you or you can create it manually. The recommended naming convention for the SQS queue is `privacera_bucket_sqs_DEPLOYMENT_ENV_NAME`.
Example (replace `privacera-prod` with your DEPLOYMENT_ENV_NAME):
`privacera_bucket_sqs_privacera-prod`
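When creating the queue manually, it can help to derive the name from the environment name and check it against the SQS naming rules for standard (non-FIFO) queues: up to 80 characters drawn from letters, digits, hyphens, and underscores. This is a minimal sketch; the helper name `discovery_queue_name` is hypothetical.

```python
import re

# Standard SQS queue names: 1-80 letters, digits, hyphens, or underscores.
_QUEUE_NAME = re.compile(r"[A-Za-z0-9_-]{1,80}")


def discovery_queue_name(env_name: str) -> str:
    """Return the recommended queue name, or raise if SQS would reject it."""
    name = f"privacera_bucket_sqs_{env_name}"
    if not _QUEUE_NAME.fullmatch(name):
        raise ValueError(f"invalid SQS queue name: {name!r}")
    return name


print(discovery_queue_name("privacera-prod"))
```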
IAM Policies For Discovery¶
Two sets of IAM policies are required for Privacera Discovery.
- For Privacera Manager: Permissions that let Privacera Manager create the AWS resources required for Privacera Discovery. This set is optional: you can have Privacera Manager create the resources during installation, or create them manually and provide their ARNs during installation of the Discovery module. If used, attach these IAM policies to the EC2 instance where Privacera Manager is running.
- For Discovery Services: IAM roles that let the Discovery and Portal pods access the AWS resources. This set is mandatory for scanning AWS services such as S3 and DynamoDB. These roles must be created manually and configured in Privacera Manager during installation. You can limit access to only the resources that Discovery will scan.
Step 1: IAM Role for Privacera Manager¶
You can skip this step if you do not want Privacera Manager to create these resources. However, you will need to create the resources manually and provide their ARNs to Privacera Manager during the configuration steps.
The following additional IAM policy must be attached to the Privacera Manager EC2 instance to enable the creation of AWS resources for Discovery, such as DynamoDB tables, S3 buckets, and SQS queues.
The IAM policies below grant create and update permissions for the following AWS resources:
Summary of IAM policies for Creating AWS resources for Discovery by Privacera Manager | |
---|---|
After you create the IAM policies, attach them to the role used by your Privacera Manager EC2 instance (e.g. `privacera-manager-role-privacera-prod`).
IAM Policies for Creating AWS resources for Discovery by Privacera Manager¶
Replace the following placeholders:
- AWS_REGION: The AWS region where the resources are created.
- ACCOUNT_ID: The AWS account ID where the resources are created.
- DISCOVERY_BUCKET: The S3 bucket name where the Privacera metadata is stored.

Table names and the SQS queue name follow the format `privacera_*_DEPLOYMENT_ENV_NAME`.
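Placeholder replacement can be scripted so that no placeholder is left behind by accident. The sketch below substitutes the placeholders in a single pass and parses the result as JSON to confirm it is well-formed. The `TEMPLATE` shown is a hypothetical one-statement policy used only for illustration; it is not the policy Privacera provides, and `render_policy` is an assumed helper name.

```python
import json
import re

PLACEHOLDERS = ("AWS_REGION", "ACCOUNT_ID", "DISCOVERY_BUCKET", "DEPLOYMENT_ENV_NAME")
# Longest names first so a longer placeholder is never matched partially.
_PATTERN = re.compile(
    "|".join(re.escape(p) for p in sorted(PLACEHOLDERS, key=len, reverse=True))
)

# Hypothetical template for illustration only; not Privacera's actual policy.
TEMPLATE = """{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Action": ["sqs:CreateQueue"],
    "Resource": "arn:aws:sqs:AWS_REGION:ACCOUNT_ID:privacera_bucket_sqs_DEPLOYMENT_ENV_NAME"
  }]
}"""


def render_policy(template: str, values: dict) -> dict:
    """Substitute every placeholder, then parse the result as JSON."""
    def substitute(match):
        name = match.group(0)
        if name not in values:
            raise ValueError(f"no value supplied for placeholder {name}")
        return values[name]

    return json.loads(_PATTERN.sub(substitute, template))


policy = render_policy(TEMPLATE, {
    "AWS_REGION": "us-east-1",
    "ACCOUNT_ID": "123456789012",
    "DEPLOYMENT_ENV_NAME": "privacera-prod",
})
print(json.dumps(policy, indent=2))
```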
Step 2: IAM Role for Discovery Services¶
Pod-level IAM roles are supported as of Privacera Platform version 9.0.0.1. In earlier versions, these IAM policies had to be granted to the nodes of the Kubernetes cluster.
Privacera Discovery runs on Apache Spark, and its pods require access to AWS resources to scan data. The IAM roles for the Discovery and Portal pods must be created manually and configured in Privacera Manager during installation.
The Discovery and Portal pods require the following IAM policies to access the AWS resources. The recommendation is to create these policies and attach them to the IAM roles for the Discovery and Portal pods.
Here are the recommended IAM Role names and the policies to be attached to them:
- Role Name: `privacera-discovery-role-DEPLOYMENT_ENV_NAME` (e.g. `privacera-discovery-role-privacera-prod`)
- Discovery Service Policies: `privacera-discovery-service-policies-DEPLOYMENT_ENV_NAME` (e.g. `privacera-discovery-service-policies-privacera-prod`)
- Discovery Scan Policies: `privacera-discovery-scan-policies-DEPLOYMENT_ENV_NAME` (e.g. `privacera-discovery-scan-policies-privacera-prod`)
The above role (with DEPLOYMENT_ENV_NAME replaced by your actual deployment environment name) will be attached to the following pods:
- Privacera Portal (in Self-Managed deployments) or Privacera Discovery Admin Console (in PrivaceraCloud Data Plane deployments)
- Discovery Driver and Executor pods
- Discovery pKafka pod (if real-time scanning is enabled)
graph TD
subgraph RolesAndPolicies
A[privacera-discovery-role]
B[privacera-discovery-service-policies]
C[privacera-discovery-scan-policies]
end
subgraph Pods
D[Portal or\nDiscovery Admin Console]
E[Discovery Spark]
F[pKafka]
end
B --> A
C --> A
A --> D
A --> E
A --> F
a. Discovery Service Policies¶
Summary of IAM policies for reading and writing to S3, DynamoDB and SQS | |
---|---|
After you create the IAM policies, attach them to the role used by your Discovery and Portal pods (e.g. `privacera-discovery-role-privacera-prod`).
IAM Policy for Discovery Service
Replace the following placeholders:
- AWS_REGION: The AWS region where the resources are created.
- ACCOUNT_ID: The AWS account ID where the resources are created.
- DEPLOYMENT_ENV_NAME: The Privacera deployment environment name.
- DISCOVERY_BUCKET: The S3 bucket name where the Privacera metadata is stored.
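To scope a service policy to only the Discovery resources, the four placeholders above are enough to derive the relevant resource ARNs. The following is a minimal sketch assuming the standard `aws` partition and the naming conventions recommended on this page; the function name and dictionary keys are illustrative.

```python
def discovery_resource_arns(region: str, account_id: str,
                            env_name: str, bucket: str) -> dict:
    """Return ARNs (or ARN patterns) for the resources Discovery uses."""
    return {
        # Wildcard pattern covering every privacera_*_<env> table.
        "dynamodb_tables": (
            f"arn:aws:dynamodb:{region}:{account_id}:table/privacera_*_{env_name}"
        ),
        "sqs_queue": f"arn:aws:sqs:{region}:{account_id}:privacera_bucket_sqs_{env_name}",
        # S3 ARNs carry no region or account ID.
        "s3_bucket": f"arn:aws:s3:::{bucket}",
        "s3_objects": f"arn:aws:s3:::{bucket}/*",
    }


for label, arn in discovery_resource_arns(
        "us-east-1", "123456789012", "privacera-prod", "acme-prod").items():
    print(f"{label}: {arn}")
```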
b. Discovery Scan Policies¶
It is highly recommended to set this up during the installation phase, but if you won't be scanning data in S3, then you can skip this step.
Summary of read only IAM policies for scanning S3 buckets. | |
---|---|
After you create the IAM policies, attach them to the role used by your Discovery and Portal pods (e.g. `privacera-discovery-role-privacera-prod`).
IAM Policy for Discovery Scan
Replace the following placeholders:
- AWS_REGION: The AWS region where the resources are created.
- ACCOUNT_ID: The AWS account ID where the resources are created.
- DEPLOYMENT_ENV_NAME: The Privacera deployment environment name.
- DISCOVERY_SCAN_BUCKET_NAME1: The name of an S3 bucket that contains data to be scanned.
- DISCOVERY_SCAN_BUCKET_NAME2: The name of an additional S3 bucket that contains data to be scanned.
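A read-only policy for the scan buckets can be generated from the list of bucket names, which keeps the policy in sync as buckets are added. The sketch below is an assumption about what a minimal read-only S3 policy looks like (list the bucket, read its objects); the policy Privacera provides may include different or additional actions, so treat this as an illustration of the structure only.

```python
import json


def scan_policy(bucket_names) -> dict:
    """Build a read-only S3 policy document for the given scan buckets."""
    buckets = sorted(set(bucket_names))
    if not buckets:
        raise ValueError("at least one scan bucket is required")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {   # Bucket-level actions apply to the bucket ARN itself.
                "Sid": "ListScanBuckets",
                "Effect": "Allow",
                "Action": ["s3:ListBucket", "s3:GetBucketLocation"],
                "Resource": [f"arn:aws:s3:::{b}" for b in buckets],
            },
            {   # Object-level actions apply to keys under the bucket.
                "Sid": "ReadScanObjects",
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": [f"arn:aws:s3:::{b}/*" for b in buckets],
            },
        ],
    }


# Bucket names here are placeholders for DISCOVERY_SCAN_BUCKET_NAME1 and 2.
print(json.dumps(scan_policy(["scan-bucket-1", "scan-bucket-2"]), indent=2))
```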
c. Assign IAM roles to EKS Service Accounts¶
The IAM role created above (e.g. `privacera-discovery-role-privacera-prod`) should be assigned to the EKS service accounts for the Discovery and Portal pods.
Service Accounts for Discovery and Portal pods | |
---|---|
You can follow the instructions here for creating the IAM role for service accounts.
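For orientation, IAM Roles for Service Accounts relies on two pieces: a trust policy on the IAM role that allows a specific Kubernetes service account to assume it through the cluster's OIDC provider, and an `eks.amazonaws.com/role-arn` annotation on the service account. The sketch below builds both as plain dictionaries. The namespace, service account name, and OIDC provider ID are hypothetical placeholders, since the actual service account names are not listed on this page.

```python
import json


def irsa_trust_policy(account_id: str, oidc_provider: str,
                      namespace: str, service_account: str) -> dict:
    """oidc_provider is the cluster's OIDC issuer URL without 'https://'."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {
                "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{oidc_provider}"
            },
            "Action": "sts:AssumeRoleWithWebIdentity",
            "Condition": {"StringEquals": {
                f"{oidc_provider}:aud": "sts.amazonaws.com",
                f"{oidc_provider}:sub":
                    f"system:serviceaccount:{namespace}:{service_account}",
            }},
        }],
    }


def role_annotation(account_id: str, role_name: str) -> dict:
    """Annotation to place on the Kubernetes service account."""
    return {"eks.amazonaws.com/role-arn": f"arn:aws:iam::{account_id}:role/{role_name}"}


# All values below are hypothetical examples.
oidc = "oidc.eks.us-east-1.amazonaws.com/id/EXAMPLEOIDCID"
print(json.dumps(irsa_trust_policy("123456789012", oidc,
                                   "privacera-prod", "discovery-sa"), indent=2))
print(role_annotation("123456789012", "privacera-discovery-role-privacera-prod"))
```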
Final Checklist¶
It is extremely important to ensure that all the prerequisites are met before proceeding with the installation of Privacera Discovery.
- Create IAM policies and roles for Privacera Manager to create AWS resources required for Privacera Discovery (optional).
- Create an S3 bucket and path for storing the configurations and temporary files for Privacera Discovery or let Privacera Manager create it for you.
- Create DynamoDB tables to store metadata and tags or let Privacera Manager create them for you.
- Create an SQS queue for real-time scanning or let Privacera Manager create it for you.
- Create IAM policies and roles for the Discovery and Portal pods to access the AWS resources.
- Create IAM policies for the Discovery and Portal pods to scan the S3 bucket (optional).
- Assign the IAM roles to the EKS Service Accounts for the Discovery and Portal pods.