Setup for Access Management for EMR Serverless
Configure
Perform the following steps to configure the EMR Serverless connector:
- SSH into the instance where Privacera Manager is installed.
- Navigate to the /config directory by running the following command:
- Copy the sample variables by running the following command:
- Open the .yml file for editing by running the following command:
- Modify the following properties. You can find the supported release versions on the AWS EMR Serverless Versions page.
- Once the properties are configured, update your Privacera Manager platform instance by following the commands.
- Once the post-install process is complete, you will see the emr-serverless folder in the ~/privacera/privacera-manager/output directory, with the following folder structure:
- To build and push the Docker image, copy the following Docker files and other configuration files to the EC2 instance where you will build the image. You can also build the image on the same EC2 instance where Privacera Manager is installed.
- Once the required files are on the EC2 instance where you will build the Docker image, run the following command.
Here are the global variables that you need to set before running the command:
| Variable Name | Description | Sample Value |
|---|---|---|
| aws_account_id | Your AWS account ID. | "123456789012" |
| region | The AWS region where your ECR repository is located. | "us-east-1" |
| ecr_repo_name | The name of your ECR repository where the Docker image will be pushed. | "privacera/emr-serverless-spark-olac" |
| tag | The tag for the Docker image. | v1.0 |
Set these environment variables before running the command, or replace the values in the command itself.
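As a rough sketch of this step, the variables from the table can be set in the shell and combined into the full image URI before building. The `ecr_registry` and `image_uri` variables, the `build_image` helper, and the assumption that the Dockerfile sits in the current directory are illustrative only and are not taken from the Privacera documentation; adapt them to the files copied in the previous step.

```shell
#!/usr/bin/env bash
# Sample values from the table above; replace them with your own.
export aws_account_id="123456789012"
export region="us-east-1"
export ecr_repo_name="privacera/emr-serverless-spark-olac"
export tag="v1.0"

# Assumed helper variables: the standard ECR registry host and the full image URI.
export ecr_registry="${aws_account_id}.dkr.ecr.${region}.amazonaws.com"
export image_uri="${ecr_registry}/${ecr_repo_name}:${tag}"

# Hypothetical helper: build the image from a Dockerfile in the given
# directory (defaults to the current directory) and tag it with the ECR URI.
build_image() {
  docker build -t "${image_uri}" "${1:-.}"
}

echo "Image URI: ${image_uri}"
```

Calling `build_image` from the directory that holds the Docker files would then produce an image already tagged for the ECR repository, which avoids a separate `docker tag` step later.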
- To verify that the Docker image was created successfully, run the following command. Make sure to set the environment variables, or replace the variables in the command, before running it.
Once inside the container, you can inspect the environment to ensure it is set up correctly. Run exit to leave the container. This also deletes the container, because the --rm flag was used.
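A minimal sketch of such a check is shown here, assuming the image was tagged with the full ECR URI and that the image contains /bin/bash. The `verify_image` helper and the entrypoint override are assumptions for illustration; the exact command in the Privacera documentation may differ.

```shell
#!/usr/bin/env bash
# Fall back to the sample values from the table if the variables are unset.
aws_account_id="${aws_account_id:-123456789012}"
region="${region:-us-east-1}"
ecr_repo_name="${ecr_repo_name:-privacera/emr-serverless-spark-olac}"
tag="${tag:-v1.0}"
image_uri="${aws_account_id}.dkr.ecr.${region}.amazonaws.com/${ecr_repo_name}:${tag}"

# Hypothetical helper: open an interactive shell in a throwaway container.
# --rm removes the container on exit; --entrypoint replaces the image's
# default entrypoint with /bin/bash (assumes bash exists in the image).
verify_image() {
  docker run -it --rm --entrypoint /bin/bash "${image_uri}"
}
```

Running `verify_image` drops you into the container; `docker images` on the host can additionally confirm that the tag exists locally.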
- Run the following command to push the Docker image to the Amazon Elastic Container Registry (ECR) repository. Make sure to set the environment variables, or replace the variables in the command, before running it.
Note
Make sure you have the necessary IAM permissions to manage custom Docker images in your Amazon Elastic Container Registry (ECR) repository.
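The push typically involves authenticating Docker against ECR and then pushing the tagged image. The sketch below assumes that the ECR repository already exists, that the local image is already tagged with the full ECR URI, and that AWS CLI v2 is configured with suitable credentials. The `push_image` helper and the `ecr_registry` and `image_uri` variables are hypothetical names.

```shell
#!/usr/bin/env bash
# Fall back to the sample values from the table if the variables are unset.
aws_account_id="${aws_account_id:-123456789012}"
region="${region:-us-east-1}"
ecr_repo_name="${ecr_repo_name:-privacera/emr-serverless-spark-olac}"
tag="${tag:-v1.0}"
ecr_registry="${aws_account_id}.dkr.ecr.${region}.amazonaws.com"
image_uri="${ecr_registry}/${ecr_repo_name}:${tag}"

# Hypothetical helper: log Docker in to ECR, then push the image.
push_image() {
  # get-login-password prints a temporary token that docker login reads from stdin.
  aws ecr get-login-password --region "${region}" \
    | docker login --username AWS --password-stdin "${ecr_registry}" || return 1
  docker push "${image_uri}"
}
```

If the image was built under a different local name, a `docker tag <local-name> "${image_uri}"` step would be needed before calling `push_image`.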
- Once the Docker image is pushed to ECR, you will be able to see it in the ECR repository.
Create Application
With EMR Serverless, you can create one or more applications that use open-source analytics frameworks. To create an application, follow these steps:
Note
Refer to the latest AWS documentation for deploying EMR Serverless applications.
- Application settings: Provide a unique name for the application (e.g., emr_serverless_spark_app). Select Spark as the type, and specify the release version that you configured in the vars.emr-serverless.yml file.
- Custom Image Settings: Select the image that you uploaded to the ECR repository.
- Application Configuration: Edit the JSON with the following configuration:
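The same settings can, in principle, be supplied through the AWS CLI instead of the console. The sketch below is an assumption-laden outline rather than Privacera's documented procedure: `create_application`, `app_name`, and `release_label` are hypothetical names, the `emr-6.15.0` default is only a placeholder that must be replaced with the release version configured in vars.emr-serverless.yml, and the application configuration JSON from the last step is not included and would still need to be supplied separately.

```shell
#!/usr/bin/env bash
# Fall back to the sample values from the table if the variables are unset.
aws_account_id="${aws_account_id:-123456789012}"
region="${region:-us-east-1}"
ecr_repo_name="${ecr_repo_name:-privacera/emr-serverless-spark-olac}"
tag="${tag:-v1.0}"
image_uri="${aws_account_id}.dkr.ecr.${region}.amazonaws.com/${ecr_repo_name}:${tag}"

# Placeholders: the name from the example above and an assumed release label.
app_name="${app_name:-emr_serverless_spark_app}"
release_label="${release_label:-emr-6.15.0}"

# Hypothetical helper: create a Spark application that uses the custom image.
create_application() {
  aws emr-serverless create-application \
    --name "${app_name}" \
    --type "SPARK" \
    --release-label "${release_label}" \
    --image-configuration "imageUri=${image_uri}" \
    --region "${region}"
}
```

The custom image must be built for the same release version that the application uses, so `release_label` should be kept in step with the value in vars.emr-serverless.yml.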
Next Steps
To submit a job to the EMR Serverless application, refer to Privacera's User Guide for AWS EMR Serverless.