
Externalizing the sensitive values of Privacera properties

The privacera_spark.properties file contains sensitive information required by the Spark plugin deployed in EMR Serverless. To avoid including this sensitive data in the Docker image, you can externalize the values from privacera_spark.properties by following these steps:

  1. Set the following key in vars.emr-serverless.yml

    YAML
    EMR_SERVERLESS_EXTERNALIZE_PRIVACERA_SPARK_PROPERTIES: "true"
    

  2. Run post-install

    Bash
    cd ~/privacera/privacera-manager
    ./privacera-manager.sh post-install
    

  3. Create the Docker image and push it to Amazon ECR. Refer to the Create docker image section for more details.

  4. Once post-install is done, you can find the values for the externalized properties as follows:

    Property, followed by where its value can be read from:

    • spark.hadoop.privacera.signer.base.url
      vi ~/privacera/privacera-manager/output/service-urls.txt
      Copy the EXTERNAL URL under DATASERVER.
    • spark.hadoop.privacera.signer.truststore.password
      vi ~/privacera/privacera-manager/config/custom-vars/vars.ssl.yml
      Copy the value set to the variable SSL_DEFAULT_PASSWORD. By default, the password is set as changeit.
    • spark.hadoop.privacera.signer.truststore.type
      vi ~/privacera/privacera-manager/config/custom-vars/vars.ssl.yml
      Copy the value set to the variable SSL_SIGNED_CERT_FORMAT. By default, the format is set as PKCS12.
    • spark.hadoop.privacera.clusterName
      This is the name you have set for the EMR cluster.
  5. Open the EMR Serverless application for editing.

  6. In the application configuration, add the following Privacera-specific properties to the spark-defaults section.

    Set the values directly:

    JSON
    "spark.hadoop.privacera.signer.base.url": "<privacera_signer_url>",
    "spark.hadoop.privacera.signer.truststore.password": "<privacera_truststore_password>",
    "spark.hadoop.privacera.signer.truststore.type": "<privacera_truststore_type>",
    "spark.hadoop.privacera.clusterName": "<application_name>"
    
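    Alternatively, store each value in AWS Secrets Manager and reference the secret by name with the EMR.secret@ prefix:

    JSON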
    "spark.hadoop.privacera.signer.base.url": "EMR.secret@<privacera_signer_url_secret_name>",
    "spark.hadoop.privacera.signer.truststore.password": "EMR.secret@<privacera_truststore_password_secret_name>",
    "spark.hadoop.privacera.signer.truststore.type": "EMR.secret@<privacera_truststore_type_secret_name>",
    "spark.hadoop.privacera.clusterName": "EMR.secret@<application_name_secret_name>"
    

    Creating an AWS Secrets Manager secret

    • Go to AWS Secrets Manager.
    • Choose Store a new secret.
    • Under Choose secret type, choose Other type of secret.
    • Click the Plaintext tab and paste the value of the Privacera property.
    • For Encryption key, select your AWS KMS key. Click Next.
    • Provide a proper name for the secret.
    • Add a description for the secret.
    • Under Resource Permissions, click Edit permissions. Refer to the policy below for the required permissions.
      JSON
      {
        "Version" : "2012-10-17",
        "Statement" : [
          {
            "Effect" : "Allow",
            "Principal" : {
              "Service" : "emr-serverless.amazonaws.com"
            },
            "Action" : [ "secretsmanager:GetSecretValue", "secretsmanager:DescribeSecret" ],
            "Resource" : "<arn-of-the-secret>",
            "Condition" : {
              "StringLike" : {
                "aws:SourceArn" : "<arn-of-the-serverless-application>"
              }
            }
          } 
        ]
      }
      
    • Click Next.
    • Click Next again to proceed through the default options.
    • Review the details and click Store.
  7. Save the application configuration. Now, you can start the EMR Serverless application.
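
The configuration fragment and resource policy from step 6 can also be generated rather than typed by hand. The following is a minimal sketch, not part of Privacera Manager: the helper names spark_defaults and secret_resource_policy and the secret names in the usage section are hypothetical, and it assumes the properties are wrapped in the usual EMR Serverless classification object ("classification" plus "properties"). Using json.dumps ensures values that contain quotes or backslashes are escaped correctly.

```python
import json

# Property keys taken from step 6 of this page.
PRIVACERA_KEYS = (
    "spark.hadoop.privacera.signer.base.url",
    "spark.hadoop.privacera.signer.truststore.password",
    "spark.hadoop.privacera.signer.truststore.type",
    "spark.hadoop.privacera.clusterName",
)


def spark_defaults(values, use_secrets=False):
    """Return a spark-defaults classification for the four Privacera keys.

    values maps each key to its literal value or, when use_secrets is True,
    to the name of the Secrets Manager secret that holds the value.
    """
    missing = [key for key in PRIVACERA_KEYS if key not in values]
    if missing:
        raise ValueError(f"missing properties: {missing}")
    prefix = "EMR.secret@" if use_secrets else ""
    return {
        "classification": "spark-defaults",
        "properties": {key: prefix + values[key] for key in PRIVACERA_KEYS},
    }


def secret_resource_policy(secret_arn, application_arn):
    """Return the resource policy shown in step 6 with both ARNs filled in."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "emr-serverless.amazonaws.com"},
                "Action": [
                    "secretsmanager:GetSecretValue",
                    "secretsmanager:DescribeSecret",
                ],
                "Resource": secret_arn,
                "Condition": {"StringLike": {"aws:SourceArn": application_arn}},
            }
        ],
    }


if __name__ == "__main__":
    # Hypothetical secret names, one secret per property.
    secret_names = {
        "spark.hadoop.privacera.signer.base.url": "privacera-signer-url",
        "spark.hadoop.privacera.signer.truststore.password": "privacera-truststore-password",
        "spark.hadoop.privacera.signer.truststore.type": "privacera-truststore-type",
        "spark.hadoop.privacera.clusterName": "privacera-cluster-name",
    }
    print(json.dumps(spark_defaults(secret_names, use_secrets=True), indent=2))
    # The ARNs are left as placeholders; substitute your own before use.
    policy = secret_resource_policy(
        "<arn-of-the-secret>", "<arn-of-the-serverless-application>"
    )
    print(json.dumps(policy, indent=2))
```

Call spark_defaults with use_secrets=False and the literal values from step 4 to produce the plain variant instead. The same policy document is needed for every secret, with only the Resource ARN changing.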

