PrivaceraCloud Documentation

Manually set up an EMR cluster

  1. Log in to the AWS Console, navigate to the EMR service, and click Create Cluster.

  2. Click the Go to advanced options link.

  3. Under Software Configuration:

  4. Select Release Version.

  5. Select additional applications as per your environment.

    If you select the Hive or Spark application, you must also select the HCatalog option.

  6. Under Edit software settings, select Enter configuration. If you want to use an external Hive Metastore, add the following text.

    Glue Metastore is not supported.

    [
        {
            "Classification": "hive-site",
            "Properties": {
                "javax.jdo.option.ConnectionUserName": "${user-name}",
                "javax.jdo.option.ConnectionDriverName": "${jdbc-driver}",
                "javax.jdo.option.ConnectionURL": "${jdbc-url}",
                "javax.jdo.option.ConnectionPassword": "${jdbc-password}"
            }
        }
    ]
     
  7. Click Next.

  8. Under Hardware settings, select the Networking, Node, and Instance values appropriate for your environment.

  9. Under General cluster settings:

    If you want to enable audit logging for your applications in Privacera Portal, perform the following steps. They add two bootstrap scripts that install the Ranger audit configuration on the master and worker nodes.

  10. Enter the Cluster name.

  11. Select the Logging, Debugging, and Termination protection checkboxes as appropriate for your environment.

  12. Configure Ranger Audits logging for the Master node:

  13. Under Additional Options, expand Bootstrap Actions, select the Run if bootstrap action, and click Configure and add.

    The Add Bootstrap Action dialog appears.

  14. In this dialog, enter the name Configure Ranger Audits for Master.

  15. Add the following script in the Optional arguments field, replacing <ranger-audit-setup-script-url> with your own script URL.

    <ranger-audit-setup-script-url>: To copy this URL, go to PCloud Portal > Access Manager > Settings > ApiKey, click the Info icon, and select Ranger Audit Setup Script > Copy URL.

    instance.isMaster=true "wget <ranger-audit-setup-script-url>; chmod +x ./privacera_emr_native.sh ; sudo ./privacera_emr_native.sh"
                
  16. Click Add.

  17. Configure Ranger Audits logging for the Worker nodes:

  18. Under Additional Options, expand Bootstrap Actions, select the Run if bootstrap action, and click Configure and add.

    The Add Bootstrap Action dialog appears.

  19. In this dialog, enter the name Configure Ranger Audits for Worker.

  20. Add the following script in the Optional arguments field, replacing <ranger-audit-setup-script-url> with your own script URL.

    <ranger-audit-setup-script-url>: To copy this URL, go to PCloud Portal > Access Manager > Settings > ApiKey, click the Info icon, and select Ranger Audit Setup Script > Copy URL.

    instance.isMaster=false "wget <ranger-audit-setup-script-url>; chmod +x ./privacera_emr_native.sh ; sudo ./privacera_emr_native.sh"
    
  21. Click Add.

  22. Under Security Options:

  23. Enter or select the security options appropriate for your environment.

  24. Under the Permissions section:

  25. EMR role: Select the EMR_EC2_Default role.

    EC2 instance profile: Select the "EMR_RS_INSTANCE_ROLE" role created during the IAM Roles setup.

  26. Expand Security Configuration and select the configuration that you created earlier, for example, "EMR_NATIVE_WITH_PLCOUD".

    Set Realm and enter a KDC admin password.

  27. Click Create cluster.
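
The text pasted into the console in the steps above (the hive-site configuration and the two Run if argument strings) can also be generated from your own values, which helps avoid copy-and-paste mistakes. The following is a minimal sketch, not part of the Privacera tooling: the function names and the sample values are hypothetical, and it assumes the downloaded script is saved as privacera_emr_native.sh, as in the commands above. It only builds and prints text and does not call AWS.

```python
import json

# File name the bootstrap commands in this procedure expect after download.
SCRIPT_NAME = "privacera_emr_native.sh"


def hive_site_configuration(user_name, jdbc_driver, jdbc_url, jdbc_password):
    """Build the hive-site classification for an external Hive Metastore."""
    return [
        {
            "Classification": "hive-site",
            "Properties": {
                "javax.jdo.option.ConnectionUserName": user_name,
                "javax.jdo.option.ConnectionDriverName": jdbc_driver,
                "javax.jdo.option.ConnectionURL": jdbc_url,
                "javax.jdo.option.ConnectionPassword": jdbc_password,
            },
        }
    ]


def run_if_arguments(is_master, script_url):
    """Build the Optional arguments text for one Run if bootstrap action."""
    condition = "instance.isMaster=" + ("true" if is_master else "false")
    command = (
        f"wget {script_url}; chmod +x ./{SCRIPT_NAME} ; sudo ./{SCRIPT_NAME}"
    )
    return f'{condition} "{command}"'


if __name__ == "__main__":
    # Hypothetical sample values; replace them with your own.
    config = hive_site_configuration(
        user_name="example_user",
        jdbc_driver="com.example.jdbc.Driver",
        jdbc_url="jdbc:example://metastore.example.invalid:3306/hive",
        jdbc_password="example_password",
    )
    print(json.dumps(config, indent=4))

    # Copy the real URL from the Privacera Portal as described in the steps.
    script_url = "<ranger-audit-setup-script-url>"
    print(run_if_arguments(True, script_url))   # master node action
    print(run_if_arguments(False, script_url))  # worker node action
```

Paste the printed JSON into the Enter configuration field and each printed argument line into the Optional arguments field of the matching bootstrap action. Building both argument strings from one function keeps the master and worker commands identical except for the instance.isMaster condition.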