Configuring EMR to use External Hive Metastore
If you use an External Hive Metastore (EHM) with AWS EMR, follow the steps below to configure Privacera to use the External Hive Metastore in EMR.
This is for AWS EMR. For AWS EMR Serverless, refer to the EMR Serverless documentation.
- SSH to the instance where Privacera is installed.
- Navigate to the /config directory.
- Open the .yml file to be edited.
- Modify the following properties:

| Variable | Definition |
| --- | --- |
| EMR_HIVE_METASTORE | Set to 'hive' to enable External Hive Metastore |
| EMR_HIVE_METASTORE_CONNECTION_URL | Set the JDBC connection URL (ex: `jdbc:mysql://<host>:3306/<database>?createDatabaseIfNotExist=true`) |
| EMR_HIVE_METASTORE_CONNECTION_DRIVER | Set the JDBC driver name (ex: "org.mariadb.jdbc.Driver") |
| EMR_HIVE_METASTORE_CONNECTION_USERNAME | Set the JDBC username |
| EMR_HIVE_METASTORE_CONNECTION_PASSWORD | Set the JDBC password |
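Before running the update, it can help to confirm that all five properties listed above actually have values. The function below is a minimal sketch of such a check, not part of Privacera Manager. The function name `check_emr_metastore_vars` is hypothetical, and the check assumes the .yml file holds one `KEY: value` pair per line.

```shell
#!/usr/bin/env bash
# Sketch: report which External Hive Metastore properties are missing or empty
# in a vars file. Assumes one "KEY: value" pair per line (an assumption about
# the file layout, not something the steps above specify).
check_emr_metastore_vars() {
  local file="$1"
  local missing=0
  local key
  for key in \
    EMR_HIVE_METASTORE \
    EMR_HIVE_METASTORE_CONNECTION_URL \
    EMR_HIVE_METASTORE_CONNECTION_DRIVER \
    EMR_HIVE_METASTORE_CONNECTION_USERNAME \
    EMR_HIVE_METASTORE_CONNECTION_PASSWORD
  do
    # Match the exact key at line start, followed by a colon and a non-blank value.
    if ! grep -Eq "^${key}:[[:space:]]*[^[:space:]]" "$file"; then
      echo "missing or empty: ${key}" >&2
      missing=1
    fi
  done
  # Exit status 0 means every property was found with a value.
  return "$missing"
}
```

Call it with the path of the .yml file you edited, for example `check_emr_metastore_vars <your-file>.yml`. A non-zero exit status means at least one property still needs a value. An empty quoted string such as `""` counts as a value here, so the check only catches keys that are absent or left blank.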
- Once the properties are configured, run the following commands to update your Privacera Manager platform instance:
  - Step 1: Setup, which generates the Helm charts. This step usually takes a few minutes.
  - Step 2: Apply the Privacera Manager Helm charts.
  - Step 3: Post-installation step, which generates the plugin tarball, updates Route 53 DNS, and so on.
- After the post-install step, create a new cluster with the newly generated emr-template.json file from the output directory.
Update the hive-site configuration in the EMR template as below, and create a new EMR cluster with this template.
privacera-emr-hive-site
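As a rough aid for this step, the sketch below checks that a template file mentions the `hive-site` classification and Hive's standard metastore connection properties (`javax.jdo.option.*`). The function name `check_emr_template_hive_site` is hypothetical. It is an assumption that the generated emr-template.json uses exactly these property keys, so treat a failure as a prompt to inspect the file, not as proof of a problem.

```shell
#!/usr/bin/env bash
# Sketch: confirm that an EMR template carries external-metastore settings.
# "hive-site" is the EMR configuration classification for Hive, and the
# javax.jdo.option.* keys are Hive's standard metastore connection properties.
# Whether Privacera's generated template uses exactly these keys is an assumption.
check_emr_template_hive_site() {
  local template="$1"
  local missing=0
  local needle
  for needle in \
    '"hive-site"' \
    'javax.jdo.option.ConnectionURL' \
    'javax.jdo.option.ConnectionDriverName' \
    'javax.jdo.option.ConnectionUserName' \
    'javax.jdo.option.ConnectionPassword'
  do
    # -F treats the needle as a literal string, so dots are not wildcards.
    if ! grep -Fq -- "$needle" "$template"; then
      echo "not found in template: ${needle}" >&2
      missing=1
    fi
  done
  # Exit status 0 means every expected string was found.
  return "$missing"
}
```

Run it against the generated file, for example `check_emr_template_hive_site <output-directory>/emr-template.json`, before creating the cluster. This is a plain text search, so it confirms that the strings are present but not that the JSON is well formed or that the values are correct.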