Configuring External Hive Metastore (EHM) with AWS EMR Serverless
Setup
If you use an External Hive Metastore (EHM) with AWS EMR Serverless and want to run jobs that need access to it, you must configure the Docker image with the required JDBC driver and connection properties.
Once the Docker image is configured, you can run jobs that access the External Hive Metastore. Refer to this section for instructions on how to submit jobs with the database credentials.
To configure the External Hive Metastore (EHM) with EMR Serverless, follow these steps:
- Add the following command to the Dockerfile_Privacera_Spark_OLAC file, right after copying the Privacera plugin files:
  - Locate this line in your Dockerfile:
  - Add the following command to download the required file:
    Note: The download URL provided for the JDBC driver is only for reference. You can replace it with the URL of the JDBC driver that you want to use.
- Build the Docker image and push it to the ECR repository.
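The build-and-push step can be sketched with the standard Docker and AWS CLI commands. This is a minimal sketch, not the official Privacera procedure: the account ID, region, repository name, and image tag are hypothetical placeholders, and it assumes the ECR repository already exists and that `docker` and the AWS CLI are installed and authenticated.

```shell
#!/usr/bin/env bash
# Sketch: build the customized image and push it to ECR.
# Every default value below is a hypothetical placeholder; override via environment.
set -eu

AWS_ACCOUNT_ID="${AWS_ACCOUNT_ID:-123456789012}"
AWS_REGION="${AWS_REGION:-us-east-1}"
ECR_REPO="${ECR_REPO:-privacera-emr-serverless}"
IMAGE_TAG="${IMAGE_TAG:-ehm}"

# Print the ECR registry host for an account and region.
ecr_registry() {
  printf '%s.dkr.ecr.%s.amazonaws.com' "$1" "$2"
}

# Print the full image URI: <registry>/<repository>:<tag>.
ecr_image_uri() {
  printf '%s/%s:%s' "$(ecr_registry "$1" "$2")" "$3" "$4"
}

build_and_push() {
  registry="$(ecr_registry "$AWS_ACCOUNT_ID" "$AWS_REGION")"
  image="$(ecr_image_uri "$AWS_ACCOUNT_ID" "$AWS_REGION" "$ECR_REPO" "$IMAGE_TAG")"

  # Authenticate Docker against the ECR registry.
  aws ecr get-login-password --region "$AWS_REGION" |
    docker login --username AWS --password-stdin "$registry"

  # Build from the Dockerfile named in the steps above, then push.
  docker build -f Dockerfile_Privacera_Spark_OLAC -t "$image" .
  docker push "$image"
}

# Nothing is built or pushed unless the script is called with --run.
if [ "${1:-}" = "--run" ]; then
  build_and_push
fi
```

Save the sketch under any name and call it with `--run` from the directory that contains the Dockerfile; without that flag it only defines the helper functions.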
Run the job with the External Hive Metastore
Refer to this section for instructions on how to submit jobs.
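For orientation, a job submission that passes JDBC metastore settings might look like the sketch below. It assumes the generic EMR Serverless approach of supplying `spark.hadoop.javax.jdo.option.*` properties through `sparkSubmitParameters`; the application ID, role ARN, script location, driver class, JDBC URL, and credentials are all hypothetical placeholders, and the referenced section remains the authoritative source for how credentials should be supplied in a Privacera deployment.

```shell
#!/usr/bin/env bash
# Sketch: submit an EMR Serverless Spark job with JDBC metastore properties.
# Every default value below is a hypothetical placeholder; override via environment.
set -eu

APPLICATION_ID="${APPLICATION_ID:-00example1234567}"
EXECUTION_ROLE_ARN="${EXECUTION_ROLE_ARN:-arn:aws:iam::123456789012:role/example-job-role}"
ENTRY_POINT="${ENTRY_POINT:-s3://example-bucket/scripts/job.py}"
JDBC_DRIVER="${JDBC_DRIVER:-org.mariadb.jdbc.Driver}"
JDBC_URL="${JDBC_URL:-jdbc:mysql://metastore.example.internal:3306/hive}"
DB_USER="${DB_USER:-hive}"
DB_PASSWORD="${DB_PASSWORD:-}"   # supply at run time; do not hard-code

# Print the --conf flags for driver, URL, user, and password (in that order).
metastore_confs() {
  prefix="spark.hadoop.javax.jdo.option"
  printf '%s %s.ConnectionDriverName=%s ' "--conf" "$prefix" "$1"
  printf '%s %s.ConnectionURL=%s ' "--conf" "$prefix" "$2"
  printf '%s %s.ConnectionUserName=%s ' "--conf" "$prefix" "$3"
  printf '%s %s.ConnectionPassword=%s' "--conf" "$prefix" "$4"
}

submit_job() {
  params="$(metastore_confs "$JDBC_DRIVER" "$JDBC_URL" "$DB_USER" "$DB_PASSWORD")"
  # Assumes the values contain no double quotes or backslashes (no JSON escaping is done).
  job_driver="$(printf '{"sparkSubmit":{"entryPoint":"%s","sparkSubmitParameters":"%s"}}' \
    "$ENTRY_POINT" "$params")"

  aws emr-serverless start-job-run \
    --application-id "$APPLICATION_ID" \
    --execution-role-arn "$EXECUTION_ROLE_ARN" \
    --job-driver "$job_driver"
}

# Nothing is submitted unless the script is called with --run.
if [ "${1:-}" = "--run" ]; then
  submit_job
fi
```

Passing the password as a plain Spark property keeps the sketch short, but it exposes the value in the job configuration; prefer whatever credential mechanism the referenced section describes.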