
Configure JWT in Hive Metastore (HMS) for OLAC in EMR

If you use Hive Metastore (HMS) with OLAC in Amazon EMR, you need to configure a JWT token in the HMS server so that, when tables are created, HMS can create the necessary folders and objects in Amazon S3. When Privacera is integrated with Hive Metastore, the calls that create these folders and objects go through the Privacera DataServer, which verifies that the user has the necessary permissions to create the objects in S3. This eliminates the need to grant excessive IAM permissions to HMS.

This requires a JWT token to be configured in the Hive Metastore (HMS) for OLAC.

⚠ Limitations

This feature does not control who can create tables in HMS. Make sure that every job submitted to the EMR cluster is trusted and that its code is reviewed before it runs, to avoid security issues.

Setup

You need to create a script that updates the JWT token in the HMS server. Upload the script to an S3 location that the EMR cluster can access. The script is then executed as an EMR step to update the JWT token in HMS.

Create a script to update the JWT token in HMS

To update the JWT token in the Hive Metastore (HMS), create a script named update_jwt_token_in_hms.sh with the content provided below, and upload it to S3 in a location that can be accessed by the EMR cluster.

Bash
#!/bin/bash
set -x

# Check if an argument is provided
if [ "$#" -eq 0 ]; then
  echo "No argument provided. Usage: ./update_jwt_token_in_hms.sh <jwt-token>"
  exit 1
fi

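# Exported so that the root shell started by "sudo -E" can read them
export hcat_hive_site=/etc/hive-hcatalog/hcat-conf/hive-site.xml
export hcat_env=/etc/hive-hcatalog/conf/hcat-env.sh
export jwt_token="${1}"

# Log the target file only, not the token value
echo "Adding JWT token in ${hcat_hive_site}"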

# Restart the HMS service so it picks up the new configuration
restart_hive_hcatalog() {
  if systemctl is-active --quiet hive-hcatalog-server; then
    echo "hive-hcatalog-server is running. Restarting the service now..."
    sudo systemctl restart hive-hcatalog-server
  else
    echo "hive-hcatalog-server is not running."
  fi
}

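# Remove the closing </configuration> tag, append the JWT property,
# then write the closing tag again.
update_jwt_token() {
  sudo sed -i "s/<\/configuration>//g" "${hcat_hive_site}"
  # The single-quoted body is expanded by the root shell, which sees the
  # exported variables because of "sudo -E".
  sudo -E bash -c 'cat <<EOF >>"${hcat_hive_site}"
<property>
<name>privacera.ds.jwt.auth.token</name>
<value>${jwt_token}</value>
</property>

</configuration>
EOF'
}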

# Point HIVE_CONF_DIR at the hcat-conf directory if it is not already set there
update_hcat_env() {
  if [ -f "${hcat_env}" ]; then
    if ! grep -qxF 'export HIVE_CONF_DIR=/etc/hive-hcatalog/hcat-conf' "${hcat_env}"; then
      echo '' | sudo tee -a "${hcat_env}"
      echo 'export HIVE_CONF_DIR=/etc/hive-hcatalog/hcat-conf' | sudo tee -a "${hcat_env}"
      echo "Added HIVE_CONF_DIR in ${hcat_env}"
    fi
    sudo cat "${hcat_env}" 1>&2
  fi
}

update_jwt_token
update_hcat_env
restart_hive_hcatalog
echo "Successfully updated jwt token in ${hcat_hive_site}"
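
The script above appends a new property block every time it runs, so running the step a second time (for example, to rotate the token) would leave two privacera.ds.jwt.auth.token entries in hive-site.xml. The following is a minimal sketch of one way to avoid that, not part of the script above. The function name upsert_jwt_property is hypothetical, and the sketch assumes GNU sed, the four-line property layout written by update_jwt_token, and a token made only of the usual JWT characters (letters, digits, "-", "_", "."). It works on any file the caller can write; on the cluster, the sed and cat calls would need sudo as in the original script.

```shell
#!/bin/bash
# Hypothetical helper: set the JWT property without duplicating it.
PROP_NAME='privacera.ds.jwt.auth.token'

upsert_jwt_property() {
  local file="$1" token="$2"
  if grep -qF "<name>${PROP_NAME}</name>" "${file}"; then
    # Property already present: rewrite the <value> line that follows <name>.
    sed -i "/<name>${PROP_NAME}<\/name>/{n;s|<value>.*</value>|<value>${token}</value>|}" "${file}"
  else
    # Property missing: drop the closing tag, append the block, close again.
    sed -i 's|</configuration>||g' "${file}"
    cat >>"${file}" <<EOF
<property>
<name>${PROP_NAME}</name>
<value>${token}</value>
</property>

</configuration>
EOF
  fi
}
```

Calling upsert_jwt_property twice on a scratch copy of hive-site.xml with different tokens should leave a single property block holding the second value, which is a quick way to check the behavior before running it on the master node.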

Steps while creating the EMR cluster

To configure the JWT token in Hive Metastore (HMS) for OLAC, add the following EMR step to the emr-template.json file. This step updates the JWT token on the HMS server located on the EMR master node.

Replace the following placeholders in the code snippet below:

  • <path_to_your_file>: The S3 path where the update_jwt_token_in_hms.sh script is uploaded.
  • <UPDATE_JWT_TOKEN>: The JWT token that you want to configure in the HMS.
  • EMRCLUSTER: The logical ID of the EMR cluster resource in the template, which the step references through Ref.
JSON
"ConfigureJWTinHMS": {
    "Type": "AWS::EMR::Step",
    "Properties": {
      "ActionOnFailure": "CONTINUE",
      "HadoopJarStep": {
        "Args": [
          {
            "Fn::Sub": "s3://<path_to_your_file>/update_jwt_token_in_hms.sh"
          },
          {
            "Fn::Sub":"<UPDATE_JWT_TOKEN>"
          }
        ],
        "Jar": {
          "Fn::Sub": "s3://us-east-1.elasticmapreduce/libs/script-runner/script-runner.jar"
        }
      },
      "Name": "ConfigureJWTinHMS",
      "JobFlowId": {
        "Ref": "EMRCLUSTER"
      }
    }
}
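
For a cluster that is already running, the same two actions (uploading the script and running it through script-runner.jar) can also be driven from the AWS CLI. The sketch below is an illustration under assumptions rather than a documented Privacera procedure: the variable names SCRIPT_S3_URI, CLUSTER_ID, JWT_TOKEN, and DRY_RUN and the helper functions are chosen for this example, and the placeholder values must be replaced as described above. By default it only prints the commands, so it can be reviewed before anything is executed.

```shell
#!/bin/bash
# Sketch: upload the script and submit it as a step to a running cluster.
# All variable names and defaults here are placeholders for this example.
SCRIPT_S3_URI="${SCRIPT_S3_URI:-s3://<path_to_your_file>/update_jwt_token_in_hms.sh}"
CLUSTER_ID="${CLUSTER_ID:-<cluster-id>}"
JWT_TOKEN="${JWT_TOKEN:-<UPDATE_JWT_TOKEN>}"
RUNNER_JAR="s3://us-east-1.elasticmapreduce/libs/script-runner/script-runner.jar"

# Print the command instead of running it unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

deploy_jwt_step() {
  # Copy the local script to the S3 location the cluster can read.
  run aws s3 cp update_jwt_token_in_hms.sh "${SCRIPT_S3_URI}"
  # Submit the same script-runner step that the template defines.
  run aws emr add-steps --cluster-id "${CLUSTER_ID}" --steps \
    "Type=CUSTOM_JAR,Name=ConfigureJWTinHMS,ActionOnFailure=CONTINUE,Jar=${RUNNER_JAR},Args=[${SCRIPT_S3_URI},${JWT_TOKEN}]"
}
```

Running deploy_jwt_step prints the two aws commands. Setting DRY_RUN=0 executes them, which requires the AWS CLI and credentials that are allowed to write to the bucket and add steps to the cluster. Note that the dry-run output includes the token value, so treat it like any other secret.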
