Configure a Bootstrap Script to Retrieve a JWT Token for OLAC in EMR

You can configure a bootstrap script, with arguments, that the Privacera Spark Plugin executes during its initialization phase. This mechanism lets you retrieve a JWT token dynamically (for example, by invoking a secure CLI or service) and write it to a designated file on the driver node.

This approach eliminates the need to supply the JWT token manually when launching Spark jobs; the token is fetched and made available automatically at runtime. It improves security and allows user-specific tokens to be generated without hardcoding sensitive values in the Spark configuration.

Prerequisites

Before configuring the bootstrap action, ensure that the script you intend to execute is available and accessible on the EMR master node.

The script should be pre-installed or placed on the master node before starting the Spark session.
It must be located in a path accessible by the user running the Spark job (e.g., /home/hadoop/download_jwt_token.sh or /tmp/download_jwt_token.sh).
The script must have execute permissions (e.g., use chmod +x <script>).
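
As an illustration of what such a script might look like, the following is a minimal sketch, not a script shipped by Privacera. The function names fetch_token and write_token, the optional second argument, and the dummy token value are all assumptions made for this example; replace the body of fetch_token with your own secure CLI or service call.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of download_jwt_token.sh (illustrative only).
set -euo pipefail

# Placeholder: replace the body with your own secure CLI or service call.
# Assumption: it prints a JWT for the given user on stdout.
fetch_token() {
  local user="$1"
  printf 'dummy.jwt.for-%s\n' "$user"
}

# Fetch a token for the user and write it to the token file.
write_token() {
  local user="$1" token_file="$2" token
  token="$(fetch_token "$user")"
  if [ -z "$token" ]; then
    echo "empty token for user $user" >&2
    return 1
  fi
  # Restrict permissions on a newly created file to the owner.
  ( umask 077; printf '%s' "$token" > "$token_file" )
}

# Usage: download_jwt_token.sh <user> [token_file]
# The default path matches the token file used in the example on this page.
if [ "$#" -ge 1 ]; then
  write_token "$1" "${2:-/home/hadoop/token.txt}"
fi
```

The script exits with a non-zero status if the token cannot be obtained, so a failure is visible to whatever process launched it.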

Configuration

To configure the bootstrap script, set the following properties when starting a Spark session in EMR:

Note

If the bootstrap script requires multiple arguments, enclose the entire command in single quotes, as shown in the example below.

Bash
--conf spark.hadoop.privacera.olac.bootstrap.command='</path/to/bootstrap_script> <arg1> <arg2>' \
--conf spark.hadoop.privacera.jwt.token=</path/to/jwt_token_file>

Example:

Bash
spark-shell \
  --conf spark.hadoop.privacera.olac.bootstrap.command='/home/hadoop/download_jwt_token.sh test_jwt_user' \
  --conf spark.hadoop.privacera.jwt.token=/home/hadoop/token.txt
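
The reason for the single quotes in the note above is ordinary shell word splitting, which can be checked without starting Spark. In the sketch below, count_args is a hypothetical helper introduced only for this demonstration; it prints how many arguments the shell passed to it.

```bash
#!/usr/bin/env bash
# count_args is a hypothetical helper: it prints the number of arguments received.
count_args() { echo "$#"; }

# Unquoted: the shell splits the value at the space, so the script argument
# becomes a separate word that is no longer part of the --conf value.
count_args --conf spark.hadoop.privacera.olac.bootstrap.command=/home/hadoop/download_jwt_token.sh test_jwt_user
# prints 3

# Quoted: the whole command stays inside a single --conf value.
count_args --conf spark.hadoop.privacera.olac.bootstrap.command='/home/hadoop/download_jwt_token.sh test_jwt_user'
# prints 2
```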

Bootstrap Script Execution Flow

During plugin initialization, the Privacera Spark Plugin reads the bootstrap command from the Spark configuration.
It then executes the command using a Java ProcessBuilder, which runs the script in a separate process.
If the script fails to execute, the plugin logs an error message to the console or log files and returns a meaningful exit code. This makes it easier to identify and troubleshoot issues.
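
When troubleshooting, it can help to run the bootstrap command by hand before launching Spark and to inspect its exit code and the resulting token file. The following is a sketch under that assumption; check_bootstrap is a hypothetical helper, and running the command from an interactive shell only approximates how the plugin launches it through ProcessBuilder.

```bash
#!/usr/bin/env bash
# check_bootstrap is a hypothetical helper: it runs a bootstrap command and
# verifies that it succeeded and left a non-empty token file behind.
# Usage: check_bootstrap <token_file> <command> [args...]
check_bootstrap() {
  local token_file="$1"
  shift
  local rc=0
  "$@" || rc=$?
  if [ "$rc" -ne 0 ]; then
    echo "bootstrap command failed with exit code $rc" >&2
    return "$rc"
  fi
  if [ ! -s "$token_file" ]; then
    echo "token file $token_file is missing or empty" >&2
    return 1
  fi
  echo "ok: token written to $token_file"
}
```

With the paths from the example on this page, the check would be invoked as check_bootstrap /home/hadoop/token.txt /home/hadoop/download_jwt_token.sh test_jwt_user.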
