JWT Token User Identity

Overview

This feature allows the use of JWT tokens to carry the user identity information required by Privacera to enforce access control. This works for certain connectors or use-cases where the data source may not be able to pass the user identity reliably to Privacera.

Connectors

The following connectors support the use of JWT tokens to carry user identity information:

  • OLAC connectors
    • AWS EMR (on EC2) Spark OLAC connector without Kerberos - JWT token user identity is the only supported way to enforce access control in a non-Kerberos EMR (on EC2) cluster.
    • AWS EMR-Serverless Spark OLAC connector without Lake Formation - JWT token user identity allows you to use Privacera for access control in a non-Lake Formation EMR-Serverless cluster without using IAM roles for user identity.
    • Databricks Standard Cluster OLAC connector - JWT token user identity is an additional way to pass user identity to Privacera for access control if you don't want to use the logged-in user identity.
    • Apache Spark on EKS OLAC connector - JWT token user identity is the only supported way to enforce access control in Apache Spark on EKS cluster.
  • FGAC connectors
    • Databricks High Concurrency Cluster FGAC connector - JWT token user identity is an additional way to pass user identity to Privacera for access control if you don't want to use the logged-in user identity.

Supported Deployments

  • PrivaceraCloud
  • Self Managed Deployment
  • PrivaceraCloud Data-plane Deployment

Prerequisites

Your identity provider (IdP) must be able to generate JWT tokens. The JWT token is signed by your IdP and contains the user identity information. The user must be configured in Privacera with the same username. The public key of the IdP is used to validate the JWT token. It is either configured statically in Privacera or fetched dynamically from a JWKS endpoint that is configured in Privacera.

For the OLAC use case, you need to have Privacera Dataserver configured and running. The additional configuration to validate JWT tokens is added to it.

Sample Flow for OLAC

sequenceDiagram
    participant User
    participant IdentityProvider
    participant ComputeEnv as Compute Env + Privacera OLAC Plugin 
    participant PrivaceraDataServer
    participant CloudStorage

    User->>IdentityProvider: 1. Request JWT token
    IdentityProvider-->>User: 2. Provide JWT token
    User->>ComputeEnv: 3. Pass JWT token
    ComputeEnv->>PrivaceraDataServer: 4. Send JWT token
    PrivaceraDataServer->>PrivaceraDataServer: 5. Validate JWT token using static key or key from JWK endpoint
    PrivaceraDataServer->>PrivaceraDataServer: 6. Generate Signed URL/STS token
    PrivaceraDataServer-->>ComputeEnv: 7. Provide Signed URL/STS token
    ComputeEnv->>CloudStorage: 8. Access data using Signed URL/STS token
    CloudStorage-->>ComputeEnv: 9. Data retrieved

Diagram Explanation

  1. Request JWT Token: The user requests a JWT token from the Identity Provider (IdP).
  2. Provide JWT Token: The IdP provides the JWT token to the user.
  3. Pass JWT Token: The user passes the JWT token to the compute environment.
  4. Send JWT Token: The compute environment sends the JWT token to Privacera DataServer.
  5. Validate JWT Token: Privacera DataServer validates the JWT token signature using the IdP public key, which is either statically configured or obtained dynamically from the IdP's JWKS endpoint.
  6. Generate Signed URL/STS Token: Privacera DataServer generates a Signed URL or STS token.
  7. Provide Signed URL/STS Token: Privacera DataServer provides the Signed URL or STS token to the compute environment.
  8. Access Data: The compute environment accesses data from cloud storage using the Signed URL or STS token.
  9. Data Retrieved: The data is retrieved from the cloud storage and provided to the compute environment.

Sample Flow for FGAC

sequenceDiagram
    participant User
    participant IdentityProvider
    participant ComputeEnv as Compute Env + Privacera FGAC Plugin 
    participant CloudStorage

    User->>IdentityProvider: 1. Request JWT token
    IdentityProvider-->>User: 2. Provide JWT token
    User->>ComputeEnv: 3. Pass JWT token
    ComputeEnv->>ComputeEnv: 4. Privacera FGAC plugin validates JWT token using static key or key from JWK endpoint
    ComputeEnv->>ComputeEnv: 5. Privacera FGAC plugin uses identity to enforce access control
    ComputeEnv->>CloudStorage: 6. Access data using Compute Env native permissions (IAM role)
    CloudStorage-->>ComputeEnv: 7. Data retrieved

Diagram Explanation

  1. Request JWT Token: The user requests a JWT token from the Identity Provider (IdP).
  2. Provide JWT Token: The IdP provides the JWT token to the user.
  3. Pass JWT Token: The user passes the JWT token to the compute environment.
  4. Validate JWT Token: Privacera FGAC plugin validates the JWT token signature using the IdP public key, which is either statically configured or obtained dynamically from the IdP's JWKS endpoint.
  5. Enforce Access Control: Privacera FGAC plugin uses the user identity to enforce access control.
  6. Access Data: The compute environment accesses data from cloud storage using the compute environment's native permissions (IAM role).
  7. Data Retrieved: The data is retrieved from the cloud storage and provided to the compute environment.

Concepts

JWT Token Format

A JSON Web Token (JWT) consists of three Base64URL-encoded strings separated by dots (.). These three parts are the header, the payload, and the signature. The header and payload are JSON objects, and the signature is computed over the header and payload using the issuer's signing key. The signature is used to confirm the identity of the issuer and the integrity of the JWT token.

The header contains the algorithm used to sign the JWT token. An example JWT header JSON is shown below. All the values are examples and should not be used as is.

JSON
{
  "alg": "RS256",
  "typ": "JWT",
  "kid": "1234567890"
}

The fields in the header are as follows:

  1. The alg field is the algorithm used to sign the JWT token. Privacera supports only RSA256 and ECDSA256 algorithms for JWT token signature, which correspond to RS256 and ES256 as values of this field.
  2. The typ field is the type of the token. This is a literal value and is always JWT.
  3. The kid field is the key ID of the key pair used to sign the JWT token. It identifies the public key needed to verify the signature. This is an optional field. It is present if a JWKS endpoint is used to fetch the public key.

The payload contains the claims. An example JWT payload JSON is shown below. All the values are examples and should not be used as is.

JSON
{
  "iss": "https://testidp.example.com/issuer/websec",
  "sub": "infra_test_user",
  "iat": 1721223184,
  "exp": 1721283133,
  "aud": "https://dataserver.example.com",
  "scope": [
    "infra_test_group"
  ]
}
The fields in the payload are as follows:

  1. The iss field is the issuer of the JWT token. This value is configured in Privacera so that it can be used to obtain the configuration for validating the JWT token. This is a mandatory field. Typically, it is in the format of a URL, but it is a literal value and no connection attempt will be made to this URL.
  2. The sub field is the subject of the JWT token. This is the user identity that Privacera should use to enforce access control. This is a mandatory field. You can configure another key in the payload to be used as the user identity.
  3. The iat field is the issued at time of the JWT token. This is the time when the token was issued in Unix time. This is a mandatory field. The token is rejected if current time is before this time.
  4. The exp field is the expiration time of the JWT token. This is the expiry time of the token in Unix time. This is a mandatory field. The token is rejected if the current time is after this time.
  5. The aud field is the audience of the JWT token. This is the intended recipient of the token, which is Privacera Dataserver. This is a string that is configured in Privacera and Privacera will use the token only if it matches. This is an optional field.
  6. The scope field is used to carry an additional list of groups. This is an optional field. You can configure another key in the payload to be used as the group list. The groups can be either space separated or comma separated. These groups can be used to override the user's groups that are configured in Privacera or to add additional groups.

All other fields in the payload will be ignored by Privacera.
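
To see which header and payload fields a token actually carries, you can decode its first two parts locally. The following is a minimal sketch using only the Python standard library. The helper names `b64url_decode` and `peek_jwt` are illustrative and not part of Privacera. It does not verify the signature, so use it only for inspection.

```python
import base64
import json
from typing import Tuple


def b64url_decode(segment: str) -> bytes:
    """Decode a Base64URL segment, restoring the padding that JWTs strip."""
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def peek_jwt(token: str) -> Tuple[dict, dict]:
    """Return (header, payload) of a JWT WITHOUT verifying its signature."""
    header_b64, payload_b64, _signature_b64 = token.split(".")
    header = json.loads(b64url_decode(header_b64))
    payload = json.loads(b64url_decode(payload_b64))
    return header, payload
```

Calling `peek_jwt("<JWT_TOKEN>")` returns the two JSON objects, which lets you confirm that alg, kid, iss, sub, iat, and exp have the values you expect before passing the token to a job.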

Token Duration

For OLAC jobs, the token duration can be short as it is used only during the startup of the job to pass the identity to the Privacera Dataserver. For FGAC jobs, the token duration should be long enough to cover the duration of the job.
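
For FGAC jobs, one way to apply this guidance is to compare the token's exp claim with the expected end time of the job before submitting it. The sketch below is illustrative only. The function name and the `expected_job_sec` estimate are assumptions, not Privacera settings.

```python
import time


def covers_job(exp_epoch_sec, expected_job_sec, now=None):
    """Return True if the token expires no earlier than the expected end of the job.

    exp_epoch_sec    -- the exp claim of the token (Unix time, seconds)
    expected_job_sec -- your own estimate of the job duration (hypothetical input)
    now              -- current Unix time; defaults to the system clock
    """
    now = time.time() if now is None else now
    return exp_epoch_sec >= now + expected_job_sec
```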

JWT Signature Verification

JWT Signature verification is done using the public key of the IdP. The public key can be configured statically in Privacera or dynamically fetched from the IdP's JWKS endpoint.

Privacera supports only RSA256 and ECDSA256 algorithms for JWT token signature.

In case of dynamic public key configuration, the public key is fetched from the IdP's JWKS (JSON Web Key Set) endpoint using the kid field in the JWT header. The JWKS service returns a set of keys containing public keys used to verify the JWT token. The endpoint could return a set of keys or one specific key given the kid field in the JWT header.

Here is an example of JWKS with RSA JWK returned by the JWKS service -

JSON
{
"keys": [
  {
    "alg": "RS256",
    "kty": "RSA",
    "use": "sig",
    "x5c": [
      "your-x509-cert-chain"
    ],
    "n": "your-rsa-public-modulus",
    "e": "your-base64url-encoded-exponent",
    "kid": "your-unique-key-id",
    "x5t#S256": "your-unique-thumbprint-sha256",
    "exp": "your-expiration-time"
  }
]}
The keys are described below:

  1. alg - Algorithm used to sign the JWT token. It is either RS256 or ES256.
  2. kty - Key type. It is either RSA or EC.
  3. use - Use of the key. It is either sig or enc.
  4. x5c - X.509 certificate chain. It is an array of base64 encoded X.509 certificates.
  5. n - RSA modulus. It is a Base64URL encoded string.
  6. e - RSA exponent. It is a Base64URL encoded string.
  7. kid - Key ID. It is a string identifier.
  8. x5t#S256 - X.509 certificate SHA-256 thumbprint. It is a Base64URL encoded string.
  9. exp - Expiration time of the key. It is a Unix time.

Here is an example of JWKS with ECDSA JWK returned by the JWKS service -

JSON
{
"keys": [
  {
    "kty": "EC",
    "crv": "P-256",
    "x": "your-x-coordinate",
    "y": "your-y-coordinate",
    "kid": "your-unique-key-id",
    "exp": "your-expiration-time"
  }
]}
The keys are described below:

  1. kty - Key type. It is either RSA or EC.
  2. crv - Curve used for the key. It is a string.
  3. x - X coordinate of the key. It is a Base64URL encoded string.
  4. y - Y coordinate of the key. It is a Base64URL encoded string.
  5. kid - Key ID. It is a string identifier.
  6. exp - Expiration time of the key. It is a Unix time.

The Privacera Dataserver obtains the key from the JWKS service endpoint when a JWT token with a key ID is received. This key is cached for its expiration duration.
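
The lookup by kid can be sketched as follows. This is an illustration of the general JWKS pattern using only the Python standard library, not Privacera's implementation, and the helper names are hypothetical. `find_jwk` picks the entry whose kid matches the JWT header, and `b64url_to_int` decodes integer fields such as the RSA n and e values.

```python
import base64


def find_jwk(jwks: dict, kid: str):
    """Return the first key in a JWKS document whose kid matches, else None."""
    for key in jwks.get("keys", []):
        if key.get("kid") == kid:
            return key
    return None


def b64url_to_int(value: str) -> int:
    """Decode a Base64URL big-endian integer such as an RSA 'n' or 'e' value."""
    padded = value + "=" * (-len(value) % 4)
    return int.from_bytes(base64.urlsafe_b64decode(padded), "big")
```

For example, the common RSA exponent is published as `"e": "AQAB"`, which `b64url_to_int` decodes to 65537.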

Using JWT Token User Identity Feature in Privacera

To use this feature you need to do the following:

  1. For OLAC supported connectors
    1. Configure Privacera Dataserver to use JWT tokens
    2. Configure EMR, Databricks or Apache Spark plugin to use JWT token
    3. At runtime, generate JWT token and pass it to the Spark job
  2. For FGAC supported connectors
    1. Configure the Databricks Spark plugin to use JWT token
    2. At runtime, generate JWT token and pass it to the Spark job

Using JWT Tokens

For OLAC supported connectors

The user passes the JWT token string to the Spark job in a Spark configuration variable. Here is an example:

Bash
spark-sql \
--conf "spark.hadoop.privacera.jwt.token.str=<JWT_TOKEN>" \
--conf "spark.hadoop.privacera.jwt.oauth.enable=true"

Token Visibility in logs

If the JWT token is passed as a Spark configuration variable on the command line, the value is redacted in the logs by Apache Spark (on EMR, Databricks, and standalone Apache Spark) because the variable name contains the word token.

For FGAC supported connectors

The user copies the JWT token string to a file and passes the file path to the Spark job in a Spark configuration variable. The methodology differs because FGAC clusters support SparkSQL, where it is not possible to pass the JWT token in a Spark configuration variable.

Global User

When this feature is used on an FGAC cluster, the logged-in user identity is not considered and everyone is treated as the user in the JWT token. This is only recommended for job clusters where you want to enforce FGAC.

Here is an example -

Properties
spark.hadoop.privacera.jwt.oauth.enable true
spark.hadoop.privacera.jwt.token /tmp/jwttoken.dat

You can copy the JWT token file to Spark cluster using the following steps:

Python
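# Must match the path set in spark.hadoop.privacera.jwt.token
file_path = "/tmp/jwttoken.dat"
token = "<jwt_token>"
# The with block closes the file even if the write fails
with open(file_path, "w") as token_file:
    token_file.write(token)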
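
Because the token file is a credential, you may want to restrict who can read it. The following variant is a sketch under the assumption that the process reading the file runs as the same OS user that writes it. Verify this for your cluster before using it. The function name `write_token_file` is illustrative.

```python
import os


def write_token_file(path: str, token: str) -> None:
    """Write the token so that only the owning OS user can read or modify it."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC
    # 0o600 is applied when the file is created
    fd = os.open(path, flags, 0o600)
    with os.fdopen(fd, "w") as token_file:
        token_file.write(token)
    # Also tighten permissions if the file already existed with a wider mode
    os.chmod(path, 0o600)
```

For example, `write_token_file("/tmp/jwttoken.dat", "<jwt_token>")` writes the file referenced by spark.hadoop.privacera.jwt.token in the properties above.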

Configuring Privacera for JWT Token User Identity

You need to enable and configure the JWT token User Identity feature in Privacera. This is a common configuration for

  • OLAC and FGAC connectors on Self Managed and Data Plane deployments
  • OLAC connectors on PrivaceraCloud

The configuration maps the token issuer value to a set of configurations that are used to validate the JWT token. You can configure multiple token issuers in Privacera.
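
Conceptually, the mapping works like a lookup keyed on the iss claim. The sketch below illustrates the idea only and is not Privacera's code. The helper names are hypothetical, the issuer values are sample values, and the exact string comparison is an assumption based on the issuer being treated as a literal value.

```python
JWT_CONFIGS = [
    {"index": 0, "issuer": "https://your-idp-domain.com/websec1", "userKey": "sub"},
    {"index": 1, "issuer": "https://your-idp-domain.com/websec2", "userKey": "client_id"},
]


def config_for_issuer(configs, iss):
    """Return the configuration whose issuer exactly matches the token's iss claim."""
    for config in configs:
        if config["issuer"] == iss:
            return config
    return None


def user_from_claims(config, claims):
    """Read the username from the payload claim named by userKey."""
    # Falls back to client_id when userKey is not set (assumed default)
    return claims.get(config.get("userKey", "client_id"))
```

A token whose iss value matches no configured issuer yields `None` here, which corresponds to a token that cannot be validated.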

Setup for JWT public key configuration

On the host where Privacera Manager is installed, do the following steps:

Bash
cd ~/privacera/privacera-manager
cp -n config/sample-vars/vars.jwt-auth.yaml config/custom-vars
vi config/custom-vars/vars.jwt-auth.yaml
Edit the file and modify the JWT_CONFIGURATION_LIST as given in next section.

If you are using static public key configuration, you need the public key in PEM format in a file. All such public key files should be copied to the privacera/privacera-manager/config/custom-vars directory.

After all the changes are done, run the Privacera Manager by following these steps.

Fallback for Databricks OLAC and FGAC connectors

Copying the vars.jwt-auth.yaml to the config/custom-vars directory enables JWT User Identity for all OLAC and FGAC connectors. If you want to continue using the logged-in user identity for Databricks OLAC and FGAC connectors, set the following property by creating or editing the file config/custom-properties/privacera_spark_custom.properties.

Bash
vi ~/privacera/privacera-manager/config/custom-properties/privacera_spark_custom.properties
Add or edit this property in the file and set it to true.
Properties
privacera.jwt.dbx.login.user.fallback.enable=true

To enable JWT token User Identity in PrivaceraCloud, add the properties below to the s3 application.

Go to Settings >> Applications >> s3 and click Edit.

In the pop-up, click Access Management and navigate to the Advanced properties section.

Add the properties given in the next section and click the Save button.

Static public key configuration

For static public key configuration, here are some sample configurations that you can put in the vars.jwt-auth.yaml file. Typically you will have only one configuration, but in some cases you may have several. The meaning of these properties is explained in the Reference section.

YAML
JWT_CONFIGURATION_LIST:

  - index: 0
    issuer: "https://your-idp-domain.com/websec1"
    userKey: "sub"
    groupKey: "scope"
    parserType: "PING_IDENTITY"
    publickey: "jwttoken1.pub"
    audience: "https://dataserver.example.com"

  - index: 1
    issuer: "https://your-idp-domain.com/websec2"
    userKey: "client_id"
    groupKey: "scope"
    parserType: "KEYCLOAK"
    publickey: "jwttoken2.pub"

  - index: 2
    issuer: "https://your-idp-domain.com/websec2"
    userKey: "client_id"
    parserType: "KEYCLOAK"
    publickey: "jwttoken3.pub"
Properties
privacera.jwt.oauth.enable=true

privacera.jwt.0.token.issuer=https://your-idp-domain.com/websec1
privacera.jwt.0.token.publickey=<public_key_in_string_format>
privacera.jwt.0.token.userKey=sub
privacera.jwt.0.token.groupKey=scope
privacera.jwt.0.token.parserType=PING_IDENTITY

privacera.jwt.1.token.issuer=https://your-idp-domain.com/websec2
privacera.jwt.1.token.publickey=<public_key_in_string_format>
privacera.jwt.1.token.userKey=client_id
privacera.jwt.1.token.groupKey=scope
privacera.jwt.1.token.parserType=KEYCLOAK

Dynamic public key configuration

YAML
JWT_CONFIGURATION_LIST:
  - index: 0
    issuer: "https://example.com/issuer"
    userKey: "sub"
    groupKey: "scope"
    parserType: "PING_IDENTITY"

    pubKeyProviderEndpoint: "https://<JWKS-provider>/get_public_key?kid="
    pubKeyProviderAuthType: "BASIC"
    pubKeyProviderAuthUserName: "<username>"
    pubKeyProviderAuthTypePassword: "<password>"
    pubKeyProviderJsonResponseKey: "x5c"
    jwtTokenProviderKeyId: "kid"
Properties
privacera.jwt.oauth.enable=true

privacera.jwt.0.token.issuer=https://example.com/issuer
privacera.jwt.0.token.userKey=sub
privacera.jwt.0.token.groupKey=scope
privacera.jwt.0.token.parserType=PING_IDENTITY
privacera.jwt.0.token.publickey.provider.url=https://<JWKS-provider>/get_public_key?kid=
privacera.jwt.0.token.publickey.provider.auth.type=BASIC
privacera.jwt.0.token.publickey.provider.auth.username=<username>
privacera.jwt.0.token.publickey.provider.auth.password=<password>
privacera.jwt.0.token.provider.response.key=x5c
privacera.jwt.0.token.provider.key.id=kid

Dynamic public key configuration (Without Basic Authentication)

YAML
JWT_CONFIGURATION_LIST:
-   index: 0
    issuer: "https://example.com/issuer"
    userKey: "client_id"
    groupKey: "scope"
    parserType: "PING_IDENTITY"
    pubKeyProviderEndpoint: "https://my-sat-server/<api-to-get-public-key-by-kid>/"
    pubKeyProviderJsonResponseKey: "x5c"
    jwtTokenProviderKeyId: "kid"
Properties
privacera.jwt.oauth.enable=true

privacera.jwt.0.token.issuer=https://example.com/issuer
privacera.jwt.0.token.userKey=client_id
privacera.jwt.0.token.groupKey=scope
privacera.jwt.0.token.parserType=PING_IDENTITY
privacera.jwt.0.token.publickey.provider.url=https://<JWKS-provider>/get_public_key?kid=
privacera.jwt.0.token.provider.response.key=x5c
privacera.jwt.0.token.provider.key.id=kid

Reference for JWT_CONFIGURATION_LIST

Reference

These properties configure how Privacera validates the JWT token and reads its payload:

  1. index

    • Description: Index of the JWT configuration. This is a unique identifier for the JWT configuration.
    • Required: Yes
    • Supported Values: 0, 1, 2, 3 etc.
  2. issuer

    • Description: Issuer of the JWT Payload. This is a string identifier. The JWT tokens that contain this value in the iss field will be validated using this configuration.
    • Required: Yes
  3. subject

    • Description: Subject of the JWT Payload. This is a string identifier. The JWT tokens that contain this value in the sub field will be used to enforce access control.
    • Required: Optional
    • Sample Value: infra_test_user
  4. secret

    • Description: Shared secret key used to validate the JWT token signature. This is a string. It is applicable only when the JWT token is signed using the HS256 algorithm.
    • Required: Optional
    • Sample Value: mysecret
  5. userKey

    • Description: JWT Payload key for the username.
    • Required: Optional
    • Default: client_id
  6. groupKey

    • Description: JWT Payload key for the group name.
    • Required: Optional
    • Default: scope
  7. parserType

    • Description: Specifies how the scope or group is formatted. Choose one of the following values:
      • PING_IDENTITY: Use when scope/group is an array. Example:
        JSON
        {
            "scope": ["infra_test_group"]
        }
        
      • KEYCLOAK: Use when scope/group is space separated. Example:
        JSON
        {
            "scope": "infra_test_group1 infra_test_group2 infra_test_group3"
        }
        
    • Required: Yes
  8. audience

    • Description: Audience for whom the JWT token has been issued. This is a string identifier. The JWT tokens that contain this value in the aud field will be validated using this configuration.
    • Required: Optional
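
The two parserType formats described above differ only in how the group claim is shaped. The following sketch shows how a payload could be turned into a list of group names for each format. It is an illustration and not Privacera's parser. The function name `parse_groups` is hypothetical.

```python
def parse_groups(claims: dict, group_key: str = "scope",
                 parser_type: str = "PING_IDENTITY") -> list:
    """Return the group names carried in the claim named by group_key."""
    value = claims.get(group_key)
    if value is None:
        return []
    if parser_type == "PING_IDENTITY":
        # Groups arrive as a JSON array, e.g. ["infra_test_group"]
        if not isinstance(value, list):
            raise ValueError("PING_IDENTITY expects an array of groups")
        return list(value)
    if parser_type == "KEYCLOAK":
        # Groups arrive as one space separated string
        if not isinstance(value, str):
            raise ValueError("KEYCLOAK expects a space separated string")
        return value.split()
    raise ValueError("unsupported parserType: " + parser_type)
```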

For Static public key configuration

  1. publickey
    • Description: Name of the public key file that you copied in the previous steps (in this case, use the RS256 algorithm).
    • Required: Required only for Static public Key

For Dynamic Public Key, here are the additional properties. Note that we are providing both the Privacera Manager property | PrivaceraCloud property.

  1. pubKeyProviderEndpoint | privacera.jwt.0.token.publickey.provider.url

    • Description: URL of the API that returns the public key.
      • Format: https://my-sat-server/<api-to-get-public-key-by-kid>/ or https://my-sat-server/<api-to-get-public-key-by-kid>/?kid=
      • Privacera appends the <kid> to the end of this URL, so that it becomes https://my-sat-server/<api-to-get-public-key-by-kid>/<kid> or https://my-sat-server/<api-to-get-public-key-by-kid>/?kid=<kid>. The API should return the public key for the key ID (kid) specified in the JWT.
    • Required: Yes
  2. pubKeyProviderAuthType | privacera.jwt.0.token.publickey.provider.auth.type

    • Description: Authorization type as per API URL (BASIC/NONE)
    • Required: Optional
    • Default: NONE
  3. pubKeyProviderAuthUserName | privacera.jwt.0.token.publickey.provider.auth.username

    • Description: Username for JWKS Provider
    • Required: Required When pubKeyProviderAuthType=BASIC
  4. pubKeyProviderAuthTypePassword | privacera.jwt.0.token.publickey.provider.auth.password

    • Description: Password for JWKS Provider
    • Required: Required When pubKeyProviderAuthType=BASIC
  5. pubKeyProviderJsonResponseKey | privacera.jwt.0.token.provider.response.key

    • Description: JWKS Response JSON Key to get Public Key
    • Required: Yes
    • Default: x5c
  6. jwtTokenProviderKeyId | privacera.jwt.0.token.provider.key.id

    • Description: JWT Headers Key to get public key id to retrieve from JWKS Provider
    • Required: Yes
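
To check your JWKS endpoint settings by hand, it can help to reproduce the request that these properties describe. The sketch below builds the final URL by appending the kid to the configured endpoint, as described for pubKeyProviderEndpoint, and builds a standard HTTP Basic Authorization header for the BASIC auth type. The helper names are illustrative, and the exact request that Privacera sends is not shown in this document.

```python
import base64


def build_key_url(endpoint: str, kid: str) -> str:
    """Append the key ID to the configured endpoint to get the lookup URL."""
    return endpoint + kid


def basic_auth_header(username: str, password: str) -> dict:
    """Build a standard HTTP Basic Authorization header."""
    credentials = f"{username}:{password}".encode("utf-8")
    encoded = base64.b64encode(credentials).decode("ascii")
    return {"Authorization": "Basic " + encoded}
```

You can pass the resulting URL and header to any HTTP client to confirm that the endpoint returns the key for a given kid.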

Configuring OLAC and FGAC Connectors

Configuring AWS EMR (on EC2) and EMR-Serverless Spark OLAC connector

Open the vars.emr.yml file:

Bash
cd ~/privacera/privacera-manager
vi config/custom-vars/vars.emr.yml

Add following property to enable JWT for EMR:

YAML
EMR_JWT_OAUTH_ENABLE: "true"

Configuring Databricks Standard Cluster OLAC connector

No additional configuration is required for Databricks Standard Cluster OLAC connector when using Self Managed, Data Plane and PrivaceraCloud deployments.

Configuring Apache Spark on EKS OLAC connector

No additional configuration is required for Apache Spark on EKS OLAC connector when using Self Managed, Data Plane and PrivaceraCloud deployments.

Configuring Databricks High Concurrency Cluster FGAC connector

No additional configuration is required for Databricks High Concurrency Cluster FGAC connector when using Self Managed, Data Plane and PrivaceraCloud deployments.

End to end setup

We will walk you through an end to end setup using a Python script for generating the JWT token and a Python JWKS server. These utilities are meant to help you try the end to end flow. They should not be used in a production environment.

Generating JWT token using Python script

Create Python virtual environment and install libraries
  1. Create a folder to store the script and Python virtual environment.

    Bash
    mkdir -p ~/privacera/privacera-jwt
    cd ~/privacera/privacera-jwt
    

  2. Create a requirements file to download libraries required by the script. These are open source libraries commonly used for signing and JWT creation

    Bash
    vi requirements.txt
    
    Add the following content to the file.
    Bash
    cffi==1.15.1
    cryptography==40.0.2
    pycparser==2.21
    PyJWT==2.6.0
    Flask==2.1.0
    Flask-HTTPAuth==4.4.0
    Werkzeug==2.2.2
    

  3. Create a Python virtual environment. This is a one time step.

    Bash
    python3 -m venv venv
    

  4. Activate the virtual environment so that you can use it. This step is required every time you start a new shell.

    Bash
    source venv/bin/activate
    

  5. Install the required libraries. This is a one time step.

    Bash
    pip3 install -r requirements.txt
    

  6. You can deactivate the Python virtual environment when you are done.

Bash
deactivate
And re-activate it when you resume working,
Bash
cd ~/privacera/privacera-jwt
source ./venv/bin/activate

Generate RSA 256 and EC 256 key-pairs for test purpose
  1. We are going to generate an RSA 256 key-pair and an EC 256 key-pair. These are for test purposes, to show you how to use RSA and EC keys. Typically, you will use only one type of signing algorithm in your setup.

  2. Generate an RSA 256 key-pair. This will be used to sign the JWT token. The private key will be encrypted using a password. You need this if you want to sign the token using an RSA 256 key.

    Bash
    cd ~/privacera/privacera-jwt
    
    # encrypt the private key using known password
    openssl genrsa -des3 -out jwt-rsa256-key-encrypted.pem -passout pass:welcome1 2048 
    
    # Extract the public key which will be configured into Privacera 
    openssl rsa -in jwt-rsa256-key-encrypted.pem -outform PEM -pubout \
        -out jwt-rsa256-public.pem -passin pass:welcome1
    

  3. Generate an EC 256 key pair. This will be used to sign the JWT token. The private key will be encrypted using a password. You need this if you want to sign the token using an EC 256 key.

    Bash
    cd ~/privacera/privacera-jwt
    
    # Generate ECDSA key pair
    openssl ecparam -name prime256v1 -genkey -noout -out private.ec.key
    
    # Convert the private key to PKCS8 encrypted format using password
    openssl pkcs8 -topk8 -in private.ec.key -out jwt-ec256-key-encrypted.pem \
        -passout pass:welcome1
    
    # Extract the public key
    openssl ec -in jwt-ec256-key-encrypted.pem -pubout -out jwt-ec256-public.pem \
        -passin pass:welcome1
    rm private.ec.key
    

Create the Python script for generating JWT token
  1. Create a Python script to generate the JWT token. This script will generate a JWT token and sign it using the private key from the keypairs that we have generated. It will take a command line argument to choose the keypair to use.

    Bash
    cd ~/privacera/privacera-jwt
    
    # If this is a new shell window, then activate the virtual environment
    source venv/bin/activate
    
    vi privacera_jwt.py
    
    Python
    from sys import argv
    
    import jwt
    import time
    from cryptography.hazmat.primitives import serialization
    from cryptography.hazmat.backends import default_backend
    
    if len(argv) > 1 and argv[1] == "rsa":
        key_pem_file = "./jwt-rsa256-key-encrypted.pem"
        issuer = "https://idp.example.com/issuer1"
        kid = "kid_1"
    elif len(argv) > 1 and argv[1] == "ec":
        key_pem_file = "./jwt-ec256-key-encrypted.pem"
        issuer = "https://idp.example.com/issuer2"
        kid = "kid_2"
    else:
        print("Usage: python privacera_jwt.py rsa|ec")
        exit(1)
    
    # duration is 10 days
    duration_sec = 10 * 24 * 60 * 60
    expiry_epoch_sec = int(time.time()) + duration_sec
    
    # username in the token
    user_name = "infra_test_user"
    
    token = {
        "scope": [
            "infra_test_group_1",
            "infra_test_group_2",
        ],
        "iss": issuer,
        "aud": "privacera.dataserver",
        "sub": user_name,
        "iat": int(time.time()),
        "exp": expiry_epoch_sec
    }
    
    print(f"token={token}")
    
    # read the private key
    with open(key_pem_file, mode="rb") as private_file:
        pem_bytes = private_file.read()
    
    # passphrase for the private key
    passphrase = b"welcome1"
    
    private_key = serialization.load_pem_private_key(
        pem_bytes, password=passphrase, backend=default_backend()
    )
    if argv[1] == "rsa":
        encoded = jwt.encode(token, private_key, algorithm="RS256", headers={"kid": kid})
        print(f"encoded value using RSA 256 key:\n{encoded}")
    elif argv[1] == "ec":
        encoded = jwt.encode(token, private_key, algorithm="ES256", headers={"kid": kid})
        print(f"encoded value using EC 256 key:\n{encoded}")
    

  2. Run the script and copy the JWT token that is printed. You can generate the token using RSA or EC key by passing the argument to the script.

    Bash
    python3 privacera_jwt.py rsa
    
    Bash
    python3 privacera_jwt.py ec
    
    Use the encoded value from the output of the script as the JWT_TOKEN in the EMR, Databricks or Apache Spark cluster

  3. Static key configuration: Copy the public key files to the Privacera Manager configuration directory.

    Bash
    cp ~/privacera/privacera-jwt/jwt-rsa256-public.pem \
        ~/privacera/privacera-manager/config/custom-properties
    
    cp ~/privacera/privacera-jwt/jwt-ec256-public.pem \
        ~/privacera/privacera-manager/config/custom-properties
    

Configure static key JWT configuration in Privacera Manager or PrivaceraCloud
  1. Configure the static JWT configuration in Privacera

Use the following properties in the vars.jwt-auth.yaml file to configure the public keys for Self Managed and Data Plane.

YAML
JWT_CONFIGURATION_LIST:

  - index: 0
    issuer: "https://idp.example.com/issuer1"
    audience: "privacera.dataserver"
    userKey: "sub"
    groupKey: "scope"
    parserType: "PING_IDENTITY"
    publickey: "jwt-rsa256-public.pem"

  - index: 1
    issuer: "https://idp.example.com/issuer2"
    audience: "privacera.dataserver"
    userKey: "sub"
    groupKey: "scope"
    parserType: "PING_IDENTITY"
    publickey: "jwt-ec256-public.pem"
Run Privacera Manager and after Privacera Dataserver restarts, you can test the JWT configuration by using the instructions in the next section. You can configure OLAC plugins and test the end to end flow. You can configure the FGAC plugin using the FGAC configuration generated by Privacera Dataserver to test the end to end flow.

Use the following properties in the s3 application in PrivaceraCloud.

Properties
privacera.jwt.oauth.enable=true

privacera.jwt.0.token.issuer=https://idp.example.com/issuer1
privacera.jwt.0.token.publickey=<public_key_in_string_format>
privacera.jwt.0.token.userKey=sub
privacera.jwt.0.token.groupKey=scope
privacera.jwt.0.token.parserType=PING_IDENTITY
privacera.jwt.0.token.audience=privacera.dataserver

privacera.jwt.1.token.issuer=https://idp.example.com/issuer2
privacera.jwt.1.token.publickey=<public_key_in_string_format>
privacera.jwt.1.token.userKey=sub
privacera.jwt.1.token.groupKey=scope
privacera.jwt.1.token.parserType=PING_IDENTITY
privacera.jwt.1.token.audience=privacera.dataserver

Test JWT configuration in Privacera Dataserver
  1. Now you should be able to test OLAC using the JWT token and Dataserver endpoint by using the following curl command. This is available for Self Managed and Data Plane in Privacera 9.3.0.1 onwards. Not available on PrivaceraCloud.
    Bash
    curl -v -X POST \
    -H 'Content-Type: application/json' \
    -d '{"tokenStr": "<your-jwt>"}' \
    https://DATASERVER_URL_IN_YOUR_ENV/services/jwt/validate
    
    The response will be a JSON object with the status of the token validation.

HTTP status codes

Text Only
200 - the request was processed successfully
400 - the request payload is empty or the token value in the request is empty

Content type: application/json

Response format for Valid Token

JSON
{
  "statusCode": 0,
  "isValid": true,
  "expiresOn": "<Date>",
  "algorithm": "<algorithm-type>",
  "message": "<message>"
}

Response format for Invalid Token

JSON
{
  "statusCode": 1,
  "isValid": false,
  "message": "<message>"
}

Response format for an empty payload or an empty JWT token

JSON
{
  "statusCode": 2,
  "isValid": false,
  "message": "<message>"
}
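
The same check can be scripted. The following is a minimal sketch using only the Python standard library. It assumes the endpoint path and the `tokenStr` field from the curl command, and the `statusCode` values from the response formats in this section. The function names are hypothetical, and `DATASERVER_URL_IN_YOUR_ENV` remains a placeholder for your own Dataserver URL.

```python
import json
import urllib.error
import urllib.request

# Meaning of statusCode, taken from the response formats documented in this section.
STATUS_MEANING = {0: "valid token", 1: "invalid token", 2: "empty payload or empty token"}


def describe_validation(body):
    """Summarize a parsed /services/jwt/validate response body in one line."""
    meaning = STATUS_MEANING.get(body.get("statusCode"), "unknown statusCode")
    summary = f"{meaning}: {body.get('message', '')}"
    if body.get("isValid"):
        summary += f" (algorithm={body.get('algorithm')}, expiresOn={body.get('expiresOn')})"
    return summary


def validate_token(dataserver_url, token):
    """POST the token to the Dataserver test endpoint and return the parsed JSON body."""
    request = urllib.request.Request(
        f"{dataserver_url}/services/jwt/validate",
        data=json.dumps({"tokenStr": token}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(request) as response:
            return json.load(response)
    except urllib.error.HTTPError as error:
        # Assumption: a 400 response carries the statusCode 2 JSON body.
        return json.load(error)
```

For example, `print(describe_validation(validate_token("https://DATASERVER_URL_IN_YOUR_ENV", "<your-jwt>")))` prints a one-line summary of the validation result.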

Configure and test OLAC or FGAC plugin using the generated JWT token
  1. You can configure the OLAC and FGAC plugins and verify them using the generated JWT token. You need a user named 'infra_test_user' in Privacera, with access policies for that user in the privacera_s3 service repo for the OLAC plugin and in the privacera_hive service repo for the FGAC plugin.
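
Before testing the plugins, it can help to confirm that the token's claims line up with the configuration: the `iss` and `aud` claims should match the configured `issuer` and `audience`, and the claim named by `userKey` (`sub` in the examples) should hold the Privacera username. The following is a minimal sketch that decodes the payload without verifying the signature, so it is suitable for inspection only. The function names are hypothetical.

```python
import base64
import json


def decode_segment(segment):
    """Decode one base64url JWT segment (header or payload) into a dict."""
    padded = segment + "=" * (-len(segment) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def check_claims(token, issuer, audience, user_key="sub"):
    """Return a list of mismatches between the token claims and the expected values.

    The signature is NOT verified here; this is for inspection only.
    """
    claims = decode_segment(token.split(".")[1])
    problems = []
    if claims.get("iss") != issuer:
        problems.append(f"iss is {claims.get('iss')!r}, expected {issuer!r}")
    # The aud claim may be a single string or a list of strings.
    token_aud = claims.get("aud")
    audiences = token_aud if isinstance(token_aud, list) else [token_aud]
    if audience not in audiences:
        problems.append(f"aud is {token_aud!r}, expected {audience!r}")
    if not claims.get(user_key):
        problems.append(f"claim {user_key!r} is missing, so no username can be derived")
    return problems
```

An empty list from `check_claims(token, "https://idp.example.com/issuer1", "privacera.dataserver")` means the claims match the first example configuration.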
Create Python script to run as JWKS server
  1. To test the dynamic public key configuration, you can use the following script to serve the public keys through a JWKS endpoint.

    Bash
    cd ~/privacera/privacera-jwt
    
    # If this is a new shell window, then activate the virtual environment
    source venv/bin/activate
    
    vi jwks_server.py
    
    Paste the following code in the file.
    Python
    import base64
    import json
    
    from datetime import datetime, timedelta
    from cryptography.hazmat.backends import default_backend
    from cryptography.hazmat.primitives import serialization, hashes
    import cryptography.hazmat.primitives.asymmetric.rsa as rsa
    import cryptography.hazmat.primitives.asymmetric.ec as ec
    
    from flask import Flask, request, jsonify
    from flask_httpauth import HTTPBasicAuth
    
    
    def compute_sha256_thumbprint(public_key_pem):
        # Load the PEM encoded public key
        public_key = serialization.load_pem_public_key(
            public_key_pem,
            backend=default_backend()
        )
    
        # Compute SHA-256 hash of the DER encoding of the public key
        der_encoding = public_key.public_bytes(
            encoding=serialization.Encoding.DER,
            format=serialization.PublicFormat.SubjectPublicKeyInfo
        )
        sha256_hash = hashes.Hash(hashes.SHA256(), backend=default_backend())
        sha256_hash.update(der_encoding)
        thumbprint = sha256_hash.finalize()
    
        return thumbprint
    
    
    def int_to_base64(n):
        # Determine the number of bytes required to represent the integer
        num_bytes = (n.bit_length() + 7) // 8
    
        # Convert the integer to bytes
        int_bytes = n.to_bytes(num_bytes, byteorder='big')
    
        # Encode the bytes using base64
        base64_bytes = base64.b64encode(int_bytes)
    
        # Convert the base64 bytes to a string
        base64_string = base64_bytes.decode('utf-8')
    
        return base64_string
    
    
    def get_response(kid, pem_file_path, expiry_time_delta):
        # Read the public key from a PEM file
        with open(pem_file_path, "rb") as pem_file:
            public_key_pem = pem_file.read()
    
        # Remove BEGIN and END headers and footers
        pem_lines = public_key_pem.decode('utf-8').split('\n')
        pem_contents = (''.join(pem_lines)
                        .replace('-----BEGIN PUBLIC KEY-----', '')
                        .replace('-----END PUBLIC KEY-----', ''))
    
        # Load the PEM encoded public key
        public_key = serialization.load_pem_public_key(
            public_key_pem,
            backend=default_backend()
        )
    
        print(f"type={type(public_key)}")
    
        # Extract SHA-256 thumbprint
        thumbprint = compute_sha256_thumbprint(public_key_pem)
        thumbprint_hex = thumbprint.hex()
    
        print("PEM File (Single String without Headers and Footers):")
        print(pem_contents)
    
        # Check the type of the public key
        if isinstance(public_key, rsa.RSAPublicKey):
    
            # Extract modulus and exponent from the public key
            modulus = public_key.public_numbers().n
            exponent = public_key.public_numbers().e
    
            response = {
                "kty": "RSA",
                "use": "sig",
                "n": int_to_base64(modulus),
                "e": int_to_base64(exponent),
                "x5t#S256": thumbprint_hex,
                "x5c": [pem_contents],
                "kid": kid,
                "exp": (datetime.utcnow() + expiry_time_delta).timestamp(),
            }
    
            print(json.dumps(response, indent=4))
            return response
    
        elif isinstance(public_key, ec.EllipticCurvePublicKey):
            ec_numbers = public_key.public_numbers()
    
            # Extract x and y coordinates
            x = ec_numbers.x
            y = ec_numbers.y
    
            response = {
                "kty": "EC",
                "use": "sig",
                "x": int_to_base64(x),
                "y": int_to_base64(y),
                "x5t#S256": thumbprint_hex,
                "x5c": [pem_contents],
                "kid": kid,
                "exp": (datetime.utcnow() + expiry_time_delta).timestamp(),
            }
            return response
        else:
            return "Unknown"
    
    
    def main():
        USERNAME = 'admin'
        PASSWORD = 'Welcome@123'
    
        kid_dict = {
            'kid_1': 'jwt-rsa256-public.pem',
            'kid_2': 'jwt-ec256-public.pem'
        }
    
        app = Flask(__name__)
        auth = HTTPBasicAuth()
    
        # Verify username and password for basic authentication
        @auth.verify_password
        def verify_password(username, password):
            return username == USERNAME and password == PASSWORD
    
        # GET API to retrieve public key by kid
        @app.route('/get_public_key', methods=['GET'])
        @auth.login_required
        def get_public_key():
            kid = request.args.get('kid')
            public_key = kid_dict.get(kid)
    
            if public_key is None:
                return jsonify({'error': 'Public key not found'}), 404
            else:
                return get_response(kid, public_key, timedelta(days=30))
    
        app.run(host="0.0.0.0", port=9090, debug=True)
    
    
    if __name__ == "__main__":
        print("starting main")
        main()
        print("ending main")
    
    Start the server and keep it running.
    Bash
    python3 jwks_server.py
    

  2. You can test this endpoint from another shell using the following curl commands.

    Bash
    curl -u 'admin:Welcome@123' 'http://localhost:9090/get_public_key?kid=kid_1'
    
    JSON
    {
      "keys": [
        {
          "e": "AQAB",
          "exp": 1724306625,
          "kid": "kid_1",
          "kty": "RSA",
          "n": "rTfaKZ...AgkWaJ0lbKSGfQ==",
          "use": "sig",
          "x5c": [
            "MIIBIjANBgkq...AQAB"
          ],
          "x5t#S256": "ebdef...21d279f863"
        }
      ]
    }   
    
    Bash
    curl -u 'admin:Welcome@123' 'http://localhost:9090/get_public_key?kid=kid_2'
    
    JSON
    {
      "keys": [
        {
          "exp": 1724306623,
          "kid": "kid_2",
          "kty": "EC",
          "use": "sig",
          "x": "QQWKgcN...CEahadqHgsc=",
          "x5c": [
            "MFkwEwYHKo...uknLaYRtXQw=="
          ],
          "x5t#S256": "b7d582f...91d88a07e0",
          "y": "mQL/B2CBV...iLpJy2mEbV0M="
        }
      ]
    }
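
In an RSA JWK, `n` is the modulus and `e` is the public exponent. The common exponent 65537 encodes to `AQAB`. The following is a minimal sketch for sanity-checking a response from the test server by decoding these fields back to integers. It accepts standard Base64 (which `int_to_base64` in the script produces) as well as the unpadded base64url form that the JWK specification uses. The function names and the 2048-bit default are assumptions for illustration.

```python
import base64


def b64_to_int(value):
    """Decode a Base64 or base64url big-endian integer, with or without padding."""
    normalized = value.replace("-", "+").replace("_", "/")
    padded = normalized + "=" * (-len(normalized) % 4)
    return int.from_bytes(base64.b64decode(padded), byteorder="big")


def check_rsa_jwk(jwk, min_modulus_bits=2048):
    """Return a list of problems found in the `n` and `e` fields of an RSA JWK."""
    exponent = b64_to_int(jwk["e"])
    modulus = b64_to_int(jwk["n"])
    problems = []
    if modulus.bit_length() < min_modulus_bits:
        problems.append(f"n is only {modulus.bit_length()} bits; expected the modulus here")
    if exponent.bit_length() > 32:
        problems.append(f"e is {exponent.bit_length()} bits; expected a small public exponent")
    return problems


print(b64_to_int("AQAB"))  # 65537
```

If `check_rsa_jwk` reports both problems for the same key, the two fields are most likely swapped.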
    

Configure Dynamic public key using the Python JWKS server endpoint
  1. You can now configure the dynamic public key configuration in Privacera. Remove the static public key files from the ~/privacera/privacera-manager/config/custom-properties folder, because Privacera Dataserver and the Databricks FGAC plugin will use the JWKS endpoint to get the public key that verifies the JWT signature.

Use the following properties in the vars.jwt-auth.yaml file to configure the public keys for Self Managed and Data Plane.

YAML
JWT_CONFIGURATION_LIST:

  - index: 0
    issuer: "https://idp.example.com/issuer1"
    audience: "privacera.dataserver"                
    userKey: "sub"
    groupKey: "scope"
    parserType: "PING_IDENTITY"
    pubKeyProviderEndpoint: "http://JWKS_HOST_IP:9090/get_public_key?kid="
    pubKeyProviderAuthType: "BASIC"
    pubKeyProviderAuthUserName: "admin"
    pubKeyProviderAuthTypePassword: "Welcome@123"
    pubKeyProviderJsonResponseKey: "x5c"
    jwtTokenProviderKeyId: "kid"


  - index: 1
    issuer: "https://idp.example.com/issuer2"
    audience: "privacera.dataserver"
    userKey: "sub"
    groupKey: "scope"
    parserType: "PING_IDENTITY"
    pubKeyProviderEndpoint: "http://JWKS_HOST_IP:9090/get_public_key?kid="
    pubKeyProviderAuthType: "BASIC"
    pubKeyProviderAuthUserName: "admin"
    pubKeyProviderAuthTypePassword: "Welcome@123"
    pubKeyProviderJsonResponseKey: "x5c"
    jwtTokenProviderKeyId: "kid"

Use the following properties in the s3 application in PrivaceraCloud.

Properties
privacera.jwt.oauth.enable=true

privacera.jwt.0.token.issuer=https://idp.example.com/issuer1
privacera.jwt.0.token.userKey=sub
privacera.jwt.0.token.groupKey=scope
privacera.jwt.0.token.parserType=PING_IDENTITY
privacera.jwt.0.token.audience=privacera.dataserver
privacera.jwt.0.token.publickey.provider.url=http://JWKS_HOST_IP:9090/get_public_key?kid=
privacera.jwt.0.token.publickey.provider.auth.type=BASIC
privacera.jwt.0.token.publickey.provider.auth.username=admin
privacera.jwt.0.token.publickey.provider.auth.password=Welcome@123
privacera.jwt.0.token.provider.response.key=keys
privacera.jwt.0.token.provider.key.id=kid

privacera.jwt.1.token.issuer=https://idp.example.com/issuer2
privacera.jwt.1.token.userKey=sub
privacera.jwt.1.token.groupKey=scope
privacera.jwt.1.token.parserType=PING_IDENTITY
privacera.jwt.1.token.audience=privacera.dataserver
privacera.jwt.1.token.publickey.provider.url=http://JWKS_HOST_IP:9090/get_public_key?kid=
privacera.jwt.1.token.publickey.provider.auth.type=BASIC
privacera.jwt.1.token.publickey.provider.auth.username=admin
privacera.jwt.1.token.publickey.provider.auth.password=Welcome@123
privacera.jwt.1.token.provider.response.key=keys
privacera.jwt.1.token.provider.key.id=kid

After you run Privacera Manager and Privacera Dataserver restarts, you can use the Dataserver test endpoint described above. You can then test the OLAC plugin. Similarly, upload the newly generated FGAC plugin configuration and use it to test the FGAC plugin.
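
To make the relationship between the dynamic settings concrete, the sketch below shows one way a client could resolve a key from them: read the key ID from the JWT header field named by `jwtTokenProviderKeyId`, append it to `pubKeyProviderEndpoint`, and read the field named by `pubKeyProviderJsonResponseKey` from the response. This flow is an assumption for illustration only and is not Privacera's implementation. `key_url` and `extract_key` are hypothetical names.

```python
import base64
import json
import urllib.parse


def key_url(endpoint, token, key_id_field="kid"):
    """Build the key lookup URL by appending the token's key ID to the endpoint."""
    header_segment = token.split(".")[0]
    padded = header_segment + "=" * (-len(header_segment) % 4)
    header = json.loads(base64.urlsafe_b64decode(padded))
    # URL-encode the key ID so that unusual characters do not break the query string.
    return endpoint + urllib.parse.quote(header[key_id_field], safe="")


def extract_key(response_body, response_key="x5c"):
    """Pick the public key out of the provider response (first entry if it is a list)."""
    value = response_body[response_key]
    return value[0] if isinstance(value, list) else value
```

With the example endpoint `http://JWKS_HOST_IP:9090/get_public_key?kid=` and a token whose header contains `"kid": "kid_1"`, `key_url` yields the same URL that the curl test for `kid_1` uses.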
