JWT Token User Identity¶
Overview¶
This feature allows the use of JWT tokens to carry the user identity information required by Privacera to enforce access control. This works for certain connectors or use-cases where the data source may not be able to pass the user identity reliably to Privacera.
Connectors¶
The following connectors support the use of JWT tokens to carry user identity information:
- OLAC connectors
    - AWS EMR (on EC2) Spark OLAC connector without Kerberos - JWT token user identity is the only supported way to enforce access control in a non-Kerberos EMR (on EC2) cluster.
    - AWS EMR-Serverless Spark OLAC connector without Lake Formation - JWT token user identity allows you to use Privacera for access control in a non-Lake Formation EMR-Serverless cluster without using IAM roles for user identity.
    - Databricks Standard Cluster OLAC connector - JWT token user identity is an additional way to pass user identity to Privacera for access control if you don't want to use the logged-in user identity.
    - Apache Spark on EKS OLAC connector - JWT token user identity is the only supported way to enforce access control in an Apache Spark on EKS cluster.
- FGAC connectors
    - Databricks High Concurrency Cluster FGAC connector - JWT token user identity is an additional way to pass user identity to Privacera for access control if you don't want to use the logged-in user identity.
Supported Deployments¶
- PrivaceraCloud
- Self Managed Deployment
- PrivaceraCloud Data-plane Deployment
Prerequisites¶
Your identity provider (IdP) must be able to generate JWT tokens. The JWT token is signed by your IdP and contains the user identity information. The user must be configured in Privacera with the same username. The public key of the IdP is used to validate the JWT token. It is either configured statically in Privacera or fetched dynamically from a JWKS endpoint that is configured in Privacera.
For the OLAC use-case, Privacera Dataserver must be configured and running. The additional configuration for validating JWT tokens is added to it.
Sample Flow for OLAC¶
sequenceDiagram
participant User
participant IdentityProvider
participant ComputeEnv as Compute Env + Privacera OLAC Plugin
participant PrivaceraDataServer
participant CloudStorage
User->>IdentityProvider: 1. Request JWT token
IdentityProvider-->>User: 2. Provide JWT token
User->>ComputeEnv: 3. Pass JWT token
ComputeEnv->>PrivaceraDataServer: 4. Send JWT token
PrivaceraDataServer->>PrivaceraDataServer: 5. Validate JWT token using static key or key from JWKS endpoint
PrivaceraDataServer->>PrivaceraDataServer: 6. Generate Signed URL/STS token
PrivaceraDataServer-->>ComputeEnv: 7. Provide Signed URL/STS token
ComputeEnv->>CloudStorage: 8. Access data using Signed URL/STS token
CloudStorage-->>ComputeEnv: 9. Data retrieved
Diagram Explanation
- Request JWT Token: The user requests a JWT token from the Identity Provider (IdP).
- Provide JWT Token: The IdP provides the JWT token to the user.
- Pass JWT Token: The user passes the JWT token to the compute environment.
- Send JWT Token: The compute environment sends the JWT token to Privacera DataServer.
- Validate JWT Token: Privacera DataServer validates the JWT token signature using the IdP's public key, which is either statically configured or obtained dynamically from the IdP's JWKS endpoint.
- Generate Signed URL/STS Token: Privacera DataServer generates a Signed URL or STS token.
- Provide Signed URL/STS Token: Privacera DataServer provides the Signed URL or STS token to the compute environment.
- Access Data: The compute environment accesses data from cloud storage using the Signed URL or STS token.
- Data Retrieved: The data is retrieved from the cloud storage and provided to the compute environment.
Sample Flow for FGAC¶
sequenceDiagram
participant User
participant IdentityProvider
participant ComputeEnv as Compute Env + Privacera FGAC Plugin
participant CloudStorage
User->>IdentityProvider: 1. Request JWT token
IdentityProvider-->>User: 2. Provide JWT token
User->>ComputeEnv: 3. Pass JWT token
ComputeEnv->>ComputeEnv: 4. Privacera FGAC plugin validates JWT token using static key or key from JWKS endpoint
ComputeEnv->>ComputeEnv: 5. Privacera FGAC plugin uses identity to enforce access control
ComputeEnv->>CloudStorage: 6. Access data using Compute Env native permissions (IAM role)
CloudStorage-->>ComputeEnv: 7. Data retrieved
Diagram Explanation
- Request JWT Token: The user requests a JWT token from the Identity Provider (IdP).
- Provide JWT Token: The IdP provides the JWT token to the user.
- Pass JWT Token: The user passes the JWT token to the compute environment.
- Validate JWT Token: Privacera FGAC plugin validates the JWT token signature using the IdP's public key, which is either statically configured or obtained dynamically from the IdP's JWKS endpoint.
- Enforce Access Control: Privacera FGAC plugin uses the user identity to enforce access control.
- Access Data: The compute environment accesses data from cloud storage using the compute environment's native permissions (IAM role).
- Data Retrieved: The data is retrieved from the cloud storage and provided to the compute environment.
Concepts¶
JWT Token Format¶
A JSON Web Token (JWT) consists of three Base64URL-encoded strings separated by dots (.). The three parts are the header, the payload, and the signature. The header and payload are JSON objects, and the signature is computed over the header and payload using the issuer's signing key. The signature is used to confirm the identity of the issuer and the integrity of the JWT token.
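The three-part structure can be inspected with a few lines of Python. This is a minimal sketch that only decodes the header and payload; it does not verify the signature, and the function names are illustrative rather than part of Privacera.

```python
import base64
import json
from typing import Tuple


def b64url_decode(segment: str) -> bytes:
    """Decode a Base64URL segment, restoring the padding that JWTs omit."""
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def split_jwt(token: str) -> Tuple[dict, dict, bytes]:
    """Return (header, payload, signature) WITHOUT verifying the signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("a JWT must have exactly three dot-separated parts")
    header_b64, payload_b64, signature_b64 = parts
    header = json.loads(b64url_decode(header_b64))
    payload = json.loads(b64url_decode(payload_b64))
    return header, payload, b64url_decode(signature_b64)
```

Decoding a token this way is handy for debugging which claims an IdP actually emits, but the decoded values must not be trusted until the signature has been validated.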
The header contains the algorithm used to sign the JWT token. An example JWT header JSON is shown below. All the values are examples and should not be used as is.
The fields in the header are as follows:
- The alg field is the algorithm used to sign the JWT token. Privacera supports only RSA256 and ECDSA256 algorithms for JWT token signature, which correspond to RS256 and ES256 as values of this field.
- The typ field is the type of the token. This is a literal value and is always JWT.
- The kid field is the key ID of the key used to sign the JWT token. This is an optional field. It is present if a JWKS endpoint is used to fetch the public key.
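The header rules can be expressed as a small check. The sketch below is illustrative only: the header values (including the kid) are made-up placeholders, and check_header is a hypothetical helper, not a Privacera API.

```python
# Algorithms the text above lists as supported for the alg field.
SUPPORTED_ALGORITHMS = {"RS256", "ES256"}

# Illustrative header; the kid value is a made-up placeholder.
EXAMPLE_HEADER = {"alg": "RS256", "typ": "JWT", "kid": "example-key-1"}


def check_header(header: dict) -> None:
    """Raise ValueError if the header breaks the rules described above."""
    if header.get("alg") not in SUPPORTED_ALGORITHMS:
        raise ValueError(f"unsupported alg: {header.get('alg')!r}")
    if header.get("typ") != "JWT":
        raise ValueError("typ must be the literal value 'JWT'")
    kid = header.get("kid")  # optional; expected when a JWKS endpoint is used
    if kid is not None and not isinstance(kid, str):
        raise ValueError("kid, when present, must be a string")
```

Checking alg against an explicit allow-list also rejects tokens that declare a symmetric algorithm such as HS256 or no algorithm at all.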
The payload contains the claims. An example JWT payload JSON is shown below. All the values are examples and should not be used as is.
- The iss field is the issuer of the JWT token. This value is configured in Privacera so that it can be used to obtain the configuration for validating the JWT token. This is a mandatory field. Typically, it is in the format of a URL, but it is treated as a literal value and no connection attempt is made to this URL.
- The sub field is the subject of the JWT token. This is the user identity that Privacera should use to enforce access control. This is a mandatory field. You can configure another key in the payload to be used as the user identity.
- The iat field is the issued-at time of the JWT token. This is the time when the token was issued, in Unix time. This is a mandatory field. The token is rejected if the current time is before this time.
- The exp field is the expiration time of the JWT token, in Unix time. This is a mandatory field. The token is rejected if the current time is after this time.
- The aud field is the audience of the JWT token. This is the intended recipient of the token, which is Privacera Dataserver. This is a string that is configured in Privacera, and Privacera uses the token only if it matches. This is an optional field.
- The scope field is used to carry an additional list of groups. This is an optional field. You can configure another key in the payload to be used as the group list. The groups can be either space separated or comma separated. These groups can be used to override the user's groups that are configured in Privacera or to add additional groups.
All other fields in the payload will be ignored by Privacera.
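The claim rules above can be sketched as a single validation function. This is an illustration under stated assumptions, not Privacera's implementation: the function name, its parameters, and the choice to check aud only when an audience is configured are all hypothetical.

```python
import re
import time


def validate_claims(payload, expected_issuer, expected_audience=None,
                    user_key="sub", group_key="scope", now=None):
    """Return (username, groups) or raise ValueError if a check fails."""
    now = time.time() if now is None else now

    # iss, the user identity key, iat, and exp are mandatory.
    for claim in ("iss", user_key, "iat", "exp"):
        if claim not in payload:
            raise ValueError(f"missing mandatory claim: {claim}")
    if payload["iss"] != expected_issuer:
        raise ValueError("issuer does not match the configured value")
    if now < payload["iat"]:
        raise ValueError("current time is before the issued-at time")
    if now > payload["exp"]:
        raise ValueError("token has expired")

    # Assumption: the audience is checked only when one is configured.
    if expected_audience is not None and payload.get("aud") != expected_audience:
        raise ValueError("audience does not match the configured value")

    # Groups may be space separated or comma separated.
    raw_groups = payload.get(group_key, "")
    groups = [g for g in re.split(r"[,\s]+", raw_groups) if g]
    return payload[user_key], groups
```

The user_key and group_key parameters mirror the statement that another payload key can be configured for the user identity and the group list; any other claims in the payload are simply ignored.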
Token Duration
For OLAC jobs, the token duration can be short as it is used only during the startup of the job to pass the identity to the Privacera Dataserver. For FGAC jobs, the token duration should be long enough to cover the duration of the job.
JWT Signature Verification¶
JWT Signature verification is done using the public key of the IdP. The public key can be configured statically in Privacera or dynamically fetched from the IdP's JWKS endpoint.
Privacera supports only RSA256 and ECDSA256 algorithms for JWT token signature.
In case of dynamic public key configuration, the public key is fetched from the IdP's JWKS (JSON Web Key Set) endpoint using the kid field in the JWT header. The JWKS service returns a set of keys containing public keys used to verify the JWT token. The endpoint could return a set of keys or one specific key given the kid field in the JWT header.
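Selecting a key by kid can be sketched as follows. The helper names are hypothetical, and the sketch assumes the common JWKS layout in which the keys sit under a top-level "keys" array; the URL passed to fetch_jwks would be whatever endpoint your IdP exposes.

```python
import json
import urllib.request


def fetch_jwks(jwks_url: str, timeout: float = 10.0) -> dict:
    """Download and parse a JWKS document from the IdP's endpoint."""
    with urllib.request.urlopen(jwks_url, timeout=timeout) as response:
        return json.load(response)


def find_key(jwks: dict, kid: str) -> dict:
    """Return the JWK whose kid matches the kid from the JWT header."""
    for key in jwks.get("keys", []):
        if key.get("kid") == kid:
            return key
    raise KeyError(f"no key with kid {kid!r} in JWKS")
```

Failing loudly when no key matches is deliberate in this sketch: a token signed with an unknown key should be rejected rather than validated against some other key in the set.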
Here is an example of JWKS with RSA JWK returned by the JWKS service -
- alg - Algorithm used to sign the JWT token. It is either RS256 or ES256.
- kty - Key type. It is either RSA or EC.
- use - Use of the key. It is either sig or enc.
- x5c - X.509 certificate chain. It is an array of base64 encoded X.509 certificates.
- n - RSA modulus. It is a base64url encoded string.
- e - RSA exponent. It is a base64url encoded string.
- kid - Key ID. It is a string identifier.
- x5t#S256 - X.509 certificate SHA-256 thumbprint. It is a base64url encoded string.
- exp - Expiration time of the key. It is a Unix time.
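To make the n and e fields concrete, the sketch below converts them from their base64url form into the integers that make up an RSA public key. The function names are illustrative, and turning the integers into a key object for a specific crypto library is left out.

```python
import base64
from typing import Tuple


def b64url_to_int(value: str) -> int:
    """Decode a base64url big-endian unsigned integer (as used for n and e)."""
    padded = value + "=" * (-len(value) % 4)
    return int.from_bytes(base64.urlsafe_b64decode(padded), "big")


def rsa_public_numbers(jwk: dict) -> Tuple[int, int]:
    """Return (modulus, exponent) from an RSA JWK."""
    if jwk.get("kty") != "RSA":
        raise ValueError("not an RSA key")
    return b64url_to_int(jwk["n"]), b64url_to_int(jwk["e"])
```

For example, the very common exponent value "AQAB" decodes to 65537.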
Here is an example of JWKS with ECDSA JWK returned by the JWKS service -
- kty - Key type. It is either RSA or EC.
- crv - Curve used for the key. It is a string.
- x - X coordinate of the key. It is a base64url encoded string.
- y - Y coordinate of the key. It is a base64url encoded string.
- kid - Key ID. It is a string identifier.
- exp - Expiration time of the key. It is a Unix time.
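The EC fields can be read in a similar way. The sketch below assumes the key is meant for ES256, which uses the P-256 curve with 32-byte coordinates; the function name is hypothetical, and the sketch does not check that the point actually lies on the curve.

```python
import base64
from typing import Tuple


def b64url_decode(segment: str) -> bytes:
    """Decode a base64url string, restoring any omitted padding."""
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def ec_public_point(jwk: dict) -> Tuple[int, int]:
    """Return the (x, y) coordinates of a P-256 public key from an EC JWK."""
    if jwk.get("kty") != "EC":
        raise ValueError("not an EC key")
    if jwk.get("crv") != "P-256":
        raise ValueError("ES256 requires the P-256 curve")
    x, y = b64url_decode(jwk["x"]), b64url_decode(jwk["y"])
    # P-256 coordinates are fixed-width 32-byte values.
    if len(x) != 32 or len(y) != 32:
        raise ValueError("P-256 coordinates must be 32 bytes each")
    # Note: this sketch does not verify that (x, y) is on the curve.
    return int.from_bytes(x, "big"), int.from_bytes(y, "big")
```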
The Privacera Dataserver obtains the key from the JWKS service endpoint when a JWT token with a key ID is received. The key is then cached until it expires.
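The caching behavior can be pictured with a small expiring cache keyed by kid. This is a sketch of the general technique, not the Dataserver's actual code: the class, the fetch_key callback, and the fallback lifetime for keys that carry no exp value are all assumptions.

```python
import time

# Assumption: lifetime in seconds for keys that carry no exp value.
DEFAULT_TTL_SECONDS = 3600


class KeyCache:
    """Cache JWKs by kid and re-fetch them once they have expired."""

    def __init__(self, fetch_key, clock=time.time):
        self._fetch_key = fetch_key  # callable: kid -> JWK dict
        self._clock = clock
        self._entries = {}           # kid -> (jwk, expires_at)

    def get(self, kid: str) -> dict:
        entry = self._entries.get(kid)
        if entry is not None and self._clock() < entry[1]:
            return entry[0]          # still valid, no network call
        jwk = self._fetch_key(kid)
        expires_at = jwk.get("exp", self._clock() + DEFAULT_TTL_SECONDS)
        self._entries[kid] = (jwk, expires_at)
        return jwk
```

Passing the clock in as a parameter keeps the cache easy to test, since expiry can be simulated without waiting.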
Using JWT Token User Identity Feature in Privacera¶
To use this feature you need to do the following:
- For OLAC supported connectors
    - Configure Privacera Dataserver to use JWT tokens
    - Configure EMR, Databricks or Apache Spark plugin to use JWT token
    - At runtime, generate JWT token and pass it to the Spark job
- For FGAC supported connectors
    - Configure the Databricks Spark plugin to use JWT token
    - At runtime, generate JWT token and pass it to the Spark job