Privacera Spark Plug-in versus Open-source Spark Plug-in#
The following table shows how the Privacera Spark plug-in is better optimized than the open-source Spark plug-in for fine-grained and object-level access control in a Kubernetes environment.
| | Privacera Spark Plug-in | Open-source Spark Plug-in |
|---|---|---|
| Fine-grained access control (FGAC) | | |
| Object-level access control (OLAC) | | |
| Audits | | |
| Support | | |
Fine-grained access control#
With fine-grained access control, the Privacera Spark plug-in enforces access control at both the file level and the table/column level. Fine-grained access control assumes that the EKS cluster nodes have an IAM role that can access S3 objects. Policies set up in Privacera Cloud at the file level or table level control access. As long as requests are made within the Spark context, the Spark plug-in enforces these policies.
Limitations#
Although the plug-in enforces access control on all Spark-related jobs, it cannot do so in the following cases:
- S3 requests made from outside the Spark context.
- Because the IAM role must be granted to the EKS nodes, unauthorized users can bypass Ranger security and access S3 and other AWS resources by using the Python Boto library or custom JAR libraries.
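To make the second limitation concrete, here is a minimal sketch of a direct S3 read that never passes through the Spark plug-in. It assumes the third-party `boto3` package is installed in the pod and that its default credential chain resolves the IAM role of the EKS node. The function name, bucket, and key are hypothetical and used only for illustration.

```python
def read_object_directly(bucket, key, client=None):
    """Read an S3 object without going through Spark.

    Because this call does not run in the Spark context, the Spark
    plug-in never sees the request and cannot apply its policies.
    """
    if client is None:
        # Third-party dependency, imported lazily. With no explicit
        # credentials, boto3 falls back to its default credential chain,
        # which on an EKS node can include the node's IAM role.
        import boto3
        client = boto3.client("s3")

    response = client.get_object(Bucket=bucket, Key=key)
    return response["Body"].read()


if __name__ == "__main__":
    # Hypothetical bucket and key, for illustration only.
    data = read_object_directly("example-bucket", "sales/part-0000.parquet")
    print(len(data), "bytes read outside the Spark context")
```

Whether such a call succeeds depends only on the permissions of the node's IAM role, not on any policy defined in Privacera.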
Object-level access control#
Object-level access control enforces access control only on the files/objects in S3, whether they are accessed through Spark jobs or outside of Spark jobs. It requires a Data access server and S3 setup on EKS. OLAC is supported only with a Data access server on Privacera Platform (on-prem). The Data access server uses signed URLs to provide access to S3 objects. OLAC also supports access control for requests made outside the Spark context, for example through the Python Boto library or third-party custom libraries.
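The signed URL concept can be illustrated with a simplified sketch. This is not the Data access server's implementation and not the AWS signing algorithm; it only shows the general idea that a trusted server signs an object key and an expiry time, so that a URL is honored only if it is untampered and unexpired. The secret, base URL, and function names are assumptions made for this example.

```python
import hashlib
import hmac
import time
from urllib.parse import parse_qs, urlencode, urlparse

# Hypothetical secret known only to the signing server.
SECRET_KEY = b"demo-secret"


def _signature(object_key, expires_at, secret):
    """HMAC-SHA256 over the object key and the expiry timestamp."""
    message = f"{object_key}\n{expires_at}".encode()
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def sign_url(base_url, object_key, expires_at, secret=SECRET_KEY):
    """Return a URL that grants access to object_key until expires_at."""
    query = urlencode({
        "expires": expires_at,
        "signature": _signature(object_key, expires_at, secret),
    })
    return f"{base_url}/{object_key}?{query}"


def verify_url(url, now=None, secret=SECRET_KEY):
    """Accept the URL only if its signature matches and it has not expired."""
    now = time.time() if now is None else now
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    try:
        expires_at = int(params["expires"][0])
        signature = params["signature"][0]
    except (KeyError, ValueError):
        return False
    object_key = parsed.path.lstrip("/")
    expected = _signature(object_key, expires_at, secret)
    matches = hmac.compare_digest(signature.encode(), expected.encode())
    return matches and now < expires_at


if __name__ == "__main__":
    url = sign_url("https://storage.example.com", "sales/part-0000.parquet",
                   expires_at=int(time.time()) + 300)
    print(url)
    print("valid:", verify_url(url))
```

In this model the client never holds storage credentials. It holds only a short-lived URL for one object, which is why requests from outside Spark can be subject to the same access decision.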