- Platform Release 6.5
- Privacera Platform Installation
- Privacera Platform User Guide
- Privacera Discovery User Guide
- Privacera Encryption Guide
- Privacera Access Management User Guide
- AWS User Guide
- Overview of Privacera on AWS
- Configure policies for AWS services
- Using Athena with data access server
- Using DynamoDB with data access server
- Databricks access manager policy
- Accessing Kinesis with data access server
- Accessing Firehose with Data Access Server
- EMR user guide
- AWS S3 bucket encryption
- Getting started with Minio
- Plugins
- How to Get Support
- Coordinated Vulnerability Disclosure (CVD) Program of Privacera
- Shared Security Model
- Privacera Platform documentation changelog
EMR user guide
Get started
Create bucket ${SECURE_BUCKET_NAME} which you need to protect.
Download sample data from the following link and add it to your bucket at location (s3://${SECURE_BUCKET_NAME}/sample_data/customer_data)
wget https://privacera-demo.s3.amazonaws.com/data/uploads/customer_data_clear/customer_data_without_header.csv
Make sure cluster should not have direct access on ${SECURE_BUCKET_NAME} bucket.
To verify, run the following commands:
ssh -i ${KEY_FILE} hadoop@${EMR_PUBLIC_DNS} aws s3 ls s3://${SECURE_BUCKET_NAME}
Result: Fatal error: An error occurred (403) when calling the HeadObject operation: Forbidden
Hive
Run the below in beeline, using an admin user who has permission on url in Ranger and also has permission to create tables and databases:
beeline -u "jdbc:hive2://`hostname -f`:10000/default;principal=hive/`hostname -f`@${REALM}"
Create the table using this user, by running the following command in Hive.
create database if not exists customer; use customer; CREATE EXTERNAL TABLE if not exists `customer_data_s3`( `id` string, `global_id` string, `name` string, `ssn` string, `email_address` string, `address` string) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' LOCATION 's3a://${SECURE_BUCKET_NAME}/sample_data/customer_data';
Exit from beeline.
Switch to ${TEST_USER} and kinit and try sample policy.
beeline -u "jdbc:hive2://`hostname -f`:10000/default;principal=hive/`hostname -f`@${REALM}" #Check ranger audit for hive service Select * from customer.customer_data_s3 LIMIT 10;
Data_Admin access
Prerequisites
To create a view using the Hive Plug-In, you need the DATA_ADMIN permission in Ranger.
The source table on which you are going to create a view requires the DATA_ADMIN Ranger policy.
Use case
This use case starts with an 'employee_db' database containing two tables with the following data:
#Requires create privilege on the database enabled by default; create database if not exists employee_db;
Create two tables.
#Requires create privilege on the table level; create table if not exists employee_db.employee_data(id int,userid string,country string); create table if not exists employee_db.country_region(country string,region string);
Insert test data:
#Requires update privilege on the table level; insert into employee_db.country_region values ('US','NA'), ('CA','NA'), ('UK','UK'), ('DE','EU'), (','EU'); insert into employee_db.employee_data values (1,'james','US'),(2,'john','US'), (3,'mark','UK'), (4,'sally-sales','UK'),(5,'sally','DE'), (6,'emily','FR'DE');
Create the following policy:
Policy Type: Access Policy
Name: Create View Access
Database: employee_db
Table: employee_data, country_region
Column: *
Select User: emily
Permissions: select, update, Create
Run queries on the data:
SELECT * FROM employee_db.country_region; #Requires select privilege on the column level; SELECT * FROM employee_db.employee_data; #Requires select privilege on the column level;
Create the following view:
createviewemployee_db.employee_region(userid,region)asselecte.userid,cr.regionfromemployee_db.employee_datae,employee_db.country_regioncrwheree.country=cr.country;
Note
Granting Data_admin privileges on the resource implicitly grants Select privilege on the same resource as well.
Run the same queries as in Step 3. You will see an error message like the following:
Error: Error while compiling statement: FAILED: HiveAccessControlException Permission denied: user [emily] does not have [DATA_ADMIN] privilege on [employee_db/employee_data](state=42000,code=40000)
Now create the following policy.
Policy Type: Access Policy
Name: Create View Access
Database: employee_db
Table: employee_region
Column: *
Select Group: group_privacera_dev
Permissions: select, Create
Execute the queries from Step 3. They should execute properly.
Alter view
create view if not exists employee_db.employee_region(userid,region) as select e.userid, cr.region from employee_db.employee_data e, employee_db.country_region cr where e.country = cr.country;
#Requires Create permission on the view; ALTER VIEW employee_db.employee_region AS select e.userid, cr.region from employee_db.employee_data e,employee_db.country_region cr where e.country=cr.country;
Rename view
#Requires Alter permission on the view; ALTER VIEW employee_db.employee_region RENAME to employee_db.employee_region_renamed;
Drop view
#Requires Drop permission on the view; DROP VIEW employee_db.employee_region_renamed;
Row level filter
SELECT * FROM employee_db.employee_region;
Column masking
SELECT * FROM employee_db.employee_region;
PrestoDB
SSH to EMR on master node.
Start Presto shell (presto, spark-thrift, hive all three using same metastore) by entering one of the following commands:
presto-cli --catalog hive
/usr/lib/presto/bin/presto-cli-0.210-executable --server localhost:8889 --catalog hive --schema default
Attempt the following use case as a test.
CREATE SCHEMA customer WITH (location='s3a://${SECURE_BUCKETNAME}/presto_data/customer/'); USEcustomer; CREATE TABLE cust_data( EMP_SSNvarchar, CC varchar, FIRST_NAME varchar, LAST_NAME varchar, ADDRESS varchar, ZIPCODE varchar, EMAIL varchar, US_PHONE_FORMATTED varchar); INSERT INTO cust_data values ('12345','6789','Will','Smith','US','400098','ws@gmail.com','010-564-333'); SELECT * FROM cust_data;
Full Table Access.
#Add policy in ranger to access everything in the table SELECT * FROM cust_data;
Restricted Column Access.
#Column level permission in table. If User doesn't have permission to “first_name” column #Will be denied in audit select first_name from cust_data; #Will be allowed in audit select last_name, address, zipcode, email from cust_data;
Enable additional operations on Hive catalog by updating
hive.properties
. By default, PrestoDB blocks the operations. For more information, see Hive Security Configuration.Edit
hive.properties
.sudo vi /etc/presto/conf/catalog/hive.properties
Add the following properties:
connector.name=hive-hadoop2 hive.allow-drop-table=true hive.allow-rename-table=true hive.allow-add-column=true hive.allow-rename-column=true hive.allow-drop-column=true hive.config.resources=/etc/hadoop/conf/core-site.xml,/etc/hadoop/conf/hdfs-site.xml hive.s3-file-system-type=EMRFS hive.hdfs.impersonation.enabled=false
Note
The
hive.properties
file needs to be updated on all the EMR nodes.Restart Presto.
sudo systemctl restart presto-server
Alternatively, you can include the properties while creating the EMR itself in the CloudFormation Template. Below is the sample JSON:
{"Classification":"presto-connector-hive","ConfigurationProperties":{"hive.metastore":"glue","hive.allow-drop-table":"true","hive.allow-add-column":"true","hive.allow-rename-column":"true","connector.name":"hive-hadoop2","hive.config.resources":"/etc/hadoop/conf/core-site.xml,/etc/hadoop/conf/hdfs-site.xml","hive.s3-file-system-type":"EMRFS","hive.hdfs.impersonation.enabled":"false","hive.allow-drop-column":"true","hive.allow-rename-table":"true"}}
PrestoSQL
Start PrestoSQL shell.
presto-cli --catalog hive
Create the schema with an admin/superuser.
CREATE SCHEMA customer WITH (location='s3a://${SECURE_BUCKETNAME}/presto_data/schema/customer’); USE customer;
Create the table using admin/superuser.
USE customer; CREATE TABLE customer_data( id varchar, name varchar, ssn varchar, email_address varchar, address varchar) WITH( format='textfile', external_location='s3a://${SECURE_BUCKETNAME}/presto_data/table/customer_data' );
Exit the Presto-CLI and switch to
{TEST_USER}
, thenkinit
and try a sample policy.presto-cli --catalog hive use customer; SELECT * FROM customer_data LIMIT 10;
Data_Admin Access
Prerequisites
To create a view using the Presto SQL Plug-In, you need the DATA_ADMIN permission in Ranger.
The source table on which you are going to create a view requires the DATA_ADMIN Ranger policy.
Use Case
Create a database called employee_db
with two tables containing the following data:
#Requires create privilege on the database enabled by default; create schema if not exists employee_db;
In Privacera Portal select Access Management, then from the list of resource policy groups select privacera_hive which is under SQL.
Add a new policy:
Policy Type: Access
Policy Name: Employee Schema Create Permission
Database: employee_db
Table: *
Column: *
Select User: presto
Permissions: Create
Click SAVE.
Create two tables.
#Requires create privilege on the table level; CREATE TABLE IF NOT EXISTS employee_db.employee_data(id int, userid string, country string); CREATE TABLE IF NOT EXISTS employee_db.country_region(country string, region string);
In Privacera Portal, create the following policy:
Policy Type: Access
Policy Name: Employee Table Create Permission
Database:
employee_db
Table:
employee_data
,country_region
Column:
*
Select User:
presto
Permissions:
Create
Insert test data.
#Requires update privilege on the table level; insert into employee_db.country_region values ('US','NA'), ('CA','NA'), ('UK','UK'), ('DE','EU'), ('FR','EU'); insert into employee_db.employee_data values (1,'james','US'),(2,'john','US'), (3,'mark','UK'), (4,'sally-sales','UK'),(5,'sally','DE'), (6,'emily','DE');
In Privacera Portal, create the following policy:
Policy Type: Access
Policy Name: Employees Table Update Permission
Database:
employee_db
Table:
employee_data
,country_region
Column:
*
Select User:
presto
Permissions:
update
,Create
Create a view for the previoustwo tables created; you will get an error:
Query 20210223_051227_00005_nyxtw failed: Access Denied: Cannot create view tbl_view_5
You need
Create View
permission.Create the following policy:
Policy Type: Access
Policy Name: Employees Create View Permission
Database:
employee_db
Table:
tbl_view_1
Select User:
presto
Permissions:
Create
After granting
Create View
permission, the query will result in the following error message:Query 20210223_050930_00004_nyxtw failed: Access Denied: User [emily] does not have [DATA_ADMIN] privilege on [hive/employee_db/employee_data]
You need to grant
Data_Admin
permission for both tables.Create the following policy:
Policy Type: Access
Policy Name: Employees Create View Permission Data_admin
Database:
employee_db
Table:
employee_data
,country_region
Column:
*
Select User:
presto
Permissions:
update
,Create
,Data_admin
Note
Granting Data_admin privileges on the resource implicitly grants Select privilege on the same resource as well.
Run the query again. It should be successful.
Alter view
Create view
presto:customer> create view tbl_view_1 as SELECT * FROM tbl_1; CREATE VIEW presto:customer> SELECT * FROM tbl_view_1; c0 | c1 | c2 | c3 | c4 ----+--------+-----------+-----------------------+------------------------ 2 | James | 892821225 | james@walt.com | 4578 Extension xxx 1 | Dennis | 619821225 | thomasashley@walt.com | 9478 Anthony Extension 3 | Sally | 092341225 | sally@walt.com | 5678 Extension xyxx (3 rows) Query 20210303_142252_00006_g76nu, FINISHED, 1 node Splits: 19 total, 19 done (100.00%) 1.86 [3 rows, 169B] [1 rows/s, 91B/s]
Alter view
presto:customer> CREATE OR REPLACE VIEW tbl_view_1 as SELECT * FROM tbl_3; CREATE VIEW presto:customer> SELECT * FROM tbl_view_1; slno | name | mobile | email | address ------+------+--------+---------+--------- 1 | emily | 1234 | s@s.com | in (1 row) Query 20210303_142341_00009_g76nu, FINISHED, 1 node Splits: 17 total, 17 done (100.00%) 0.91 [1 rows, 0B] [1 rows/s, 0B/s]
Rename view
presto:customer> alter view tbl_view_1 rename to tbl_view_2; RENAME VIEW presto:customer>
Drop view
presto:customer> drop view tbl_view_1; DROP VIEW presto:customer>
Row level filter
presto:employee_db> SELECT * FROM tbl_1; id | userid | country ----+-------------+--------- 1 | james | US 2 | john | US 3 | mark | UK 4 | sally-sales | UK 5 | sally | DE 6 | emily | DE (6 rows) Query 20210309_060602_00022_5amn7, FINISHED, 1 node Splits: 17 total, 17 done (100.00%) 4.11 [6 rows, 0B] [1 rows/s, 0B/s]
In Privacera Portal, set the Policy Detail:
Policy Type: Row Level Filter
Policy Name: Employee Row Level Filter
Hive Database:
employee_db
Hive Table:
employee_data
,country_region
,tbl_1
Column:
*
Under Row Level Conditions:
Select User:
presto
Permissions:
select
Row Level Filter:
country='US'
presto:employee_db> presto:employee_db> SELECT * FROM tbl_1; id | userid | country ----+--------+--------- 1 | james | US 2 | john | US (2 rows) Query 20210309_061202_00024_5amn7, FINISHED, 1 node Splits: 17 total, 17 done (100.00%) 0.45 [6 rows, 0B] [13 rows/s, 0B/s]
Column masking
presto:employee_db> SELECT * FROM tbl_1; id | userid | country ----+-------------+--------- 1 | james | US 2 | john | US 3 | mark | UK 4 | sally-sales | UK 5 | sally | DE 6 | emily | DE (6 rows) Query 20210309_062000_00027_5amn7, FINISHED, 1 node Splits: 17 total, 17 done (100.00%) 0.30 [6 rows, 0B] [20 rows/s, 0B/s]
In Privacera Portal, the Policy Detail:
Policy Type: Masking
Policy Name: Employee Column Level Masking
Hive Database:
employee_db
Hive Table:
employee_data
,country_region
Hive Column:
tbl_1
Under Masking Conditions:
Select User:
presto
Permissions:
select
Select Masking Option:
Nullify
presto:employee_db> presto:employee_db> SELECT * FROM tbl_1; id | userid | country ----+-------------+--------- 1 | james | NULL 2 | john | NULL 3 | mark | NULL 4 | sally-sales | NULL 5 | sally | NULL 6 | emily | NULL (6 rows) Query 20210309_061856_00026_5amn7, FINISHED, 1 node Splits: 17 total, 17 done (100.00%) 0.32 [6 rows, 0B] [18 rows/s, 0B/s]
Access views in AWS Athena
Use the following steps to provide access for views created in AWS Athena. As a result, you will be able to query the views.
Copy the Hive catalog properties (or create a symlink) as
awsdatacatalog.properties
in /etc/presto/conf/catalog folder.xln -s /etc/presto/conf/catalog/hive.properties /etc/presto/conf/catalog/awsdatacatalog.properties
Restart the Presto server.
sudo systemctl restart presto-server
In Access Management > Resource Policies, update the
privacera_hive
default policy.Edit
all - database, table
policy.In Select User, add 'Presto' from the dropdown as the default view owner, and save.
(Optional) To change the default view owner from 'Presto' to any other owner such as 'Hadoop':
In the
access-control.properties
file, add the owner to theranger.policy.authorization.viewowner.default
variable.vi /usr/lib/presto/etc/access-control.properties ranger.policy.authorization.viewowner.default=<view-owner>
Restart the Presto server.
sudo systemctl restart presto-server
Update the owner in the all - database, table policy of the privacera_hive service.
Configure Hive policy authentication
When the Privacera Plugin is deployed in your PrestoSQL server, the HIVE_POLICY_AUTHZ_ENABLED is set to true by default, allowing you to configure Hive policy authorization.
You can enable/disable the authorization in your PrestoSQL server. To configure, do the following:
Go to the Ranger PrestoSQL config folder.
cd /opt/privacera/plugin/ranger-x-x-x-x-presto-plugin
Run the following command:
vi install.properties
Add/Edit the following property. By default, the value is set to true.
HIVE_POLICY_AUTHZ_ENABLED=true
Run the following command:
./enable-presto-plugin.sh
Restart the PrestoSQL server.
sudo systemctl restart presto-server
Trino
Start Trino shell.
trino-cli --catalog hive
Create the schema using admin/superuser.
CREATE SCHEMA customer WITH (location = 's3a://${SECURE_BUCKETNAME}/trino_data/schema/customer’); use customer;
Create the table using admin/superuser
use customer; CREATE TABLE customer_data( id varchar, name varchar, ssn varchar, email_address varchar, address varchar) WITH ( format = 'textfile', external_location = 's3a://${SECURE_BUCKETNAME}/trino_data/table/customer_data' );
Exit from Trino-CLI and switch to {TEST_USER} and kinit and try sample policy.
trino-cli --catalog hive use customer; SELECT * FROM customer_data LIMIT 10;
Data_Admin access
Prerequisites
You need DATA_ADMIN permission in Ranger,
The source table requires the DATA_ADMIN Ranger policy.
Use case
You have the employee_db
database with two tables containing the following data:
#Requires create privilege on the database enabled by default; create schema if not exists employee_db;
In Privacera Portal select Access Management, then from the list of resource policy groups select privacera_hive which is under SQL. Then click +ADD NEW POLICY.
For the Policy Detail:
Policy Type: Access
Policy Name: Employees Schema Create Permission
Database:
employee_db
Table:
*
Column:
*
Under Allow Conditions:
Select User:
trino
Permissions:
Create
Click SAVE.
Create two tables.
#Requires create privilege on the table level; CREATE TABLE IF NOT EXISTS employee_db.employee_data(id int, userid string, country string); CREATE TABLE IF NOT EXISTS employee_db.country_region(country string, region string);
In Privacera Portal, create a policy with the following Policy Detail:
Policy Type: Access
Policy Name: Employee Table Create Permission
Database:
employee_db
Table:
employee_data
,country_region
Column:
*
Under Allow Conditions:
Select User:
trino
Permissions:
Create
Insert test data.
#Requires update privilege on the table level; insert into employee_db.country_region values ('US','NA'), ('CA','NA'), ('UK','UK'), ('DE','EU'), ('FR','EU'); insert into employee_db.employee_data values (1,'james','US'),(2,'john','US'), (3,'mark','UK'), (4,'sally-sales','UK'),(5,'sally','DE'), (6,'emily','DE');
In Privacera Portal, create a policy with the following Policy Detail:
Policy Type: Access
Policy Name: Employee Table Insert Permission
Database:
employee_db
Table:
employee_data
,country_region
Column:
*
Under Allow Conditions:
Select User:
trino
Permissions:
update
,Create
Create a view of above two tables created. You will get an ERROR like the following:
Query 20210223_051227_00005_nyxtw failed: Access Denied: Cannot create view tbl_view_5
You need
Create View
permission.In Privacera Portal, create a policy with the following Policy Detail:
Policy Type: Access
Policy Name: Employee Create View Permission
Database:
employee_db
Table:
tbl_view_1
Column:
*
Under Allow Conditions:
Select User:
trino
Permissions:
Create
After granting
Create View
permission, the query will return the following error message:Query 20210223_050930_00004_nyxtw failed: Access Denied: User [emily] does not have [DATA_ADMIN] privilege on [hive/employee_db/employee_data]
You need to grant ‘Data_Admin’ permission for both the tables as mentioned below and execute the create view query again.
In Privacera Portal, create a policy with the following Policy Detail:
Policy Type: Access
Policy Name: Employee Create View Permission - Data_admin
Database:
employee_db
Table:
employee_data
,country_region
Column:
*
Under Allow Conditions:
Select User:
trino
Permissions:
update
,Create
,Data_admin
Note
Granting Data_admin privileges on the resource implicitly grants Select privilege on the same resource as well.
Alter view
Create view
trino:customer> create view tbl_view_1 as SELECT * FROM tbl_1; CREATE VIEW trino:customer> SELECT * FROM tbl_view_1; c0 | c1 | c2 | c3 | c4 ----+--------+-----------+-----------------------+------------------------ 2 | James | 892821225 | james@walt.com | 4578 Extension xxx 1 | Dennis | 619821225 | thomasashley@walt.com | 9478 Anthony Extension 3 | Sally | 092341225 | sally@walt.com | 5678 Extension xyxx (3 rows) Query 20210303_142252_00006_g76nu, FINISHED, 1 node Splits: 19 total, 19 done (100.00%) 1.86 [3 rows, 169B] [1 rows/s, 91B/s]
Alter view
trino:customer> CREATE OR REPLACE VIEW tbl_view_1 as SELECT * FROM tbl_3; CREATE VIEW trino:customer> SELECT * FROM tbl_view_1; slno | name | mobile | email | address ------+------+--------+---------+--------- 1 | emily | 1234 | s@s.com | in (1 row) Query 20210303_142341_00009_g76nu, FINISHED, 1 node Splits: 17 total, 17 done (100.00%) 0.91 [1 rows, 0B] [1 rows/s, 0B/s]
Rename view
trino:customer> alter view tbl_view_1 rename to tbl_view_2; RENAME VIEW trino:customer>
Drop view
trino:customer> drop view tbl_view_1; DROP VIEW trino:customer>
Row level filter
trino:employee_db> SELECT * FROM tbl_1; id | userid | country ----+-------------+--------- 1 | james | US 2 | john | US 3 | mark | UK 4 | sally-sales | UK 5 | sally | DE 6 | emily | DE (6 rows) Query 20210309_060602_00022_5amn7, FINISHED, 1 node Splits: 17 total, 17 done (100.00%) 4.11 [6 rows, 0B] [1 rows/s, 0B/s]
In Privacera Portal, create a policy with the following Policy Detail:
Policy Type: Row Level Filter
Policy Name: Employee Row Level Filter by Country
Hive Database:
employee_db
Hive Table:
tbl_view_1
Under Row Level Conditions:
Select User:
trino
Permissions:
select
Row Level Filter:
country=US
trino:employee_db> trino:employee_db> SELECT * FROM tbl_1; id | userid | country ----+--------+--------- 1 | james | US 2 | john | US (2 rows) Query 20210309_061202_00024_5amn7, FINISHED, 1 node Splits: 17 total, 17 done (100.00%) 0.45 [6 rows, 0B] [13 rows/s, 0B/s]
Column masking
trino:employee_db> SELECT * FROM tbl_1; id | userid | country ----+-------------+--------- 1 | james | US 2 | john | US 3 | mark | UK 4 | sally-sales | UK 5 | sally | DE 6 | emily | DE (6 rows) Query 20210309_062000_00027_5amn7, FINISHED, 1 node Splits: 17 total, 17 done (100.00%) 0.30 [6 rows, 0B] [20 rows/s, 0B/s]
In Privacera Portal, create a policy with the following Policy Detail:
Policy Type: Masking
Policy Name: Employees Columns Masking Country
Hive Database:
employee_db
Hive Table:
tbl_view_1
Hive Column:
country
Under Masking Conditions:
Select User:
trino
Permissions:
select
Select Masking Option:
Nullify
trino:employee_db> trino:employee_db> SELECT * FROM tbl_1; id | userid | country ----+-------------+--------- 1 | james | NULL 2 | john | NULL 3 | mark | NULL 4 | sally-sales | NULL 5 | sally | NULL 6 | emily | NULL (6 rows) Query 20210309_061856_00026_5amn7, FINISHED, 1 node Splits: 17 total, 17 done (100.00%) 0.32 [6 rows, 0B] [18 rows/s, 0B/s]
Access views in AWS Athena
Use the following steps to provide access for views created in AWS Athena. As a result, you will be able to query the views.
Copy the Hive catalog properties (or create a symlink) as
awsdatacatalog.properties
in /etc/trino/conf/catalog folder.ln -s /etc/trino/conf/catalog/hive.properties /etc/trino/conf/catalog/awsdatacatalog.properties
Restart the Trino server.
sudo systemctl restart trino-server
In Access Management > Resource Policies, update the
privacera_hive
default policy.Edit
all - database, table
policy.In Select User, add 'Trino' from the dropdown as the default view owner, and save.
(Optional) To change the default view owner from 'Trino' to any other owner such as 'Hadoop', do the following:
In the
access-control.properties
file, add the owner to theranger.policy.authorization.viewowner.default
variable.vi/usr/lib/trino/etc/access-control.propertiesranger.policy.authorization.viewowner.default=<view-owner>
Restart the Trino server.
sudo systemctl restart trino-server
Accordingly, update the owner in the all - database, table policy of the privacera_hive service.
Hue
SSH to the master node.
Edit the
hue.ini
file.sudo vi /etc/hue/conf/hue.ini
For PrestoDB
In the interpreters > presto section, set the user to empty (
""
) so that it uses the credentials of a Hue logged-in user for authorization.[[interpreters]] [[[presto]]] interface = jdbc name = Presto options = '{"url": "jdbc:presto://${master_node_dns}:8889/hive/default", "driver": "com.facebook.presto.jdbc.PrestoDriver", "user":"","password":""}'
For PrestoSQL
In the interpreters > presto section, set the user to empty (
""
) so that it uses the credentials of a Hue logged-in user for authorization.[[interpreters]] [[[presto]]] interface = jdbc name = Presto options = '{"url": "jdbc:presto://${master_node_dns}:8889/hive/default", "driver": "io.prestosql.jdbc.PrestoDriver", "user":"","password":""}'
For Trino
In the interpreters > trino section, set the user to empty (
""
) so that it uses the credentials of a Hue logged-in user for authorization.[[interpreters]] [[[trino]]] interface = jdbc name = Trino options = '{"url": "jdbc:trino://${master_node_dns}:8889/hive/default", "driver": "io.trino.jdbc.TrinoDriver", "user":"","password":""}'
For SparkSQL
In the spark section, replace
sql_server_host
with the DNS name of the EMR master node.[spark] sql_server_host=${master_node_dns}
Restart the Hue service.
sudo systemctl restart hue.service
Login to Hue console through /<master-node>:8888
Set the Admin username and password.
Add more Hue users through the Admin console.
Logout and login using the newly created user in Hue console.
Access the tables through Hive/Presto.
Check in Privacera Ranger, to ensure username is the same as the user logged in to Hue.
Livy
Setup Livy and Zeppelin.
SSH with port forwarding or open 8890 port to access Zeppelin from the web browser.
ssh -i ${KEY_FILE} -L 8890:localhost:8890 hadoop@${EMR_PUBLIC_DNS}
Go to Zeppelin web UI (http://localhost:8890).
Enable the user based login (https://zeppelin.apache.org/docs/0.6.2/security/shiroauthentication.html).
sudo su cp /etc/zeppelin/conf/zeppelin-site.xml.template /etc/zeppelin/conf/zeppelin-site.xml chown zeppelin:zeppelin /etc/zeppelin/conf/zeppelin-site.xml vi /etc/zeppelin/conf/zeppelin-site.xml #Change the property, if exists #This property removed from zeppelin 0.9.0 (https://issues.apache.org/jira/browse/ZEPPELIN-4489) zeppelin.anonymous.allowed=false cp /etc/zeppelin/conf/shiro.ini.template /etc/zeppelin/conf/shiro.ini vi /etc/zeppelin/conf/shiro.ini #Add required users in [users] as below -- [users] hadoop = hadoop123, admin chown zeppelin:zeppelin /etc/zeppelin/conf/shiro.ini
Check Livy port using the following command.
vi /etc/livy/conf/livy.conf livy.server.port=8998
Stop and restart Zeppelin.
sudo stop zeppelin sudo start zeppelin
Go to /<master-node>:8890. Login with the required username/password which you have created in step 3.
Go to Settings > Interpreter > Livy > Edit and perform the following steps:
Keep only Scope with per user.
Set the following properties:
livy.spark.driver.cores=1
livy.spark.driver.memory=1g
livy.spark.executor.cores=1
livy.spark.executor.instances=2
livy.spark.executor.memory=1g
livy.spark.driver.extraClassPath=/opt/privacera/plugin/privacera-spark-plugin/spark-plugin/*:{copy spark.driver.extraClassPath from /etc/spark/conf/spark-defaults.conf}
Save and restart.
Run the sample Livy Spark code.
Go to Zeppelin web UI (http://localhost:8890).
Create a new notebook using the following command:
%livy.spark val df =spark.read.csv("s3://${SECURE_BUCKET_NAME}/sample_data/customer_data/customer_data_without_header.csv"); df.show()
Check audit for the above executed command in Privacera Access Manager:
On the Privacera Portal home page, expand Access Management.
On the left menu, click Audit.
The Audit page will be displayed with Ranger Audit details.
Spark Object-Level Access Control (OLAC)
To enable Spark OLAC:
Submit Spark applications
You can submit an application consisting of compiled and packaged Java or Spark JAR. You can deploy the JAR locally (client) or cluster.
Client Mode
SSH to the master node.
Run the following command:
spark-submit \ --master yarn \ --driver-memory 512m \ --executor-memory 512m \ --class <clas-to-run> <your-jar> <arg1> <arg2>
Cluster Mode
SSH to the master node.
Run the following command:
spark-submit \ --master yarn \ --deploy-mode cluster \ --driver-memory 512m \ --executor-memory 512m \ --driver-class-path "/opt/privacera/plugin/privacera-spark-plugin/spark-plugin/*:<copy spark.driver.extraClassPath from /etc/spark/conf/spark-defaults.conf>"\ --class <clas-to-run> <your-jar> <arg1> <arg2>
Spark Fine-grained Access Control (FGAC)
View-level access
To enable the view-level access control:
SSH to the master node of EMR cluster.
Edit the
spark-defaults.conf
file.sudo vim /etc/spark/conf/spark-defaults.conf
Add the following property.
spark.hadoop.privacera.spark.view.levelmaskingrowfilter.extension.enable true
To learn how to use view-level access control in Spark, Spark Fine-grained Access Control (FGAC).