Troubleshooting for Access Management for EMR
Accessing S3 Buckets Containing a 'dot' in the Name in EMR 6.x and above
In EMR 6.x and above, reading from or writing to an S3 bucket whose name contains a dot (.) over the s3a protocol can fail in PySpark or the Spark shell. This issue is caused by a problem with the AWS SDK.
To work around it, enable path-style access for buckets with dots in their names by passing the corresponding S3A properties when you start PySpark or the Spark shell.
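As a minimal sketch of the path-style setting: the snippet assumes the standard Hadoop S3A property fs.s3a.path.style.access, handed to Hadoop through Spark's spark.hadoop. prefix. The variable name S3A_CONF is illustrative only, and the properties required in your deployment may differ.

```bash
#!/usr/bin/env sh
# Assumption: fs.s3a.path.style.access is the Hadoop S3A switch for path-style
# requests; the spark.hadoop. prefix passes it from Spark to Hadoop.
# S3A_CONF is a hypothetical variable name used only for this sketch.
S3A_CONF='--conf spark.hadoop.fs.s3a.path.style.access=true'

# Print the launch commands for review before running them on the cluster.
echo "pyspark $S3A_CONF"
echo "spark-shell $S3A_CONF"
```

Running either printed command on the cluster starts the shell with path-style requests enabled for all s3a buckets in that session.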
Delta Table Creation Fails with S3 Protocol
When creating a Delta table using the s3 protocol in AWS EMR, table creation fails, as expected, when no policy is applied. However, after the required permissions are applied, table creation can still fail with an exception.
To successfully create a Delta table without encountering exceptions, follow these steps:
- Check for Auto-Generated Folders:
    - After running the Delta table creation query, verify whether any _$folder$ directories exist in the specified S3 location.
- Manually Delete Unwanted Folders:
    - If any of the following folders are present in AWS S3, work with your administrator to delete them, either directly in AWS S3 or through the Privacera S3 browser.

      Note
      The following folder structure is an example. The actual folders may vary based on the table location used in your query.

      <hms_database>_$folder$
      <hms_database>/<delta_tables>_$folder$
      <hms_database>/<delta_tables>/<table_1>_$folder$
      <hms_database>/<delta_tables>/<table_1>/_delta_log_$folder$
- Retry the Delta Table Creation Query:
    - After removing the unwanted folders, re-run your Delta table creation SQL query.
Note
- The Delta library automatically creates these folders (ending with _$folder$) during the table creation process.
- If the user lacks the necessary permissions to the S3 bucket, these folders are not cleaned up, because query execution is halted by the permission issue.
- We recommend performing manual cleanup before retrying the query with the required permissions.
- The same issue can occur even without Privacera if the IAM role has permission to create/delete _$folder$ objects but lacks permission to the actual table location.
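
The folder check in the steps above can be sketched with the AWS CLI. This is an illustration under stated assumptions, not a prescribed procedure: BUCKET, PREFIX, the sample keys, and the filter_folder_markers helper are all hypothetical, and the live AWS commands are left as comments so that nothing is listed or deleted until an administrator has reviewed it.

```bash
#!/usr/bin/env sh
# Hypothetical placeholders; replace with the table location used in your query.
# PREFIX has no trailing slash so that "hms_database_$folder$" also matches.
BUCKET="example-bucket"
PREFIX="hms_database"

# Hypothetical helper: keep only object keys that end in the _$folder$ suffix.
filter_folder_markers() {
  grep -E '_\$folder\$$' || true
}

# Demonstration on sample keys (no AWS call is made here).
sample_keys='hms_database_$folder$
hms_database/delta_tables/table_1/_delta_log_$folder$
hms_database/delta_tables/table_1/part-0000.parquet'
markers=$(printf '%s\n' "$sample_keys" | filter_folder_markers)
printf '%s\n' "$markers"

# Against a real bucket, the key listing could come from the AWS CLI:
#   aws s3 ls "s3://${BUCKET}/${PREFIX}" --recursive | awk '{print $4}' | filter_folder_markers
# After review, each reported marker could be removed with:
#   aws s3 rm "s3://${BUCKET}/<key>"
```

Keeping the filter separate from the delete step lets the administrator confirm the list of marker objects before anything is removed.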