
Troubleshooting for Access Management for EMR

Accessing S3 Buckets Containing a 'dot' in the Name in EMR 6.x and above

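In EMR 6.x and above, reading from or writing to an S3 bucket whose name contains a dot (.) over the s3a protocol can fail in PySpark or the Spark shell with the certificate error below. The error is raised by the AWS SDK: by default the request places the bucket name in the hostname (virtual-hosted-style addressing), and the wildcard certificate for *.s3.amazonaws.com covers only a single label, so a bucket name containing a dot does not match it.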

Text Only
com.amazonaws.SdkClientException: Unable to execute HTTP request: Certificate for <{bucket-name-with-name}.east.us.s3.amazonaws.com> doesn't match any of the subject alternative names: [*.s3.amazonaws.com, s3.amazonaws.com]: Unable to execute HTTP request: Certificate for <{bucket-name-with-name}.s3.amazonaws.com> doesn't match any of the subject alternative names: [*.s3.amazonaws.com, s3.amazonaws.com]
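The mismatch in the error above can be reproduced with a small sketch of wildcard hostname matching. This is a simplified illustration, not the SDK's actual verification code: it assumes the common rule that a leading `*` stands for exactly one DNS label, and the bucket names `mybucket` and `my.bucket` are hypothetical.

```python
# Subject alternative names taken from the error message above.
SANS = ["*.s3.amazonaws.com", "s3.amazonaws.com"]


def wildcard_matches(pattern: str, hostname: str) -> bool:
    """Simplified check: a leading '*' matches exactly one DNS label."""
    p_labels = pattern.lower().split(".")
    h_labels = hostname.lower().split(".")
    # A wildcard never spans several labels, so the label counts must agree.
    if len(p_labels) != len(h_labels):
        return False
    if p_labels[0] == "*":
        return h_labels[0] != "" and p_labels[1:] == h_labels[1:]
    return p_labels == h_labels


def cert_covers(hostname: str) -> bool:
    """Return True if any SAN in SANS matches the hostname."""
    return any(wildcard_matches(san, hostname) for san in SANS)


# Hypothetical bucket names, used only for illustration.
print(cert_covers("mybucket.s3.amazonaws.com"))   # True: one label before s3
print(cert_covers("my.bucket.s3.amazonaws.com"))  # False: the dot adds a label
print(cert_covers("s3.amazonaws.com"))            # True: hostname used by path-style access
```

With path-style access the bucket name moves into the URL path, so the hostname stays at the service endpoint and the certificate matches.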

To work around this, enable path-style access, which keeps the bucket name out of the hostname, by passing the following property when launching PySpark or the Spark shell:

Bash
pyspark --conf "spark.hadoop.fs.s3a.path.style.access=true"
Bash
spark-shell --conf "spark.hadoop.fs.s3a.path.style.access=true"
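For a standalone PySpark script, the same property can probably be set in code rather than on the command line. The following is a minimal sketch under a few assumptions: `pyspark` is installed, no Spark session exists yet (the setting is only picked up when the session is first created), and the helper name `build_session` and the app name are hypothetical.

```python
# The property name and value are the ones used with --conf above.
S3A_PATH_STYLE_CONF = {"spark.hadoop.fs.s3a.path.style.access": "true"}


def build_session(app_name: str = "s3a-path-style-example"):
    """Create a SparkSession with path-style S3 access enabled (hypothetical helper)."""
    # Imported lazily so that this module loads even where pyspark is absent.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName(app_name)
    for key, value in S3A_PATH_STYLE_CONF.items():
        builder = builder.config(key, value)
    return builder.getOrCreate()


# Example use on a cluster (the bucket name is a placeholder):
#   spark = build_session()
#   df = spark.read.text("s3a://my.bucket/path/to/data.txt")
```

Inside an interactive `pyspark` or `spark-shell` session the session already exists, so the `--conf` flag shown above remains the more reliable option there.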
