Accessing Cloud Data

Using Anonymous Login

You can configure anonymous access to a publicly accessible Amazon S3 bucket without using any credentials. This can be useful for accessing public data sets.

Note

Allowing anonymous access to an Amazon S3 bucket compromises security and therefore is unsuitable for most use cases.

To use anonymous login, specify org.apache.hadoop.fs.s3a.AnonymousAWSCredentialsProvider as the value of fs.s3a.aws.credentials.provider:

<property>
  <name>fs.s3a.aws.credentials.provider</name>
  <value>org.apache.hadoop.fs.s3a.AnonymousAWSCredentialsProvider</value>
</property>

Once this is done, there is no need to supply any credentials in the Hadoop configuration or via environment variables.

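This option can also be used to verify that an object store does not permit unauthenticated access: an attempt to list a bucket using anonymous credentials should fail unless the bucket has been explicitly opened up for broader access. The following command lists the public landsat-pds bucket anonymously; substitute your own bucket to check that the listing is refused: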

hadoop fs \
  -D fs.s3a.aws.credentials.provider=org.apache.hadoop.fs.s3a.AnonymousAWSCredentialsProvider \
  -ls s3a://landsat-pds/

S3A can be configured to always access specific buckets anonymously. For example, the following configuration sets up anonymous access to the public landsat-pds bucket, which is accessed through the s3a://landsat-pds/ URI:

<property>
  <name>fs.s3a.bucket.landsat-pds.aws.credentials.provider</name>
  <value>org.apache.hadoop.fs.s3a.AnonymousAWSCredentialsProvider</value>
</property>
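
The per-bucket setting can sit alongside a different cluster-wide provider. The fragment below is a minimal sketch, assuming (hypothetically) that all other buckets use key-based credentials through org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider; only the landsat-pds entry comes from the example above:

```xml
<!-- Assumed cluster-wide default: key-based authentication for all other buckets -->
<property>
  <name>fs.s3a.aws.credentials.provider</name>
  <value>org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider</value>
</property>

<!-- Per-bucket override: landsat-pds is always accessed anonymously -->
<property>
  <name>fs.s3a.bucket.landsat-pds.aws.credentials.provider</name>
  <value>org.apache.hadoop.fs.s3a.AnonymousAWSCredentialsProvider</value>
</property>
```

With this arrangement, requests to s3a://landsat-pds/ should be made without credentials, while other s3a:// URIs continue to use the cluster-wide provider.
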

Note

If a list of credential providers is given in fs.s3a.aws.credentials.provider, the anonymous credential provider must come last. Otherwise, any credential providers listed after it are ignored.
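
As an illustration of this ordering, the fragment below is a sketch that places the anonymous provider at the end of the list. The first provider (org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider) is an assumption chosen for the example; substitute whichever providers your cluster actually uses:

```xml
<property>
  <name>fs.s3a.aws.credentials.provider</name>
  <!-- Comma-separated list: the authenticated provider (assumed) comes first, the anonymous provider last -->
  <value>org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider,org.apache.hadoop.fs.s3a.AnonymousAWSCredentialsProvider</value>
</property>
```

Reversing the two entries would leave the key-based provider unused, which is the situation the note warns about.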