Accessing Cloud Data

Defining Authentication Providers

The S3A connector can be configured to obtain client authentication providers from classes which integrate with the AWS SDK by implementing the com.amazonaws.auth.AWSCredentialsProvider interface. This is done by listing the implementation classes in the configuration option fs.s3a.aws.credentials.provider.

Note

AWS credential providers are distinct from Hadoop credential providers. Hadoop credential providers allow passwords and other secrets to be stored and transferred more securely than in XML configuration files. In contrast, AWS credential providers are classes which can be used by the Amazon AWS SDK to obtain an AWS login from a different source in the system, including environment variables, JVM properties, and configuration files.

There are a number of AWS credential provider classes specified in the hadoop-aws JAR:

Classname                                                 Description
org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider     Standard credential support through configuration properties. It does not support in-URL authentication.
org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider  Session authentication.
org.apache.hadoop.fs.s3a.AnonymousAWSCredentialsProvider  Anonymous login. Use for accessing public data without providing any credentials at all.
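
As an illustration of the session authentication entry above, the following is a minimal sketch of a configuration that selects TemporaryAWSCredentialsProvider. It assumes the session credentials are supplied through the fs.s3a.access.key, fs.s3a.secret.key, and fs.s3a.session.token properties. The values shown are placeholders, not real credentials.

```xml
<!-- Sketch only: select session (temporary) credentials for S3A. -->
<property>
  <name>fs.s3a.aws.credentials.provider</name>
  <value>org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider</value>
</property>

<!-- Placeholder values: replace with the credentials issued for the session. -->
<property>
  <name>fs.s3a.access.key</name>
  <value>SESSION-ACCESS-KEY</value>
</property>
<property>
  <name>fs.s3a.secret.key</name>
  <value>SESSION-SECRET-KEY</value>
</property>
<property>
  <name>fs.s3a.session.token</name>
  <value>SESSION-TOKEN</value>
</property>
```

Because these values are secrets, it may be preferable to keep them in a Hadoop credential provider rather than in a plain XML file.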

Furthermore, there are many AWS credential provider classes specified in the Amazon JARs. In particular, there are two which are commonly used:

Classname                                                  Description
com.amazonaws.auth.EnvironmentVariableCredentialsProvider  AWS Environment Variables
com.amazonaws.auth.InstanceProfileCredentialsProvider      EC2 Metadata Credentials
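
For example, a cluster that should not keep any keys in its Hadoop configuration could list only these two SDK providers. This is a sketch rather than a recommended default. It assumes that credentials are exported in the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables, or that the host is an EC2 instance with an instance profile attached.

```xml
<!-- Sketch only: rely on the AWS SDK providers, with no keys in Hadoop configuration. -->
<!-- Environment variables are consulted first, then the EC2 instance metadata. -->
<property>
  <name>fs.s3a.aws.credentials.provider</name>
  <value>
  com.amazonaws.auth.EnvironmentVariableCredentialsProvider,
  com.amazonaws.auth.InstanceProfileCredentialsProvider
  </value>
</property>
```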

The order in which credential providers are listed in the configuration option fs.s3a.aws.credentials.provider defines the order in which they are evaluated.

The standard authentication mechanism for Hadoop S3A is the following list of providers:

<property>
  <name>fs.s3a.aws.credentials.provider</name>
  <value>
  org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider,
  com.amazonaws.auth.EnvironmentVariableCredentialsProvider,
  com.amazonaws.auth.InstanceProfileCredentialsProvider
  </value>
</property>

Note

Retrieving credentials with the InstanceProfileCredentialsProvider is a slower operation than looking up configuration options or environment variables. It is best to list it after all other authentication providers, with the exception of the AnonymousAWSCredentialsProvider, which must come last.
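
The ordering advice in this note can be sketched as follows. This is an illustrative variation on the standard list, assuming that anonymous access to public data is wanted as a fallback. It is not a default configuration.

```xml
<!-- Sketch only: faster lookups first, the slower instance profile lookup next, -->
<!-- and the anonymous provider at the very end of the list. -->
<property>
  <name>fs.s3a.aws.credentials.provider</name>
  <value>
  org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider,
  com.amazonaws.auth.EnvironmentVariableCredentialsProvider,
  com.amazonaws.auth.InstanceProfileCredentialsProvider,
  org.apache.hadoop.fs.s3a.AnonymousAWSCredentialsProvider
  </value>
</property>
```

With this ordering, the anonymous provider is reached only when none of the earlier providers supplies credentials.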