Cloud Data Access

Configuring S3Guard in Ambari

After you have created the DynamoDB access policy, set the following S3Guard configuration parameters for each S3 bucket that you want to "guard".

  1. In the Ambari web UI, select the HDFS service and navigate to Configs > Advanced > Custom hdfs-site.

  2. Set the following configuration parameters for each bucket that you want to "guard". To configure S3Guard for a specific bucket, replace the fs.s3a. prefix with fs.s3a.bucket.<bucketname>. where <bucketname> is the name of your bucket.

  3. After you've added the configuration parameters, click Save to save the configuration changes.

  4. Restart all affected components.
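
The prefix substitution described in step 2 can be sketched as follows. The bucket name "my-bucket" is a hypothetical placeholder, not a value from this guide. The first property is the base form, which applies to every bucket accessed through S3A; the second is the per-bucket form, which applies only to "my-bucket". You would normally add only one of the two forms.

```xml
<!-- Base form: applies to all buckets accessed through S3A -->
<property>
  <name>fs.s3a.metadatastore.impl</name>
  <value>org.apache.hadoop.fs.s3a.s3guard.DynamoDBMetadataStore</value>
</property>

<!-- Per-bucket form: applies only to the hypothetical bucket "my-bucket" -->
<property>
  <name>fs.s3a.bucket.my-bucket.metadatastore.impl</name>
  <value>org.apache.hadoop.fs.s3a.s3guard.DynamoDBMetadataStore</value>
</property>
```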

Table 3.2. S3Guard Configuration Parameters

| Base Parameter | Default Value | Setting for S3Guard |
|---|---|---|
| fs.s3a.metadatastore.impl | org.apache.hadoop.fs.s3a.s3guard.NullMetadataStore | Set this to “org.apache.hadoop.fs.s3a.s3guard.DynamoDBMetadataStore”. |
| fs.s3a.s3guard.ddb.table.create | false | Set this to “true” to automatically create the DynamoDB table. |
| fs.s3a.s3guard.ddb.table | (Unset) | Enter a name for the table that will be created in DynamoDB for S3Guard. If you leave this blank while setting fs.s3a.s3guard.ddb.table.create to “true”, a separate DynamoDB table is created for each accessed bucket, with the S3 bucket name used as the DynamoDB table name. This may incur additional costs. |
| fs.s3a.s3guard.ddb.region | (Unset) | Set this to an AWS region code, that is, the value shown in the “Region” column of the region table in the Amazon documentation (for example, “us-west-2”). If you leave this blank, the region of the S3 bucket is used. |
| fs.s3a.s3guard.ddb.table.capacity.read | 500 | Specify the read capacity for DynamoDB or use the default. You can monitor the DynamoDB workload in the DynamoDB console on the AWS portal and adjust the read and write capacities on the fly to match the workload requirements. |
| fs.s3a.s3guard.ddb.table.capacity.write | 100 | Specify the write capacity for DynamoDB or use the default. You can monitor the DynamoDB workload in the DynamoDB console on the AWS portal and adjust the read and write capacities on the fly to match the workload requirements. |

Example

Adding the following custom properties creates a DynamoDB table called "my-table" in the "us-west-2" region (where the "test" bucket is located). The configuration applies only to the bucket called "test", so "my-table" is used only for storing metadata related to this bucket.

<property>
  <name>fs.s3a.bucket.test.metadatastore.impl</name>
  <value>org.apache.hadoop.fs.s3a.s3guard.DynamoDBMetadataStore</value>
</property>

<property>
  <name>fs.s3a.bucket.test.s3guard.ddb.table</name>
  <value>my-table</value>
</property>

<property>
  <name>fs.s3a.bucket.test.s3guard.ddb.table.create</name>
  <value>true</value>
</property>

<property>
  <name>fs.s3a.bucket.test.s3guard.ddb.region</name>
  <value>us-west-2</value>
</property>
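
The capacity parameters listed in Table 3.2 can be scoped to a bucket in the same way. The following is a minimal sketch that assumes the same "test" bucket and simply restates the default values from the table (500 for read, 100 for write); these numbers are illustrative, so choose values that match your own workload.

```xml
<!-- Read capacity for the DynamoDB table used by the "test" bucket (table default shown) -->
<property>
  <name>fs.s3a.bucket.test.s3guard.ddb.table.capacity.read</name>
  <value>500</value>
</property>

<!-- Write capacity for the DynamoDB table used by the "test" bucket (table default shown) -->
<property>
  <name>fs.s3a.bucket.test.s3guard.ddb.table.capacity.write</name>
  <value>100</value>
</property>
```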