Configuring Fault Tolerance

Preventing Accidental Deletion of Files

You can prevent accidental deletion of files by enabling the Trash feature for HDFS.

For additional information regarding HDFS trash configuration, see HDFS Architecture.

You can still cause irrecoverable data loss if you accidentally use the -skipTrash and -R options on directories that contain a large number of files. For an additional layer of protection, use the -safely option with the fs shell -rm command. The fs shell -rm command checks the hadoop.shell.safely.delete.limit.num.files property in the core-site.xml file, even if you specify -skipTrash. When you specify the -safely option, the -rm command asks you to confirm the deletion if the number of files to be deleted is greater than the value of this property. The default value is 100, meaning 100 files.

This confirmation prompt is disabled if the value is set to 0 or if the -safely option is not specified with the -rm command.
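
The rule described above can be summarized as a small shell function. This is only a sketch of the documented behavior, not Hadoop's implementation; the function name needs_confirmation and its three arguments are hypothetical and exist only for illustration. It follows the "greater than" wording used above, so the exact boundary behavior should be verified on your own cluster.

```shell
# Sketch of the documented rule; not Hadoop's actual implementation.
# Usage: needs_confirmation SAFELY LIMIT NUM_FILES
#   SAFELY    1 if -safely was passed to -rm, 0 otherwise
#   LIMIT     value of hadoop.shell.safely.delete.limit.num.files
#   NUM_FILES number of files under the path to be deleted
# Returns 0 (true) when the user would be asked to confirm.
needs_confirmation() {
  [ "$1" -eq 1 ] || return 1   # -safely not specified: never prompt
  [ "$2" -gt 0 ] || return 1   # a limit of 0 disables the check
  [ "$3" -gt "$2" ]            # prompt only when the count exceeds the limit
}
```

For example, needs_confirmation 1 100 250 succeeds (a prompt is expected), while needs_confirmation 0 100 250 and needs_confirmation 1 0 250 both fail (no prompt).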

To enable the hadoop.shell.safely.delete.limit.num.files property, add the following lines to core-site.xml:

<property>
<name>hadoop.shell.safely.delete.limit.num.files</name>
<value>100</value>
<description>Used by -safely option of hadoop fs shell -rm command to avoid
accidental deletion of large directories.</description>
</property>
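
After editing core-site.xml, it can be useful to confirm the value that the client actually sees. The following sketch wraps the hdfs getconf -confKey command for that purpose. The function name show_delete_limit is hypothetical, and the sketch assumes that the hdfs command is on the PATH and reads the same core-site.xml that you edited.

```shell
# Print the effective delete limit as seen by the client configuration.
# Assumes the hdfs command is on PATH and uses the edited core-site.xml.
show_delete_limit() {
  hdfs getconf -confKey hadoop.shell.safely.delete.limit.num.files
}
```

Running show_delete_limit on a client host prints the configured value, which lets you check the setting before relying on the -safely prompt.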

In the following examples, the hadoop.shell.safely.delete.limit.num.files property has been added to core-site.xml with a value of 10. With this setting, fs shell -rm -R used with -skipTrash and -safely prompts before deleting a directory that contains only 10 files. The first example does not prompt, because trash is enabled and -skipTrash is not specified; the directory is moved to the trash instead.

[ambari-qa@c6405 current]$ hdfs dfs -ls -R /tmp/test1
-rw-r--r--   3 ambari-qa hdfs       2413 2016-10-20 20:57 /tmp/test1/capacity-scheduler.xml
-rw-r--r--   3 ambari-qa hdfs       4435 2016-10-20 20:57 /tmp/test1/core-site.xml
-rw-r--r--   3 ambari-qa hdfs       1308 2016-10-20 20:57 /tmp/test1/hadoop-policy.xml
-rw-r--r--   3 ambari-qa hdfs       8071 2016-10-20 20:57 /tmp/test1/hdfs-site.xml
-rw-r--r--   3 ambari-qa hdfs       3518 2016-10-20 20:57 /tmp/test1/kms-acls.xml
-rw-r--r--   3 ambari-qa hdfs       5511 2016-10-20 20:57 /tmp/test1/kms-site.xml
-rw-r--r--   3 ambari-qa hdfs       7339 2016-10-20 20:57 /tmp/test1/mapred-site.xml
-rw-r--r--   3 ambari-qa hdfs        884 2016-10-20 20:57 /tmp/test1/ssl-client.xml
-rw-r--r--   3 ambari-qa hdfs       1000 2016-10-20 20:57 /tmp/test1/ssl-server.xml
-rw-r--r--   3 ambari-qa hdfs      20349 2016-10-20 20:57 /tmp/test1/yarn-site.xml
[ambari-qa@c6405 current]$ hdfs dfs -rm -R /tmp/test1
16/10/20 20:58:37 INFO fs.TrashPolicyDefault: Moved: 'hdfs://c6403.ambari.apache.org:8020/tmp/test1' to trash at: hdfs://c6403.ambari.apache.org:8020/user/ambari-qa/.Trash/Current/tmp/test1

The following example deletes files without prompting or moving to the trash:

[ambari-qa@c6405 current]$ hdfs dfs -ls -R /tmp/test2
-rw-r--r--   3 ambari-qa hdfs       2413 2016-10-20 20:59 /tmp/test2/capacity-scheduler.xml
-rw-r--r--   3 ambari-qa hdfs       4435 2016-10-20 20:59 /tmp/test2/core-site.xml
-rw-r--r--   3 ambari-qa hdfs       1308 2016-10-20 20:59 /tmp/test2/hadoop-policy.xml
-rw-r--r--   3 ambari-qa hdfs       8071 2016-10-20 20:59 /tmp/test2/hdfs-site.xml
-rw-r--r--   3 ambari-qa hdfs       3518 2016-10-20 20:59 /tmp/test2/kms-acls.xml
-rw-r--r--   3 ambari-qa hdfs       5511 2016-10-20 20:59 /tmp/test2/kms-site.xml
-rw-r--r--   3 ambari-qa hdfs       7339 2016-10-20 20:59 /tmp/test2/mapred-site.xml
-rw-r--r--   3 ambari-qa hdfs        884 2016-10-20 20:59 /tmp/test2/ssl-client.xml
-rw-r--r--   3 ambari-qa hdfs       1000 2016-10-20 20:59 /tmp/test2/ssl-server.xml
-rw-r--r--   3 ambari-qa hdfs      20349 2016-10-20 20:59 /tmp/test2/yarn-site.xml
[ambari-qa@c6405 current]$ hdfs dfs -rm -R -skipTrash /tmp/test2
Deleted /tmp/test2

The following example prompts you to confirm file deletion if the number of files to be deleted is greater than the value specified for hadoop.shell.safely.delete.limit.num.files:

[ambari-qa@c6405 current]$ hdfs dfs -ls -R /tmp/test3
-rw-r--r--   3 ambari-qa hdfs       2413 2016-10-20 21:00 /tmp/test3/capacity-scheduler.xml
-rw-r--r--   3 ambari-qa hdfs       4435 2016-10-20 21:00 /tmp/test3/core-site.xml
-rw-r--r--   3 ambari-qa hdfs       1308 2016-10-20 21:00 /tmp/test3/hadoop-policy.xml
-rw-r--r--   3 ambari-qa hdfs       8071 2016-10-20 21:00 /tmp/test3/hdfs-site.xml
-rw-r--r--   3 ambari-qa hdfs       3518 2016-10-20 21:00 /tmp/test3/kms-acls.xml
-rw-r--r--   3 ambari-qa hdfs       5511 2016-10-20 21:00 /tmp/test3/kms-site.xml
-rw-r--r--   3 ambari-qa hdfs       7339 2016-10-20 21:00 /tmp/test3/mapred-site.xml
-rw-r--r--   3 ambari-qa hdfs        884 2016-10-20 21:00 /tmp/test3/ssl-client.xml
-rw-r--r--   3 ambari-qa hdfs       1000 2016-10-20 21:00 /tmp/test3/ssl-server.xml
-rw-r--r--   3 ambari-qa hdfs      20349 2016-10-20 21:00 /tmp/test3/yarn-site.xml
[ambari-qa@c6405 current]$ hdfs dfs -rm -R -skipTrash -safely /tmp/test3
Proceed deleting 10 files? (Y or N) N
Delete aborted at user request.
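
Before running a recursive delete, you may want to know how many files a path contains so that you can compare the count with the configured limit. The following sketch uses hdfs dfs -count, whose default output columns are DIR_COUNT, FILE_COUNT, CONTENT_SIZE, and PATHNAME, and prints only the file count. The function name count_hdfs_files is hypothetical, and the sketch assumes that both hdfs and awk are available on the PATH.

```shell
# Print the number of files under an HDFS path.
# hdfs dfs -count prints: DIR_COUNT FILE_COUNT CONTENT_SIZE PATHNAME
count_hdfs_files() {
  hdfs dfs -count "$1" | awk '{ print $2 }'
}
```

For example, count_hdfs_files /tmp/test3 prints the number of files under /tmp/test3, which you can compare with the value of hadoop.shell.safely.delete.limit.num.files.
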
Note

Using the -skipTrash option without the -safely option is not recommended, as files will be deleted immediately and without warning.
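
One way to follow this recommendation consistently is to route permanent deletes through a small wrapper that always passes -safely. This is a sketch only: the function name hdfs_rm_permanent is hypothetical, and the sketch assumes that the hdfs command is on the PATH and that your Hadoop version supports the -safely option.

```shell
# Hypothetical helper: permanent recursive delete that always passes -safely,
# so the confirmation check is applied whenever -skipTrash is used.
hdfs_rm_permanent() {
  hdfs dfs -rm -R -skipTrash -safely "$@"
}
```

For example, hdfs_rm_permanent /tmp/test3 runs the same command as the third example above.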