Properties for configuring centralized caching
You must enable JNI to use centralized caching. In addition, you must configure various properties and consider the locked memory limit for configuring centralized caching.
In order to lock block files into memory, the DataNode relies on native JNI code found
libhadoop.so. Be sure to enable JNI if you are using HDFS
centralized cache management.
Configuration properties for centralized caching are specified in the
Only the following property is required:
dfs.datanode.max.locked.memoryThis property determines the maximum amount of memory (in bytes) that a DataNode will use for caching. The "locked-in-memory size" ulimit (
ulimit -l) of the DataNode user also needs to be increased to exceed this parameter (for more details, see the following section on ). When setting this value, remember that you will need space in memory for other things as well, such as the DataNode and application JVM heaps, and the operating system page cache. Example:
<property> <name>dfs.datanode.max.locked.memory</name> <value>268435456</value> </property>
The following properties are not required, but can be specified for tuning.
dfs.namenode.path.based.cache.refresh.interval.msThe NameNode will use this value as the number of milliseconds between subsequent cache path re-scans. By default, this parameter is set to 300000, which is five minutes. Example:
<property> <name>dfs.namenode.path.based.cache.refresh.interval.ms</name> <value>300000</value> </property>
dfs.time.between.resending.caching.directives.msThe NameNode will use this value as the number of milliseconds between resending caching directives. Example:
<property> <name>dfs.time.between.resending.caching.directives.ms</name> <value>300000</value> </property>
dfs.datanode.fsdatasetcache.max.threads.per.volumeThe DataNode will use this value as the maximum number of threads per volume to use for caching new data. By default, this parameter is set to 4. Example:
<property> <name>dfs.datanode.fsdatasetcache.max.threads.per.volume</name> <value>4</value> </property>
dfs.cachereport.intervalMsecThe DataNode will use this value as the number of milliseconds between sending a full report of its cache state to the NameNode. By default, this parameter is set to 10000, which is 10 seconds. Example:
<property> <name>dfs.cachereport.intervalMsec</name> <value>10000</value> </property>
dfs.namenode.path.based.cache.block.map.allocation.percentThe percentage of the Java heap that will be allocated to the cached blocks map. The cached blocks map is a hash map that uses chained hashing. Smaller maps may be accessed more slowly if the number of cached blocks is large. Larger maps will consume more memory. The default value is 0.25 percent. Example:
<property> <name>dfs.namenode.path.based.cache.block.map.allocation.percent</name> <value>0.25</value> </property>
If you get the error "Cannot start datanode because the configured max locked memory
size...is more than the datanode's available RLIMIT_MEMLOCK ulimit," this means that the
operating system is imposing a lower limit on the amount of memory that you can lock
than what you have configured. To fix this, you must adjust the
-l value that the DataNode runs with. This value is usually configured in
/etc/security/limits.conf, but this may vary depending on what
operating system and distribution you are using.
You have correctly configured this value when you can run
ulimit - l
from the shell and get back either a higher value than what you have configured or the
string "unlimited", which indicates that there is no limit. Typically,
-l returns the memory lock limit in kilobytes (KB), but
dfs.datanode.max.locked.memory must be specified in bytes.
For example, if the value of
dfs.datanode.max.locked.memory is set to
<property> <name>dfs.datanode.max.locked.memory</name> <value>128000</value> </property>
memlock (max locked-in-memory address space) to a slightly
higher value. For example, to set memlock to 130 KB (130,000 bytes) for the hdfs user,
you would add the following line to
hdfs - memlock 130