This document is intended for system administrators who need to configure HDFS compression on the Windows platform.
Windows supports GzipCodec, DefaultCodec, and BZip2Codec. GzipCodec is the codec most commonly used for HDFS compression.
Ensure that zlib1.dll is installed in the above-mentioned locations on all the nodes of the cluster.
Use one of the following options to enable GzipCodec:
Option I: To use GzipCodec for a single job:
On the NameNode host machine, execute the following command as the hdfs user:
hadoop jar hadoop-examples-1.1.0-SNAPSHOT.jar sort "-Dmapred.compress.map.output=true" "-Dmapred.map.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec" "-Dmapred.output.compress=true" "-Dmapred.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec" -outKey org.apache.hadoop.io.Text -outValue org.apache.hadoop.io.Text input output
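The command above is long enough that a quoting or spelling mistake is easy to make. The following is a minimal Python sketch, not part of the product, that assembles the same argument list from a dictionary of properties. The helper name build_sort_command and the placeholder directory names input and output are assumptions made for illustration; the property names, codec class, and JAR name are taken from the command above.

```python
# Sketch: assemble the one-time "sort" job command with Gzip compression.
# build_sort_command is a hypothetical helper, not a Hadoop API.

GZIP_CODEC = "org.apache.hadoop.io.compress.GzipCodec"
TEXT_CLASS = "org.apache.hadoop.io.Text"


def build_sort_command(jar, input_dir, output_dir, codec=GZIP_CODEC):
    """Return the argument list for the sort example job with compression on."""
    properties = {
        # Compress intermediate map output.
        "mapred.compress.map.output": "true",
        "mapred.map.output.compression.codec": codec,
        # Compress the final job output.
        "mapred.output.compress": "true",
        "mapred.output.compression.codec": codec,
    }
    command = ["hadoop", "jar", jar, "sort"]
    command += ["-D{}={}".format(name, value) for name, value in properties.items()]
    command += ["-outKey", TEXT_CLASS, "-outValue", TEXT_CLASS, input_dir, output_dir]
    return command


command = build_sort_command("hadoop-examples-1.1.0-SNAPSHOT.jar", "input", "output")
print(" ".join(command))
```

The resulting list can be passed to subprocess.run when the script runs as the hdfs user on the NameNode host; printing it first makes it easy to compare against the command shown above.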
Option II: To enable GzipCodec as the default compression:
Edit the core-site.xml file on the NameNode host machine:
<property>
  <name>io.compression.codecs</name>
  <value>org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.BZip2Codec</value>
  <description>A list of the compression codec classes that can be used for compression/decompression.</description>
</property>
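After editing core-site.xml, it can help to confirm that the codec list parses as intended. The following Python sketch reads the io.compression.codecs property from a configuration document and returns the codec class names. The function name configured_codecs and the inline SAMPLE document are assumptions made for illustration; on a real node the text would be read from the actual core-site.xml file.

```python
# Sketch: read the codec list from a core-site.xml style document.
# configured_codecs and SAMPLE are illustrative, not part of Hadoop.
import xml.etree.ElementTree as ET

SAMPLE = """<configuration>
  <property>
    <name>io.compression.codecs</name>
    <value>org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.BZip2Codec</value>
  </property>
</configuration>"""


def configured_codecs(xml_text):
    """Return the codec class names listed in io.compression.codecs."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name") == "io.compression.codecs":
            value = prop.findtext("value") or ""
            # The value is a comma-separated list; drop stray whitespace.
            return [codec.strip() for codec in value.split(",") if codec.strip()]
    return []


codecs = configured_codecs(SAMPLE)
for codec in codecs:
    print(codec)
```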
Edit the mapred-site.xml file on the JobTracker host machine:
<property>
  <name>mapred.compress.map.output</name>
  <value>true</value>
</property>
<property>
  <name>mapred.map.output.compression.codec</name>
  <value>org.apache.hadoop.io.compress.GzipCodec</value>
</property>
<property>
  <name>mapred.output.compression.type</name>
  <value>BLOCK</value>
</property>
[Optional] Set the following two configuration parameters to enable job output compression.
Edit the mapred-site.xml file on the JobTracker host machine:
<property>
  <name>mapred.output.compress</name>
  <value>true</value>
</property>
<property>
  <name>mapred.output.compression.codec</name>
  <value>org.apache.hadoop.io.compress.GzipCodec</value>
</property>
Restart the cluster using instructions provided here.
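For administrators who prefer to script the mapred-site.xml changes before restarting, the following Python sketch applies the properties listed under Option II to a configuration document, updating a property if it already exists and appending it otherwise. The helper name set_property and the inline starting document are assumptions made for illustration; the property names and values are the ones given above. Writing the result back to the real file is left out.

```python
# Sketch: apply the Option II settings to a mapred-site.xml style document.
# set_property is a hypothetical helper; the starting document is a stand-in.
import xml.etree.ElementTree as ET

GZIP_CODEC = "org.apache.hadoop.io.compress.GzipCodec"

SETTINGS = {
    "mapred.compress.map.output": "true",
    "mapred.map.output.compression.codec": GZIP_CODEC,
    "mapred.output.compression.type": "BLOCK",
    # Optional: job output compression.
    "mapred.output.compress": "true",
    "mapred.output.compression.codec": GZIP_CODEC,
}


def set_property(root, name, value):
    """Update the named property in place, or append it if it is missing."""
    for prop in root.findall("property"):
        if prop.findtext("name") == name:
            value_element = prop.find("value")
            if value_element is None:
                value_element = ET.SubElement(prop, "value")
            value_element.text = value
            return
    prop = ET.SubElement(root, "property")
    ET.SubElement(prop, "name").text = name
    ET.SubElement(prop, "value").text = value


root = ET.fromstring(
    "<configuration><property><name>mapred.output.compress</name>"
    "<value>false</value></property></configuration>"
)
for name, value in SETTINGS.items():
    set_property(root, name, value)

result = {p.findtext("name"): p.findtext("value") for p in root.findall("property")}
print(ET.tostring(root, encoding="unicode"))
```

Because existing properties are updated rather than duplicated, the script can be run more than once without adding repeated entries.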