3. Configure Flume

You configure Flume with a properties file, which you specify when Flume starts. The init scripts installed by flume-agent bring up a single Flume agent on any host, using the contents of /etc/flume/conf/flume-conf.

Tip

Hadoop administrators planning to run Flume as a service must assign the name "agent" as the service name for all relevant configuration settings in flume-conf.

A template for this file, which lists the configuration properties you can adjust, is installed in the configuration directory at /etc/flume/conf/flume-conf.properties.template. A second template, for setting environment variables automatically at start-up, is installed at /etc/flume/conf/flume-env.sh.template.
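One way to start from the templates is to copy each one to its live location without overwriting a file that is already in place. The following is a minimal sketch, not part of the packaged scripts: the helper name install_from_template is hypothetical, and the target file names are assumed from the paths used in this section.

```shell
# Copy a template to its live location unless the live file already exists.
# Usage: install_from_template <template> <target>
# (install_from_template is a hypothetical helper, not shipped with Flume.)
install_from_template() {
    template="$1"
    target="$2"
    if [ ! -f "$template" ]; then
        echo "template not found: $template" >&2
        return 1
    fi
    if [ -e "$target" ]; then
        # Never clobber a configuration that has already been edited.
        echo "keeping existing file: $target" >&2
        return 0
    fi
    cp "$template" "$target"
}
```

For example, running install_from_template /etc/flume/conf/flume-env.sh.template /etc/flume/conf/flume-env.sh would create flume-env.sh only if it does not exist yet.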

Common configuration option choices include the following:

  • Set primary configuration options in /etc/flume/conf/flume-conf:

    • If you are using the HDFS sink, make sure the target folder exists in HDFS.

  • Set environment options in /etc/flume/conf/flume-env.sh:

    • To enable JMX monitoring, add the following properties to JAVA_OPTS:

      JAVA_OPTS="-Dcom.sun.management.jmxremote
      -Dcom.sun.management.jmxremote.port=4159
      -Dcom.sun.management.jmxremote.authenticate=false
      -Dcom.sun.management.jmxremote.ssl=false"

    • To enable Ganglia monitoring, add the following properties to JAVA_OPTS:

      JAVA_OPTS="-Dflume.monitoring.type=ganglia
      -Dflume.monitoring.hosts=<ganglia-server>:8660" 

      Where <ganglia-server> is the name of the Ganglia server host.

    • To set the heap size, add the following options to JAVA_OPTS:

      JAVA_OPTS="-Xms100m -Xmx200m"

  • Set the log directory for log4j in /etc/flume/conf/log4j.properties:

    flume.log.dir=/var/log/flume
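
Each JAVA_OPTS assignment shown above replaces the previous value if they are placed one after another in the same file. When more than one of these options is needed, one approach is to build the value up incrementally. The following is a sketch of a possible flume-env.sh under that assumption; GANGLIA_HOST and its value are hypothetical placeholders, and the ports are the ones used in the examples above.

```shell
# Sketch of /etc/flume/conf/flume-env.sh that combines the options above.
# GANGLIA_HOST is a hypothetical placeholder; substitute your Ganglia server.
GANGLIA_HOST="ganglia.example.com"

# Heap size: 100 MB initial, 200 MB maximum.
JAVA_OPTS="-Xms100m -Xmx200m"

# JMX monitoring on port 4159, without authentication or SSL.
JAVA_OPTS="$JAVA_OPTS -Dcom.sun.management.jmxremote"
JAVA_OPTS="$JAVA_OPTS -Dcom.sun.management.jmxremote.port=4159"
JAVA_OPTS="$JAVA_OPTS -Dcom.sun.management.jmxremote.authenticate=false"
JAVA_OPTS="$JAVA_OPTS -Dcom.sun.management.jmxremote.ssl=false"

# Ganglia reporting.
JAVA_OPTS="$JAVA_OPTS -Dflume.monitoring.type=ganglia"
JAVA_OPTS="$JAVA_OPTS -Dflume.monitoring.hosts=${GANGLIA_HOST}:8660"

export JAVA_OPTS
```

Appending with "$JAVA_OPTS ..." keeps each concern on its own line, so a single option can be commented out without disturbing the others.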
