Apache Flume Installation & Configurations on Ubuntu 16.04

1. Download Flume

2. Extract Flume tar

$ tar -xzvf apache-flume-1.7.0-bin.tar.gz

3. Move to a folder

$ sudo mv apache-flume-1.7.0-bin /opt/
$ sudo mv apache-flume-1.7.0-bin apache-flume-1.7.0

4. Update the Path

$ gedit ~/.bashrc

export FLUME_HOME=/opt/apache-flume-1.7.0
export FLUME_CONF_DIR=$FLUME_HOME/conf
export FLUME_CLASSPATH=$FLUME_CONF_DIR
export PATH=$PATH:$FLUME_HOME/bin

5. Update the Flume Environment

$ cd conf/
$ cp flume-env.sh.template flume-env.sh
$ gedit flume-env.sh

export JAVA_HOME=/usr/lib/jvm/java-8-oracle
$JAVA_OPTS=”-Xms500m -Xmx1000m -Dcom.sun.management.jmxremote”
export CLASSPATH=$CLASSPATH:/FLUME_HOME/lib/*

$ cd ..

6. Update log4j.properties

flume.root.logger=DEBUG,console
#flume.root.logger=INFO,LOGFILE

7. Reload BashRc

$ source ~/.bashrc (OR) . ~/.bashrc

8. Test Flume

$ flume-ng –help

9. Start Hadoop

$ start-all.sh
$ hadoop fs -ls hdfs://localhost:9000/nag
$ hdfs dfs -mkdir hdfs://localhost:9000/user/gudiseva/twitter_data

10. Run the following commands:

A. Twitter

$ cd $FLUME_HOME

$ bin/flume-ng agent –conf ./conf/ -f conf/twitter.conf Dflume.root.logger=DEBUG,console -n TwitterAgent
(OR)
$ ./flume-ng  agent -n TwitterAgent  -c conf -f ../conf/twitter.conf
(OR)
$ ./bin/flume-ng agent –conf $FLUME_CONF –conf-file $FLUME_CONF/twitter.conf Dflume.root.logger=DEBUG,console –name TwitterAgent

B. Sequence

$ cd $FLUME_HOME

$ bin/flume-ng agent –conf ./conf/ -f conf/seq_gen.conf Dflume.root.logger=DEBUG,console -n SeqGenAgent
(OR)
$ ./flume-ng  agent -n SeqGenAgent  -c conf -f ../conf/seq_gen.conf
(OR)
$ ./bin/flume-ng agent –conf $FLUME_CONF –conf-file $FLUME_CONF/seq_gen.conf –name SeqGenAgent

C. NetCat

$ cd $FLUME_HOME

$ ./bin/flume-ng agent –conf $FLUME_CONF –conf-file $FLUME_CONF/netcat.conf –name NetcatAgent -Dflume.root.logger=INFO,console

D. Sequence Logger

$ cd $FLUME_HOME

$ ./bin/flume-ng agent –conf $FLUME_CONF –conf-file $FLUME_CONF/seq_log.conf –name SeqLogAgent -Dflume.root.logger=INFO,console

E. Cat / Tail File Channel

$ cd $FLUME_HOME

$ ./bin/flume-ng agent –conf $FLUME_CONF –conf-file $FLUME_CONF/cat_tail.properties –name a1 -Dflume.root.logger=INFO,console

F. Spool Directory

$ cd $FLUME_HOME

$ ./bin/flume-ng agent –conf $FLUME_CONF –conf-file $FLUME_CONF/flume-conf.properties –name agent -Dflume.root.logger=INFO,console

G. Default Template

$ cd $FLUME_HOME

$ ./bin/flume-ng agent –conf $FLUME_CONF –conf-file $FLUME_CONF/flume-conf.properties.template –name agent -Dflume.root.logger=INFO,console

G. Multiple Sinks

$ cd $FLUME_HOME

$ ./bin/flume-ng agent –conf $FLUME_CONF –conf-file $FLUME_CONF/multiple_sinks.properties –name flumeAgent -Dflume.root.logger=INFO,LOGFILE

11. Netcat

A. Check if Netcat is installed or not
   $ which netcat
   $ nc -h
   B. Else, install Netcat
   $ sudo apt-get install netcat
   C. Netcat in listening (Server) mode
   $ nc -l -p 12345
   D. Netcat in Client mode
   $ nc localhost 12345
   $ curl telnet://localhost:12345
   E. Netcat as a Client to perform Port Scanning
   $ nc -v hostname port
   $ nc -v www.google.com 80
     GET / HTTP/1.1

 

Note: Sample Configurations and Properties are attached.

cat_tail-properties; flume-conf-properties; multiple_sinks-properties; netcat-conf; seq_gen-conf; seq_log-conf; twitter-conf

 

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s