Category: Mayura International

How to upload Notebook to Anaconda Cloud (CLI Option)

Step #1

Check if the Anaconda User is signed into the Anaconda Cloud from the Anaconda Navigator

C:\> anaconda whoami

This should display:

  Username: gudiseva


Step #2

If Anaconda User is not signed in, then …

C:\> anaconda login

… provide Anaconda Username and Password


Step #3

Navigate to the folder where the notebook file is saved and upload the notebook to Anaconda Cloud

C: \> anaconda upload notebook.ipynb


Continue reading

Run Scala script in Production


[hadoop@ip-xx-y-zz-12 ~]$ vim /mnt1/analytics/arvind/deploy/
export HTTP_HOST=xx.y.zz.12
exec java -classpath "/mnt1/analytics/arvind/deploy/scala-redis.jar" "com.csscorp.restapi.RestServer" "$0" "$@"
[hadoop@ip-xx-y-zz-12 deploy]$ chmod -R 777
[hadoop@ip-xx-y-zz-12 deploy]$ nohup sh > ./scalaredis.log 2>&1 &
[hadoop@ip-xx-y-zz-12 deploy]$ jps
22938 RestServer
22961 Jps


Scala java.lang.NoSuchMethodError: scala.Predef$.augmentString error

Using Typesafe in Scala

Using Typesafe library of Scala, we can override the configuration settings.  This post describes the techniques:


libraryDependencies ++= Seq.apply(
  // --- Dependencies ---,
  "com.typesafe" % "config" % "1.2.1",
  // --- Dependencies ---
  _.excludeAll(ExclusionRule(organization = "org.mortbay.jetty"))


redis {
  host = "localhost"
  host = ${?REDIS_HOST}
  port = 6379
  port = ${?REDIS_PORT}


package com.csscorp.util

import com.typesafe.config.ConfigFactory

  * Created by Nag Arvind Gudiseva on 15-Dec-16.
object Config {

  /** Loads all key / value pairs from the application configuration file. */
  private val conf =  ConfigFactory.load()

  // Redis Configurations
  object RedisConfig {
    private lazy val redisConfig = conf.getConfig("redis")

    lazy val host = redisConfig.getString("host")
    lazy val port = redisConfig.getInt("port")



package tutorials.typesafe

import com.csscorp.util.Config.RedisConfig

  * Created by Nag Arvind Gudiseva on 26-Dec-16.
object TypeSafeTest {

  def main(args: Array[String]): Unit = {

    val redisHost =
    println(s"Hostname: $redisHost")
    val redisPort = RedisConfig.port
    println(f"Port: $redisPort")



C:\ScalaRedis> sbt clean compile package
C:\ScalaRedis> scala -classpath target/scala-2.10/scalaredis_2.10-1.0.jar tutorials.typesafe.TypeSafeTest
    Hostname: localhost
    Port: 6379

C:\ScalaRedis> SET REDIS_HOST=
C:\ScalaRedis> echo %REDIS_HOST%
C:\ScalaRedis> SET REDIS_PORT=1234
C:\ScalaRedis> echo %REDIS_PORT%

C:\ScalaRedis> scala -classpath target/scala-2.10/scalaredis_2.10-1.0.jar tutorials.typesafe.TypeSafeTest
    Port: 1234

As observed, the host name and port that is set in the system environment variables gets substituted.

Production (AWS EMR)

[hadoop@ip-xx-y-zz-12 arvind]$ echo $HTTP_PORT
[hadoop@ip-xx-y-zz-12 arvind]$ scala -classpath deploy/scala-redis.jar com.csscorp.restapi.RestServer
2016/12/27 05:56:04-286 [INFO] com.csscorp.restapi.RestServer$ - Hostname: localhost
2016/12/27 05:56:04-287 [INFO] com.csscorp.restapi.RestServer$ - Port: 8080

[hadoop@ip-xx-y-zz-12 arvind]$ export HTTP_PORT=8081
[hadoop@ip-xx-y-zz-12 arvind]$ echo $HTTP_PORT
[hadoop@ip-xx-y-zz-12 arvind]$ scala -classpath deploy/scala-redis.jar com.csscorp.restapi.RestServer
2016/12/27 05:59:39-304 [INFO] com.csscorp.restapi.RestServer$ - Hostname: localhost
2016/12/27 05:59:39-305 [INFO] com.csscorp.restapi.RestServer$ - Port: 8081



Using Typesafe’s Config for Scala (and Java) for Application Configuration

Scala – Head, Tail, Init & Last

Sample Program

object SequencesTest {

  // Conceptual representation
  println("""nag arvind gudiseva scala""")
  println("""---> HEAD""")
  println("""nag arvind gudiseva scala""")
  println("""... ---------------------> TAIL""")
  println("""nag arvind gudiseva scala""")
  println("""-------------------> INIT""")
  println("""nag arvind gudiseva scala""")
  println("""................... -----> LAST""")

  def main(arg: Array[String]): Unit = {

    val str1: String = "nag arvind gudiseva scala"
    val str1Arr: Array[String] = str1.split(" ")

    println("HEAD: " + str1Arr.head)
    println("TAIL: " + str1Arr.tail.deep.mkString)
    println("INIT: " + str1Arr.init.deep.mkString)
    println("LAST: " + str1Arr.last)






HEAD: nag
TAIL: arvindgudisevascala
INIT: nagarvindgudiseva
LAST: scala



Scala sequences – head, tail, init, last


Git Commands on Windows

Install Git on Windows

1. Download the latest Git for Windows stand-alone installer

2. Use the default options from Next and Finish.

3. Open Command Prompt

4. Configure Git username and email (be associated with any commits)

    Git global setup:

    C:\> git config --global "Nag Arvind Gudiseva"
    C:\> git config --global ""

Git Command line instructions

    A. Create a new repository

    C:\> git clone
    C:\> cd analytics-sample-project
    C:\> touch
    C:\> git add
    C:\> git commit -m "add README"
    C:\> git push -u origin master

B. Existing folder or Git repository

    C:\> cd existing_folder
    C:\> git init
    C:\> git remote add origin
    C:\> git add .
    C:\> git commit
    C:\> git push -u origin master

C. In Linux

    $ git config --global http.proxy ""
    $ cd ~/gudiseva
    $ git init
    $ git status
    $ git add arvind.jpeg
    $ git commit -m “Add arvind.jpeg.”
    $ git remote add origin
    $ git remote -v
    $ git remote status
    $ git remote diff
    $ git push


Jersey Multipart File Upload Maven Project

Create Maven Project with the below archetype (Jersey Version: 2.9; Jetty Version: 9.2.18.v20160721):

mvn archetype:generate -DarchetypeGroupId=org.glassfish.jersey.archetypes -DarchetypeArtifactId=jersey-quickstart-webapp -DarchetypeVersion=2.9

Apache FLUME Installation and Configuration in Windows 10

1. Download & install Java

2. Create a Junction Link.  (Needed as the Java Path contains spaces)

    C:\Windows\system32>mklink /J "C:\Program_Files" "C:\Program Files"
    Junction created for C:\Program_Files <<===>> C:\Program Files

3. Set Path and Classpath for Java


4. Download Flume

    Download apache-flume-1.7.0-bin.tar.gz

5. Extract using 7-Zip

 Move to C:\flume\apache-flume-1.7.0-bin directory

6. Set Path and Classpath for Flume


7. Download Windows binaries for Hadoop versions

8. Copy

To C:\hadoop\hadoop-2.6.0\bin

9. Set Path and Classpath for Hadoop


10. Edit file


11. Copy flume-env.ps1.template as flume-env.ps1.

    Add below configuration:
    $JAVA_OPTS="-Xms500m -Xmx1000m"

12. Copy as

13. — Flume Working Commands —

C:\> cd %FLUME_HOME%/bin
C:\flume\apache-flume-1.7.0-bin\bin> flume-ng agent –conf %FLUME_CONF% –conf-file %FLUME_CONF%/ –name agent

14. Install HDInsight Emulator (Hadoop) on Windows 10

a. Install Microsoft Web Platform Installer 5.0
b. Search for HDInsight
c. Select Add -> Install -> I Accept -> Finish
d Format Namenode

        C:\hdp> hdfs namenode -format

e. Start Hadoop and Other Services

        C:\hdp> start_local_hdp_services

f. Verify

        c:\hdp> hdfs dfsadmin -report

g. Hadoop Sample Commands

C:\hdp\hadoop-> hdfs dfs -ls hdfs://lap-04-2312:8020/
C:\hdp\hadoop-> hdfs dfs -mkdir hdfs://lap-04-2312:8020/users
C:\hdp\hadoop-> hdfs dfs -mkdir hdfs://lap-04-2312:8020/users/hadoop
C:\hdp\hadoop-> hdfs dfs -mkdir hdfs://lap-04-2312:8020/users/hadoop/flume

f. Stop Hadoop and Other Services

        C:\hdp> stop_local_hdp_services

13. — Other Flume Commands —

C:\> cd %FLUME_HOME%/bin
C:\flume\apache-flume-1.7.0-bin\bin> flume-ng agent –conf %FLUME_CONF% –conf-file %FLUME_CONF%/ –name SeqLogAgent
C:\flume\apache-flume-1.7.0-bin\bin> flume-ng agent –conf %FLUME_CONF% –conf-file %FLUME_CONF%/ –name SeqGenAgent
C:\flume\apache-flume-1.7.0-bin\bin> flume-ng agent –conf %FLUME_CONF% –conf-file %FLUME_CONF%/ –name TwitterAgent


Note: Sample Configurations and Properties are attached.

flume-conf-properties   seq_gen-properties   seq_log-properties

Apache Flume Installation & Configurations on Ubuntu 16.04

1. Download Flume

2. Extract Flume tar

$ tar -xzvf apache-flume-1.7.0-bin.tar.gz

3. Move to a folder

$ sudo mv apache-flume-1.7.0-bin /opt/
$ sudo mv apache-flume-1.7.0-bin apache-flume-1.7.0

4. Update the Path

$ gedit ~/.bashrc

export FLUME_HOME=/opt/apache-flume-1.7.0

5. Update the Flume Environment

$ cd conf/
$ cp
$ gedit

export JAVA_HOME=/usr/lib/jvm/java-8-oracle
$JAVA_OPTS=”-Xms500m -Xmx1000m”

$ cd ..

6. Update


7. Reload BashRc

$ source ~/.bashrc (OR) . ~/.bashrc

8. Test Flume

$ flume-ng –help

9. Start Hadoop

$ hadoop fs -ls hdfs://localhost:9000/nag
$ hdfs dfs -mkdir hdfs://localhost:9000/user/gudiseva/twitter_data

10. Run the following commands:

A. Twitter


$ bin/flume-ng agent –conf ./conf/ -f conf/twitter.conf Dflume.root.logger=DEBUG,console -n TwitterAgent
$ ./flume-ng  agent -n TwitterAgent  -c conf -f ../conf/twitter.conf
$ ./bin/flume-ng agent –conf $FLUME_CONF –conf-file $FLUME_CONF/twitter.conf Dflume.root.logger=DEBUG,console –name TwitterAgent

B. Sequence


$ bin/flume-ng agent –conf ./conf/ -f conf/seq_gen.conf Dflume.root.logger=DEBUG,console -n SeqGenAgent
$ ./flume-ng  agent -n SeqGenAgent  -c conf -f ../conf/seq_gen.conf
$ ./bin/flume-ng agent –conf $FLUME_CONF –conf-file $FLUME_CONF/seq_gen.conf –name SeqGenAgent

C. NetCat


$ ./bin/flume-ng agent –conf $FLUME_CONF –conf-file $FLUME_CONF/netcat.conf –name NetcatAgent -Dflume.root.logger=INFO,console

D. Sequence Logger


$ ./bin/flume-ng agent –conf $FLUME_CONF –conf-file $FLUME_CONF/seq_log.conf –name SeqLogAgent -Dflume.root.logger=INFO,console

E. Cat / Tail File Channel


$ ./bin/flume-ng agent –conf $FLUME_CONF –conf-file $FLUME_CONF/ –name a1 -Dflume.root.logger=INFO,console

F. Spool Directory


$ ./bin/flume-ng agent –conf $FLUME_CONF –conf-file $FLUME_CONF/ –name agent -Dflume.root.logger=INFO,console

G. Default Template


$ ./bin/flume-ng agent –conf $FLUME_CONF –conf-file $FLUME_CONF/ –name agent -Dflume.root.logger=INFO,console

G. Multiple Sinks


$ ./bin/flume-ng agent –conf $FLUME_CONF –conf-file $FLUME_CONF/ –name flumeAgent -Dflume.root.logger=INFO,LOGFILE

11. Netcat

A. Check if Netcat is installed or not
   $ which netcat
   $ nc -h
   B. Else, install Netcat
   $ sudo apt-get install netcat
   C. Netcat in listening (Server) mode
   $ nc -l -p 12345
   D. Netcat in Client mode
   $ nc localhost 12345
   $ curl telnet://localhost:12345
   E. Netcat as a Client to perform Port Scanning
   $ nc -v hostname port
   $ nc -v 80
     GET / HTTP/1.1


Note: Sample Configurations and Properties are attached.

cat_tail-properties; flume-conf-properties; multiple_sinks-properties; netcat-conf; seq_gen-conf; seq_log-conf; twitter-conf