Wednesday, 13 June 2018

Apache Nifi Installation on Ubuntu

Step 1: Download the Nifi package from the Apache Nifi website and extract it into the desired directory

Apache Nifi download link: https://nifi.apache.org/download.html

There are two distributions:

  1. ends with tar.gz - for Linux
  2. ends with zip - for Windows
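
On Linux, the tar.gz distribution can also be fetched directly from the terminal. A minimal sketch, assuming the 1.6.0 release and the Apache archive mirror (the exact mirror path may differ for other versions):

wget https://archive.apache.org/dist/nifi/1.6.0/nifi-1.6.0-bin.tar.gz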

Extract the distribution:

Command: tar -xvf /home/mano/Hadoop_setup/nifi-1.6.0-bin.tar.gz

Step 2: Configuration

NiFi provides several configuration options, which can be set in the conf/nifi.properties file.

At present, I'm just changing the nifi.ui.banner.text property.
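
A minimal sketch of the edit (the banner text value below is just an illustration; 8080 is NiFi's default web port):

nano conf/nifi.properties

# text shown in the banner at the top of the NiFi UI
nifi.ui.banner.text=My Dev NiFi
# default HTTP port for the web UI
nifi.web.http.port=8080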





Step 3: Starting Apache Nifi:
In a terminal window, navigate to the Nifi directory and run one of the following commands:

  • bin/nifi.sh run - Launches the application in the foreground; exit by pressing Ctrl-C.
  • bin/nifi.sh start - Launches the application in the background.
  • bin/nifi.sh status - Checks the application status.
  • bin/nifi.sh stop - Shuts down the application.
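
For example, a typical start-and-verify sequence (the directory name assumes the 1.6.0 distribution extracted above; the log path is NiFi's default):

cd nifi-1.6.0
./bin/nifi.sh start
./bin/nifi.sh status
# follow the application log to watch startup progress
tail -f logs/nifi-app.log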




Step 4: Apache Nifi Web User Interface: 
After Apache Nifi has started, use the Web User Interface (UI) to create and monitor our dataflow.


To use Apache Nifi, open a web browser and navigate to http://localhost:8080/nifi
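
To quickly confirm from the terminal that the UI is reachable (assuming the default port 8080):

curl -I http://localhost:8080/nifi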




Friday, 1 June 2018

Steps to install Apache Spark on Ubuntu


Step 1: Download Apache Spark distribution

Use the link below to download the Spark distribution ==> http://spark.apache.org/downloads.html






Download it from the terminal using the command below:

wget http://www-eu.apache.org/dist/spark/spark-2.3.0/spark-2.3.0-bin-hadoop2.7.tgz



Step 2: Untar the Spark distribution

tar xzf spark-2.3.0-bin-hadoop2.7.tgz

 
Step 3: Set up the environment variables:

export SPARK_HOME=/usr/local/spark

Follow the steps below to set the Spark environment variables in the .bashrc file.

nano .bashrc
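
A minimal sketch of the entries, assuming the extracted distribution is moved to /usr/local/spark (the path used above):

# move the extracted distribution into place
sudo mv spark-2.3.0-bin-hadoop2.7 /usr/local/spark

# add these lines to ~/.bashrc
export SPARK_HOME=/usr/local/spark
export PATH=$PATH:$SPARK_HOME/bin

# reload the shell configuration so the variables take effect
source ~/.bashrc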


 

Step 4: Launch the Spark shell / PySpark context:

Scala API command line:
Run spark-shell to enter the Scala context.

Python API command line:

Run pyspark to enter the Python context.


R API command line:

Run sparkR to enter the R context.
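
As a quick, non-interactive smoke test from the terminal (assuming $SPARK_HOME/bin is on the PATH as set above):

# pipe a one-line job into spark-shell; it should print 5050.0 among the startup output
echo 'println(sc.parallelize(1 to 100).sum())' | spark-shell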

Step 5: Spark UI:
Enter the URL below in the browser to check Spark execution and DAG information, debug jobs, etc. The UI is available only while a Spark application or shell is running.

URL ==> http://localhost:4040
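
A quick reachability check from the terminal (run while spark-shell or an application is still up):

# an HTTP response here confirms the UI is reachable
curl -sI http://localhost:4040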
 
Done. It's a great first step toward further data processing using Apache Spark.

