Monday 14 August 2017

1)Apache Hive:

Apache Hive is data warehouse software built on top of Apache Hadoop for data analysis. It facilitates reading, writing, and managing large data sets residing in distributed storage (HDFS).

Note:

Hive provides a mechanism to impose structure for a variety of data formats on Hadoop and to query that data using a SQL-like language called HiveQL (HQL).
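
As a minimal sketch of imposing structure on existing files, the HiveQL below defines an external table over comma-delimited text files and then queries it. The table name page_views, its columns, and the HDFS path /data/page_views are hypothetical and chosen only for illustration.

```sql
-- Hypothetical table, columns, and HDFS location (assumptions for illustration).
-- EXTERNAL means Hive only records the schema; the files stay where they are.
CREATE EXTERNAL TABLE IF NOT EXISTS page_views (
  user_id   BIGINT,
  page_url  STRING,
  view_date STRING
)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/data/page_views';

-- Query the files through the table definition with ordinary SQL-like syntax.
SELECT page_url, COUNT(*) AS views
FROM page_views
GROUP BY page_url
ORDER BY views DESC
LIMIT 10;
```

The schema is applied when the data is read, so the same files could also be exposed through a different table definition without rewriting them.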


Hive originated at Facebook.

Apache Hive provides the following features:
  • Tools that enable easy access to data via a SQL interface, thus enabling data warehousing tasks such as extract/transform/load (ETL), reporting, and data analysis.
  • A mechanism to impose structure on a variety of data formats.
  • Access to files stored either directly in Apache HDFS™ or in other data storage systems such as Apache HBase™.
  • Query execution via Apache Tez™, Apache Spark™, or MapReduce (the default).
Limitations of Hive:

Hive is not designed for online transaction processing (OLTP); it is intended for online analytical processing (OLAP).

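Hive supports overwriting and appending data. Row-level updates and deletes are not available on ordinary tables; they are supported only on transactional (ACID) tables, starting with Hive 0.14.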
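
Overwriting can be sketched as follows. This assumes two hypothetical tables: a source table page_views and an existing summary table daily_page_views with the columns (page_url STRING, view_date STRING, views BIGINT).

```sql
-- Hypothetical tables (assumptions for illustration).
-- INSERT OVERWRITE replaces the entire contents of the target table
-- with the result of the query, rather than modifying individual rows.
INSERT OVERWRITE TABLE daily_page_views
SELECT page_url, view_date, COUNT(*) AS views
FROM page_views
GROUP BY page_url, view_date;
```

Recomputing and replacing a whole table (or partition) in this way is the usual substitute for row-level changes in batch-oriented Hive workloads.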

Why use Hive instead of Pig?
  • HiveQL is a declarative language like SQL, whereas Pig Latin is a data-flow language.
  • Pig: a data-flow language and environment for exploring very large datasets.
  • Hive: a distributed data warehouse.
Components of Hive:

1)HCatalog:

HCatalog is a table and storage management layer for Hadoop that enables users with different data processing tools (such as Pig and MapReduce) to more easily read and write data on the grid.

2)WebHCat:

WebHCat provides a service that you can use to run Hadoop MapReduce (or YARN), Pig, and Hive jobs, or to perform Hive metadata operations, through an HTTP (REST-style) interface.

Hive Execution engines and properties:


Hive currently supports three execution engines. The engine is selected with the hive.execution.engine property:

 1. MapReduce engine (default):
  set hive.execution.engine=mr;
 2. Tez engine:
  set hive.execution.engine=tez;
 3. Spark engine:
  set hive.execution.engine=spark;
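
A short session sketch of how the property might be used in practice. It assumes Tez is installed on the cluster and reuses a hypothetical table named page_views; neither is given by the text above.

```sql
-- Print the value currently in effect for this session.
SET hive.execution.engine;

-- Switch this session to Tez (assumes Tez is available on the cluster).
SET hive.execution.engine=tez;

-- Queries issued afterwards in the same session run on the selected engine.
-- page_views is a hypothetical table used only for illustration.
SELECT COUNT(*) FROM page_views;
```

A setting made with SET applies only to the current session; a cluster-wide default would normally be configured in hive-site.xml instead.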