Saturday 5 August 2017

4)Hadoop-HIVE tables:

Hadoop-HIVE tables:

There are 2 types of tables in Hive

  1. Managed table in HIVE
  2. External table in HIVE

1)Managed table in HIVE:
Description:
Managed table in HIVE are like normal database table where data can be stored and queried on. On dropping these tables the data stored in them also gets deleted and data is lost forever.

Note: So while using Managed tables as one drop command can destroy the whole data.

Creating a Managed Table using Hive on the HDFS File
Syntax:

CREATE [TEMPORARY] [EXTERNAL] TABLE [IF NOT EXISTS] [db_name.] table_name[(col_name data_type [COMMENT col_comment], ...)]
[COMMENT table_comment]
[ROW FORMAT row_format]
[STORED AS file_format];

Example:

HIVE>CREATE TABLE IF NOT EXISTS STUDENT_INTER(ID INT,SNAME STRING,CITY STRING) row format delimited fields terminated by '|' STORED AS TEXTFILE;

ROW FORMAT should have delimiters used to terminate the fields and lines like in the example the fields are terminated with comma (“|”). 

The table get's created on the default location of Hive table(/user/hive/warehouse)

use LOCATION, to overwrite the default location of Hive table, 

Example:

CREATE TABLE IF NOT EXISTS STUDENT_INTER1(ID INT,SNAME STRING,CITY STRING) row format delimited fields terminated by '|' STORED AS TEXTFILE LOCATION '/Hive_test';

Load the Data in Hive Table:

Data can be loaded in 2 ways in Hive Table:

  1. From local File to Hive or 
  2. From HDFS to Hive.


1)From local File to Hive:

Syntax:

LOAD DATA LOCAL INPATH 'path' INTO TABLE <Table_name>;

Example:

LOAD DATA LOCAL INPATH '/home/hadoop/Mano/Hive/student.txt' INTO TABLE STUDENT_INTER;

Output:
hive> LOAD DATA LOCAL INPATH '/home/hadoop/Mano/Hive/student.txt' INTO TABLE STUDENT_INTER;
Loading data to table test.student_inter
OK
Time taken: 1.015 seconds

2)From HDFS to Hive
Syntax:
LOAD DATA INPATH 'path' INTO TABLE <Table_name>;

Example:
LOAD DATA INPATH '/Hive_test/student.txt' INTO TABLE STUDENT_INTER1;

Output:
hive> LOAD DATA INPATH '/Hive_test/student.txt' INTO TABLE STUDENT_INTER1;

Drop table:
On dropping the table loaded from HDFS to Hive,the data gets deleted and there is no copy of data on HDFS. This means that on creating internal table the data gets moved from HDFS to Hive.

Table can be dropped using:

hive> drop table student_inter;
OK
Time taken: 0.089 seconds

2)External table in HIVE:
Description:
External table in HIVE stores files on the HDFS server but tables are not linked to the source file completely. If we make delete an external table the file still remains on the HDFS server.The file and the table link is there but read only.

Creating a External Table using Hive on the HDFS File

Syntax:

CREATE [TEMPORARY] [EXTERNAL] TABLE [IF NOT EXISTS] [db_name.] table_name[(col_name data_type [COMMENT col_comment], ...)]
[COMMENT table_comment]
[ROW FORMAT row_format]
[STORED AS file_format];



Example:

CREATE TABLE IF NOT EXISTS STUDENT_EXTER(ID INT,SNAME STRING,CITY STRING) row format delimited fields terminated by '|' STORED AS TEXTFILE LOCATION '/hiveext';

ROW FORMAT should have delimiters used to terminate the fields and lines like in the example the fields are terminated with comma (“|”).

Mandatory to use LOCATION, to overwrite the default location of Hive in External table, 

Load the Data in Hive Table:
Data can be loaded in 2 ways in Hive Table:

  1. From local File to Hive or 
  2. From HDFS to Hive.

2)From local File to Hive 

Syntax:
LOAD DATA LOCAL INPATH 'path' INTO TABLE <Table_name>;

Example:
LOAD DATA LOCAL INPATH '/home/hadoop/Mano/Hive/student.txt' INTO TABLE STUDENT_EXTER;

Output:

hive> LOAD DATA LOCAL INPATH '/home/hadoop/Mano/Hive/student.txt' INTO TABLE STUDENT_EXTER;
Loading data to table test.student_inter
OK
Time taken: 1.015 seconds

2)From HDFS to Hive

Syntax:
LOAD DATA INPATH 'path' INTO TABLE <Table_name>;

Example:
LOAD DATA INPATH '/Hive_test/student.txt' INTO TABLE STUDENT_EXTER;

Output:
hive> LOAD DATA INPATH '/Hive_test/student.txt' INTO TABLE STUDENT_EXTER;

Drop table:
On dropping the external table, the data does not get deleted from HDFS. Thus it is evident that the external table are just pointers on HDFS data.

Example:
hive> select * from student_exter;



No comments:

Post a Comment

Fundamentals of Python programming

Fundamentals of Python programming: Following below are the fundamental constructs of Python programming: Python Data types Python...