Tuesday 5 September 2017

5)Sqoop Tool2: sqoop-import-all-tables

The import-all-tables tool imports a set of tables from an RDBMS to HDFS. Data from each table is stored in a separate directory in HDFS.

The following conditions must be met for the import-all-tables tool to be useful:
  • Each table must have a single-column primary key, or the --autoreset-to-one-mapper option must be used.
  • You must intend to import all columns of each table.
  • You must not intend to use a non-default splitting column, nor impose any conditions via a WHERE clause.
Syntax:
$ sqoop import-all-tables (generic-args) (import-args)
or
$ sqoop-import-all-tables (generic-args) (import-args)
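
A minimal run looks like the sketch below. It uses the same sqoop_test MySQL database from earlier in this series and assumes every table has a single-column primary key; the warehouse directory is only an example, so adjust the connection string, credentials and path for your setup.

Command:

mano@Mano:~$ sqoop-import-all-tables --connect jdbc:mysql://localhost/sqoop_test --username root --password root --warehouse-dir '/MANO/Sqoop_import_all_tables/basic'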
1)Import control arguments: 
Argument Description
--as-avrodatafile Imports data to Avro Data Files
--as-sequencefile Imports data to SequenceFiles
--as-textfile Imports data as plain text (default)
--as-parquetfile Imports data to Parquet Files
--direct Use direct import fast path
--inline-lob-limit <n> Set the maximum size for an inline LOB
-m,--num-mappers <n> Use n map tasks to import in parallel
--warehouse-dir <dir> HDFS parent for table destination
-z,--compress Enable compression
--compression-codec <c> Use Hadoop codec (default gzip)
--exclude-tables <tables> Comma separated list of tables to exclude from import process
--autoreset-to-one-mapper Import should use one mapper if a table with no primary key is encountered
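
For example, a run that combines a few of these options to write compressed Avro files with two mappers per table might look like the sketch below (the target directory is illustrative, not from the original walkthrough):

Command:

mano@Mano:~$ sqoop-import-all-tables --connect jdbc:mysql://localhost/sqoop_test --username root --password root --autoreset-to-one-mapper --as-avrodatafile -m 2 -z --warehouse-dir '/MANO/Sqoop_import_all_tables/avro'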

Exclude tables argument:

Command:

mano@Mano:~$ sqoop-import-all-tables --connect jdbc:mysql://localhost/sqoop_test --username root --password root --autoreset-to-one-mapper --exclude-tables employees --warehouse-dir '/MANO/Sqoop_import_all_tables/'

Note:
Invalid arguments for the sqoop-import-all-tables tool:
  • --table
  • --split-by
  • --columns
  • --where
  • --target-dir
All other arguments work the same as for the sqoop-import tool we covered earlier. 😊
2)Output line formatting arguments:  
Argument Description
--enclosed-by <char> Sets a required field enclosing character
--escaped-by <char> Sets the escape character
--fields-terminated-by <char> Sets the field separator character
--lines-terminated-by <char> Sets the end-of-line character
--mysql-delimiters Uses MySQL’s default delimiter set: fields: , lines: \n escaped-by: \ optionally-enclosed-by: '
--optionally-enclosed-by <char> Sets a field enclosing character
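
As a quick sketch, the command below writes tab-separated, newline-terminated text files (the delimiters and directory are chosen purely for illustration):

Command:

mano@Mano:~$ sqoop-import-all-tables --connect jdbc:mysql://localhost/sqoop_test --username root --password root --autoreset-to-one-mapper --fields-terminated-by '\t' --lines-terminated-by '\n' --warehouse-dir '/MANO/Sqoop_import_all_tables/tsv'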

3)Input parsing arguments:

Argument Description
--input-enclosed-by <char> Sets a required field encloser
--input-escaped-by <char> Sets the input escape character
--input-fields-terminated-by <char> Sets the input field separator
--input-lines-terminated-by <char> Sets the input end-of-line character
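
These mirror the output line formatting arguments: they tell the Java record class that Sqoop generates how to parse delimited records back (for example, in a later export). A sketch that writes comma-separated records and configures the generated parse() method for the same delimiter (paths and delimiters are illustrative):

Command:

mano@Mano:~$ sqoop-import-all-tables --connect jdbc:mysql://localhost/sqoop_test --username root --password root --autoreset-to-one-mapper --fields-terminated-by ',' --input-fields-terminated-by ',' --warehouse-dir '/MANO/Sqoop_import_all_tables/csv'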

4)Hive arguments: 

Argument Description
--hive-home <dir> Override $HIVE_HOME
--hive-import Import tables into Hive (Uses Hive’s default delimiters if none are set.)
--hive-overwrite Overwrite existing data in the Hive table.
--create-hive-table If set, the job will fail if the target Hive table already exists. By default this property is false.
--hive-table <table-name> Sets the table name to use when importing to Hive.
--hive-drop-import-delims Drops \n, \r, and \01 from string fields when importing to Hive.
--hive-delims-replacement Replaces \n, \r, and \01 in string fields with a user-defined string when importing to Hive.
--hive-partition-key Name of the Hive field on which the imported data is partitioned.
--hive-partition-value <v> String value that serves as the partition key for the data imported into Hive in this job.
--map-column-hive <map> Override default mapping from SQL type to Hive type for configured columns.
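
For example, to load every table (except employees, as in the earlier commands) straight into Hive rather than plain HDFS files, something like the sketch below should work; the database, user and excluded table follow the earlier examples, so adjust them for your environment:

Command:

mano@Mano:~$ sqoop-import-all-tables --connect jdbc:mysql://localhost/sqoop_test --username root --password root --autoreset-to-one-mapper --exclude-tables employees --hive-import --hive-overwrite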

5)Code generation arguments:

Argument Description
--bindir <dir> Output directory for compiled objects
--jar-file <file> Disable code generation; use specified jar
--outdir <dir> Output directory for generated code
--package-name <name> Put auto-generated classes in this package

Command: using --bindir
mano@Mano:~$ sqoop-import-all-tables --connect jdbc:mysql://localhost/sqoop_test --username root --password root --autoreset-to-one-mapper --exclude-tables employees --warehouse-dir '/MANO/Sqoop_import_all_tables/test1' --bindir '/home/mano/students'

 Command: using --outdir
mano@Mano:~$ sqoop-import-all-tables --connect jdbc:mysql://localhost/sqoop_test --username root --password root --autoreset-to-one-mapper --exclude-tables employees --warehouse-dir '/MANO/Sqoop_import_all_tables/test2' --outdir '/home/mano/students1/'
Command: using --package-name
mano@Mano:~$ sqoop-import-all-tables --connect jdbc:mysql://localhost/sqoop_test --username root --password root --autoreset-to-one-mapper --exclude-tables employees --warehouse-dir '/MANO/Sqoop_import_all_tables/test3' --package-name students.jar 
Note:
The import-all-tables tool does not support the --class-name argument. However, you can specify a package with --package-name, in which all generated classes will be placed.


Please follow the link for further details ==> Sqoop_Page6
