Difference between the --warehouse-dir and --target-dir options in Sqoop
The parameter below is typically pointed at the default Hive table location. It can be used for development, where you just want to run some tests on internal (managed) tables.
--warehouse-dir
The parameter below points to an HDFS location of your choice, over which you can define external Hive tables. This is useful in a production environment, where you want all data to land in an external directory backed by an external table.
--target-dir
--warehouse-dir
Generally you use this option when you're importing all the tables with Sqoop's import-all-tables tool. This directory can be anything, either your Hive /data/warehouse directory or some other parent directory. Each table is imported into its own subdirectory, named after the table, under this parent directory.
--target-dir
This option is used when you import a single table with the import tool. You have to specify the directory for each table, and that directory must not already exist.
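To make the two cases above concrete, here is a minimal sketch of both invocations. The JDBC URL, the user name, and the /data/movies path are placeholders I made up for illustration, and credentials are left out; /data/warehouse is the parent directory mentioned above. The `run` wrapper only prints each command unless DRY_RUN=0 is set, so you can inspect it before touching a real cluster.

```shell
#!/bin/sh
# Sketch only: connection string, user, and paths are assumed placeholders.
CONNECT="jdbc:mysql://localhost/movies_db"
DB_USER="cloudera"

# Print the command instead of running it unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# All tables: each table gets its own subdirectory under /data/warehouse.
import_all() {
  run sqoop import-all-tables --connect "$CONNECT" --username "$DB_USER" \
      --warehouse-dir /data/warehouse
}

# One table: files are written directly into /data/movies,
# which must not exist before the import starts.
import_one() {
  run sqoop import --connect "$CONNECT" --username "$DB_USER" \
      --table movies --target-dir /data/movies
}

import_all
import_one
```

Run it as is to see the two command lines, then set DRY_RUN=0 once the placeholders match your environment.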
Here is what I observed with the import tool:
--warehouse-dir: It creates a directory that acts as the database directory (sqoop_db_movies). A directory named after the table (as given in the import command) is created automatically inside it and holds the imported files.
Example: sqoop import --options-file /home/cloudera/sqoop/conn --table movies --warehouse-dir /sqoop_db_movies -m 1
Output:
/sqoop_db_movies/movies
/sqoop_db_movies/movies/_SUCCESS
/sqoop_db_movies/movies/part-m-00000
--target-dir: It creates a directory that acts as the table directory (sqoop_table_movies) and holds the imported files directly.
Example: sqoop import --options-file /home/cloudera/sqoop/conn --table movies --target-dir /sqoop_table_movies -m 1
Output:
/sqoop_table_movies/_SUCCESS
/sqoop_table_movies/part-m-00000
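The rule behind the two listings can be summarized in a few lines of shell. `sqoop_output_dir` is a hypothetical helper I wrote for illustration, not something Sqoop provides; it only mirrors the directory layout observed above for a single-table import.

```shell
#!/bin/sh
# Hypothetical helper (not part of Sqoop): prints the HDFS directory
# that ends up holding the part files of a single-table import.
#   $1 = --warehouse-dir or --target-dir
#   $2 = directory passed to that option
#   $3 = table name
sqoop_output_dir() {
  case "$1" in
    --warehouse-dir) printf '%s/%s\n' "$2" "$3" ;;  # parent directory + table name
    --target-dir)    printf '%s\n' "$2" ;;          # exactly the directory given
    *) echo "unknown option: $1" >&2; return 1 ;;
  esac
}

sqoop_output_dir --warehouse-dir /sqoop_db_movies movies
sqoop_output_dir --target-dir /sqoop_table_movies movies
```

The first call prints /sqoop_db_movies/movies and the second prints /sqoop_table_movies, matching the two outputs shown above.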