Where does Hive store its tables?

If the table is 100 GB, you should consider a Hive external table (as opposed to a "managed table"; for the difference, see this).

With an external table, the data stays on HDFS in the path that you specify (you may point at a directory of files, as long as they all have the same structure), and Hive only records a mapping to it in the metastore. A managed table, by contrast, stores the data "in Hive": the files still live on HDFS, but under Hive's own warehouse directory (by default /user/hive/warehouse), and Hive controls their lifecycle.
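To make the difference concrete, here is a minimal sketch that builds the two kinds of HiveQL CREATE statement as strings. The table names, columns, delimiter, and the /data/events directory are hypothetical, chosen only for illustration; adjust them to your own data before running the generated DDL in Hive.

```python
def create_table_ddl(table, columns, location=None, delimiter=","):
    """Build a HiveQL CREATE TABLE statement.

    With a location, the table is declared EXTERNAL and points at that
    HDFS directory. Without one, it is a managed table that Hive stores
    under its warehouse directory.
    """
    cols = ", ".join(f"{name} {hive_type}" for name, hive_type in columns)
    keyword = "EXTERNAL TABLE" if location else "TABLE"
    lines = [
        f"CREATE {keyword} {table} ({cols})",
        f"ROW FORMAT DELIMITED FIELDS TERMINATED BY '{delimiter}'",
        "STORED AS TEXTFILE",
    ]
    if location:
        lines.append(f"LOCATION '{location}'")
    return "\n".join(lines) + ";"


# Hypothetical table names, columns, and HDFS directory (assumptions).
columns = [("id", "BIGINT"), ("payload", "STRING")]
external_ddl = create_table_ddl("events_ext", columns, location="/data/events")
managed_ddl = create_table_ddl("events_managed", columns)

print(external_ddl)
print(managed_ddl)
```

The only differences in the output are the EXTERNAL keyword and the LOCATION clause; everything else about the table definition is the same.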

When you drop a managed table, Hive deletes the underlying data as well. Dropping an external table only removes the metadata in the metastore that references the data; the files themselves stay on HDFS.

Either way, the table occupies only 100 GB from the user's point of view, and you benefit from HDFS's robustness through replication of the data.
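As a rough illustration of what replication means for disk usage, the sketch below assumes the HDFS default replication factor of 3 (the dfs.replication setting); your cluster may be configured differently, and the helper name is just for this example.

```python
def raw_hdfs_usage_gb(logical_gb, replication_factor=3):
    """Raw disk consumed across the cluster for a given logical data size."""
    return logical_gb * replication_factor


logical = 100  # size of the table as the user sees it, in GB
raw = raw_hdfs_usage_gb(logical)
print(f"{logical} GB logical -> {raw} GB of raw disk across the cluster")
```

So the table looks like 100 GB to you, while the cluster holds several copies of each block behind the scenes, regardless of whether the table is managed or external.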
