hadoop hive hiveql hdfstore

When executing LOAD DATA in Hive, does it copy the data?


When loading data stored in HDFS into Hive, does the data get copied into a different format used by Hive? Or does Hive use the original files to store/select/insert/modify the data?

Context: LOAD DATA INPATH '/home/user/sample.txt' OVERWRITE INTO TABLE employee;

Does Hive always use /home/user/sample.txt to store/select/insert/modify the data, or does it create a new file occupying new space in HDFS/HBase?


Solution

  • It is explained in the documentation:

    If the keyword LOCAL is not specified, then Hive will either use the full URI of filepath, if one is specified, or will apply the following rules: [...] Hive will move the files addressed by filepath into the table (or partition)
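
    To make the quoted behaviour concrete for the statement in the question, here is a rough HiveQL sketch of how one might check where the file ends up. It assumes that employee is a managed table, that the statements are run from the Hive CLI (which accepts dfs commands), and that /home/user/sample.txt is an HDFS path. The path /tmp/sample.txt in the last statement is hypothetical and only there to contrast the LOCAL case.

    ```sql
    -- Show the table's storage directory (the "Location:" field in the output).
    DESCRIBE FORMATTED employee;

    -- Without LOCAL, the path is resolved on HDFS and the file is moved
    -- into the table's directory rather than copied.
    LOAD DATA INPATH '/home/user/sample.txt' OVERWRITE INTO TABLE employee;

    -- After the load, the source path should no longer exist on HDFS,
    -- because the file now lives under the table's location.
    dfs -ls /home/user/sample.txt;

    -- With LOCAL, the file is read from the local filesystem and copied
    -- into the table's location; the local original stays in place.
    -- (/tmp/sample.txt is a hypothetical path used only for illustration.)
    LOAD DATA LOCAL INPATH '/tmp/sample.txt' INTO TABLE employee;
    ```

    In neither case is the file converted into another format: the Hive documentation describes load operations as pure copy/move operations. So for the statement in the question, no second copy of the data is created on HDFS, and the file simply stops being available at its original path.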