I am able to load data into Hive using the following command:
LOAD DATA INPATH '/xx/person/a.csv' INTO TABLE person PARTITION (age = 30);
In the above statement, age = 30 is the partition where the data is to be stored.
What if a.csv actually has an age column inside? Is there a way to get Hive to insert each line of a.csv into my person table under the right partition with one LOAD DATA statement?
LOAD DATA only supports static partitioning: "When the LOAD DATA statement operates on a partitioned table, it always operates on one partition at a time."
INSERT, on the other hand, supports dynamic partitioning: "If a partition key column is mentioned but not assigned a value, [...] the unassigned columns are filled in with the final columns of the SELECT list."
So what you can do is define a table over the source data, optionally define a view that moves the partition columns to the final positions, and finally use INSERT INTO [...] SELECT [...] to populate the partitioned table from the view.
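As a rough sketch of that approach in HiveQL: the staging table name person_staging and the column layout (name, age) are assumptions for illustration, since the question does not show the schema of person or of a.csv. It also assumes person is declared with PARTITIONED BY (age INT). Note that an external table's LOCATION is a directory, so every file under /xx/person would be read, not just a.csv.

```sql
-- Hypothetical staging table over the raw CSV directory.
-- Column names and order are assumed; adjust them to match a.csv.
CREATE EXTERNAL TABLE person_staging (
  name STRING,
  age  INT
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/xx/person';

-- Allow dynamic partitioning with no static partition value supplied.
SET hive.exec.dynamic.partition = true;
SET hive.exec.dynamic.partition.mode = nonstrict;

-- The partition column (age) is named without a value, so it is filled
-- from the last column of the SELECT list, row by row.
INSERT INTO TABLE person PARTITION (age)
SELECT name, age
FROM person_staging;
```

Because age is already the last column here, the SELECT can stand in for the optional view. If the partition column sat elsewhere in the file, the SELECT list (or a view over the staging table) would need to reorder the columns so that it comes last.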