r, hive, rjdbc

Error loading csv data into Hive table


I have a CSV file in Hadoop and a Hive table, and I want to load that CSV file into the Hive table.

I ran this statement: LOAD DATA local 'path/to/csv/file' overwrite INTO TABLE tablename;

It failed with this error:

Error in .verify.JDBC.result(r, "Unable to retrieve JDBC result set for ",  : 
Unable to retrieve JDBC result set for LOAD DATA local
'path/to/csv/file' overwrite INTO TABLE tablename 
(Error while processing statement: FAILED: 
ParseException line 1:16 missing INPATH at ''path/tp csv/file'' near '<EOF>'
)

Note: I am running this over an RJDBC connection in R.


Solution

  • I have developed a tool that generates Hive scripts from a CSV file. The examples below show how the files are generated. Tool: https://sourceforge.net/projects/csvtohive/?source=directory

    1. Select a CSV file using Browse and set the Hadoop root directory, e.g. /user/bigdataproject/

    2. The tool generates a Hadoop script covering all of the CSV files. The following is a sample of a generated script that copies each CSV file into Hadoop and then runs the matching Hive script:

      #!/bin/bash -v
      hadoop fs -put ./AllstarFull.csv /user/bigdataproject/AllstarFull.csv
      hive -f ./AllstarFull.hive

      hadoop fs -put ./Appearances.csv /user/bigdataproject/Appearances.csv
      hive -f ./Appearances.hive

      hadoop fs -put ./AwardsManagers.csv /user/bigdataproject/AwardsManagers.csv
      hive -f ./AwardsManagers.hive

    3. Sample of a generated Hive script:

      CREATE DATABASE IF NOT EXISTS lahman;
      USE lahman;
      CREATE TABLE AllstarFull (playerID string,yearID string,gameNum string,gameID string,teamID string,lgID string,GP string,startingPos string) row format delimited fields terminated by ',' stored as textfile;
      LOAD DATA INPATH '/user/bigdataproject/AllstarFull.csv' OVERWRITE INTO TABLE AllstarFull;
      SELECT * FROM AllstarFull;
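
The pattern in the generated Hadoop script (one `hadoop fs -put` followed by one `hive -f` per CSV file) can be approximated in a few lines of shell. This is only a rough sketch of the idea, not the tool's actual implementation; the function names `emit_load_commands` and `generate_loader_script` are hypothetical, and it assumes each CSV file sits in the current directory next to a `.hive` script with the same base name.

```shell
#!/bin/bash
# Sketch only: prints loader commands, it does not run hadoop or hive itself.
# Function names and arguments are illustrative assumptions.

# Print the "put + hive -f" pair for one CSV file.
emit_load_commands() {
  local csv_file="$1"    # e.g. ./AllstarFull.csv
  local hdfs_root="$2"   # e.g. /user/bigdataproject/
  local base
  base="$(basename "$csv_file" .csv)"
  # Strip one trailing slash so the joined path has a single separator.
  hdfs_root="${hdfs_root%/}"
  printf 'hadoop fs -put ./%s.csv %s/%s.csv\n' "$base" "$hdfs_root" "$base"
  printf 'hive -f ./%s.hive\n' "$base"
}

# Print a complete loader script for a list of CSV files.
generate_loader_script() {
  local hdfs_root="$1"
  shift
  printf '#!/bin/bash -v\n'
  local f
  for f in "$@"; do
    emit_load_commands "$f" "$hdfs_root"
    printf '\n'
  done
}
```

For example, `generate_loader_script /user/bigdataproject/ ./*.csv > load_all.sh` would write a script of the same shape as the sample in step 2, which can be reviewed before it is run.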

    Thanks Vijay
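
As for the statement in the question, the ParseException names the problem directly: the INPATH keyword is missing before the quoted path. Hive's syntax is LOAD DATA [LOCAL] INPATH 'filepath' [OVERWRITE] INTO TABLE tablename. Because the CSV file is already in Hadoop, LOCAL can most likely be dropped too, since LOCAL tells Hive to read from a local filesystem rather than from HDFS. The sketch below builds a well-formed statement; `build_load_statement` is a hypothetical helper, and it assumes the path and table name contain no single quotes.

```shell
#!/bin/bash
# Sketch only: builds the HiveQL text, it does not contact Hive.
# build_load_statement is an illustrative helper, not part of any tool.

build_load_statement() {
  local path="$1"            # file path, e.g. /user/bigdataproject/AllstarFull.csv
  local table="$2"           # target Hive table
  local origin="${3:-hdfs}"  # "hdfs" (default) or "local"
  local local_kw=""
  if [ "$origin" = "local" ]; then
    local_kw="LOCAL "
  fi
  # INPATH is always emitted; leaving it out causes the ParseException above.
  printf "LOAD DATA %sINPATH '%s' OVERWRITE INTO TABLE %s;\n" "$local_kw" "$path" "$table"
}

# Example: statement for a file that is already in HDFS.
build_load_statement /user/bigdataproject/AllstarFull.csv AllstarFull
```

The printed statement can be run with `hive -e` or sent over the RJDBC connection. Since LOAD DATA returns no rows, RJDBC's `dbSendUpdate` is probably a better fit than `dbGetQuery`, and some Hive JDBC setups may reject the trailing semicolon, so it may need to be removed there.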