Tags: hadoop, hive, etl, hiveql, parquet

Unable to load data into parquet file format?


I am trying to load log data into a Parquet table in Hive. The field separator is "||-||". A sample row is "b8905bfc-dc34-463e-a6ac-879e50c2e630||-||syntrans1||-||CitBook".

After staging the data, I get the expected result:

"b8905bfc-dc34-463e-a6ac-879e50c2e630 syntrans1 CitBook ".

While converting the data to Parquet format, I got this error:

Caused by: java.lang.ClassNotFoundException: Class org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe not found
        at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2185)
        at org.apache.hadoop.hive.ql.plan.PartitionDesc.getDeserializer(PartitionDesc.java:137)
        at org.apache.hadoop.hive.ql.exec.MapOperator.getConvertedOI(MapOperator.java:297)
        ... 24 more

This is what I have tried:

-- Staging table: raw log rows split on the multi-character delimiter "||-||"
create table log (a String, b String, c String)
ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe'
WITH SERDEPROPERTIES (
    "field.delim"="||-||",
    "collection.delim"="-",
    "mapkey.delim"="@"
);

-- Target table stored as Parquet
create table log_par (
    a String,
    b String,
    c String
) stored as PARQUET;

-- Copy the staged rows into the Parquet table
insert into table log_par select * from log;


Solution

  • Aman kumar,

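    The MultiDelimitSerDe class ships in the hive-contrib jar, which is not on the classpath of the tasks that run the insert. To resolve this issue, add the jar to the session before running the Hive query: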

    hive> add jar hive-contrib.jar;
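
    As a minimal sketch of the full session, assuming the tables log and log_par from the question: the jar path below is a placeholder, not a real location, so substitute the path where hive-contrib is installed on your host. The jar is registered for the current session only, so the insert has to run in that same session.

    ```sql
    -- Assumption: placeholder path; replace it with the real location of the jar.
    ADD JAR /path/to/hive-contrib.jar;

    -- Show the jars registered in the current session to confirm it was added.
    LIST JARS;

    -- Run the conversion in the same session so the tasks can load the SerDe class.
    INSERT INTO TABLE log_par SELECT * FROM log;
    ```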
    

    To add the jar permanently, do the following:

    1. On the HiveServer2 host, create a /usr/hdp//hive/auxlib directory.

    2. Copy /usr/hdp//hive/lib/hive-contrib-.jar to /usr/hdp//hive/auxlib.

    3. Restart the HiveServer2 (HS2) server.
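
    Once the jar is in auxlib and HS2 has been restarted, a new session should no longer need ADD JAR. One way to check, sketched here on the assumption that the tables log and log_par from the question already exist:

    ```sql
    -- No ADD JAR here: the SerDe is expected to be picked up from the auxlib directory.
    INSERT INTO TABLE log_par SELECT * FROM log;

    -- Confirm that rows reached the Parquet table.
    SELECT COUNT(*) FROM log_par;
    ```

    If the insert still fails with the same ClassNotFoundException, the jar is probably not being picked up from auxlib.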

    For further reference, see:

    https://community.hortonworks.com/content/supportkb/150175/errororgapachehadoophivecontribserde2multidelimits.html

    https://community.hortonworks.com/questions/79075/loading-data-to-hive-via-pig-orgapachehadoophiveco.html

    Let me know if you face any issues.