
How to convert existing text data in HDFS to Avro?


I have a table in HDFS that is stored in text format, and I now have a requirement to add a new column in the middle of the existing columns. I thought of loading the new columns into Avro, since Avro supports schema evolution, but the existing data is still in text format.


Solution

  • If you already have a Hive table over the text data, you can load it directly into an Avro table from Hive. If not, create a Hive table over the text file first and then load that into an Avro table. Something like:

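    -- table over the existing comma-delimited text data
    create table test(fields type) row format delimited fields terminated by ',' stored as textfile location 'textfilepath';
    -- Avro table with the same columns as the text table
    create table avrotbl like test stored as avrofile;
    -- copy the text rows into the Avro table
    insert into avrotbl select * from test;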
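
For a more concrete picture, the same steps can be sketched with a made-up two-column layout. Everything named here is an assumption for illustration only: the tables `text_events` and `avro_events`, the columns `id` and `name`, and the HDFS directory `/data/text_events`. The sketch also spells out the Avro table's columns instead of using `like`, and uses `stored as avro`, which needs Hive 0.14 or later.

```sql
-- Hypothetical names throughout: text_events, avro_events, id, name, /data/text_events.

-- 1. Expose the existing comma-delimited files as an external table.
--    LOCATION must point at the HDFS directory that holds the files, not at a single file.
CREATE EXTERNAL TABLE text_events (
  id   INT,
  name STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/data/text_events';

-- 2. Create an Avro-backed table with the same columns.
CREATE TABLE avro_events (
  id   INT,
  name STRING
)
STORED AS AVRO;

-- 3. Rewrite the text rows as Avro files.
INSERT INTO TABLE avro_events
SELECT id, name FROM text_events;

-- 4. Sanity check: both counts should match before the text copy is retired.
SELECT COUNT(*) FROM text_events;
SELECT COUNT(*) FROM avro_events;
```

Declaring the text table as external means dropping it later removes only the Hive metadata and leaves the original files in HDFS untouched, which is a safer default while the Avro copy is being verified.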