Tags: csv, hive, double-quotes, hive-serde

Hive SQL: how do I stop the SerDe from quoting my fields?


By default, the SerDe wraps fields in double quotes ("). How can I make it not quote my fields?

I tried:

row format serde "org.apache.hadoop.hive.serde2.OpenCSVSerde"
with serdeproperties(
"separatorChar" = ",",
"quoteChar" = "")

But I'm getting:

FAILED: SemanticException java.lang.StringIndexOutOfBoundsException: String index out of range: 0

Solution

  • You can achieve this by specifying \u0000 as the quote character. Since quoteChar expects a non-empty string, use this Unicode escape for the NUL character rather than an empty string.

    ROW FORMAT SERDE
        "org.apache.hadoop.hive.serde2.OpenCSVSerde"
    WITH SERDEPROPERTIES (
        "separatorChar" = ",",
        "quoteChar" = "\u0000")
    

    This Unicode NUL character, \u0000, is the value the CSV writer class uses for NO_QUOTE_CHARACTER: http://www.java2s.com/Code/Java/Development-Class/AverysimpleCSVwriterreleasedunderacommercialfriendlylicense.htm
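
    For context, here is a minimal sketch of how that clause might sit inside a complete table definition. The table name, column names, and location are hypothetical placeholders, not taken from the question; only the SerDe class and the two properties come from the answer above. The columns are declared as STRING because OpenCSVSerde reads every column as a string.

```sql
-- Hypothetical table, columns, and path, for illustration only.
CREATE EXTERNAL TABLE example_unquoted_csv (
    id   STRING,
    name STRING
)
ROW FORMAT SERDE
    "org.apache.hadoop.hive.serde2.OpenCSVSerde"
WITH SERDEPROPERTIES (
    "separatorChar" = ",",
    -- NUL as the quote character: fields should be written without quotes
    "quoteChar"     = "\u0000")
STORED AS TEXTFILE
LOCATION '/tmp/example_unquoted_csv';
```

    Note that with quoting disabled, a field value that itself contains the separator character can no longer be told apart from a field boundary, so this setup is best suited to data known not to contain commas.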