Search code examples
azure-synapseparallel-data-warehouse

Escaping quotes in Polybase with STRING_DELIMITER


I've got a pipe delimited flat file I want to load with Polybase. String columns are quoted with double quotes and if there are double quotes in the value, they are doubled. I believe that's a valid file format:

 SizeDescription|SizeCode
 "3.5 feet"|"3.5 ft"
 "2.5 inches"|"2.5"""

If I don't specify STRING_DELIMITER then the quotes show up in my table I'm inserting to. If I do specify STRING_DELIMITER='"' then it works properly for the first row (3.5 ft) but fails on the second row (2.5").

Am I doing something wrong? Or is this not supported?


Solution

  • The behavior is correct. If you don't specify quote (") as the STRING_DELIMITER, the parser is splitting on the pipe as you expect. If you specify the quote(") as the STRING_DELIMITER, you end up with 3 columns for the first row:

    {3.5 feet} {|} {3.5 ft}

    And 4 columns for the second row:

    {2.5 inches} {|} {2.5} {}

    Polybase will fail with the number of columns mismatch.