Search code examples
visual-studioazureazure-data-lakeu-sql

VS 2017 - USQL - Parse pipe delimited file


I'm trying to parse (convert single column with pipes to multiple string columns AND also assign variable types to each of the resulting columns) a pipe-delimited file in USQL / Azure using VS 2017.

The file has been loaded into a folder already but when I look at it, it's just a single string column with pipes visible in the data.

Column_0
fields1|field2|field3
123|456|abc

I have a couple scripts I'm trying to copy / tweak but I'm getting a bunch of errors that aren't very informative.

My script looks like this:

@input = 
    EXTRACT [Column_0] string
    FROM "adl://~/raw/file1.txt"
    USING Extractors.Tsv(skipFirstNRows:1, quoting: false, silent: true);

@res1 =
    SELECT 
                [field1] string,
                [field2] string,
                [field3] string
    FROM @input;

OUTPUT @res1   
    TO "adl://~/processed/file1.csv"    
    USING Outputters.Csv(quoting: false);

Any suggestions?


Solution

  • Use Text Extractor, and specify "|" as delimiter.

    @data = 
        EXTRACT
            field1 string,
            field2 string,
            field3 string
        FROM "/input1.txt"
        USING Extractors.Text(skipFirstNRows:1, quoting: false, silent: true, delimiter: '|');
    
    
    OUTPUT @data
        TO "/output1.csv"    
        USING Outputters.Csv(quoting: false);