
U-SQL: how to deal with schema changes?


My original script is something like this:

@input = EXTRACT A string, B string, C string, 
         year string, month string, day string, filename string
    FROM @folder + "/{year}/{month}/{day}/{filename}.csv"
    USING Extractors.Csv(skipFirstNRows : 1);

@input = SELECT A, B, C FROM @input;

OUTPUT @input
    TO @parent + "/testtest.csv"
    USING Outputters.Csv(outputHeader : true);

This works fine, but sometimes the schema (columns) of the source file may change. The columns may become A, B, C, D or A, B, E.

I know Visual Studio can generate EXTRACT scripts. Is there a way to make U-SQL (or Visual Studio) deal with this and generate the extraction script dynamically and automatically?


Solution

  • The Csv extractor does not allow schema changes. If the schema changes, you will need to change your U-SQL code.

    The solution is to write a custom extractor, or to use the flexible-schema extractor described in the post below, which handles files whose rows have different column counts.

    https://blogs.msdn.microsoft.com/mrys/2016/08/15/how-to-deal-with-files-containing-rows-with-different-column-counts-in-u-sql-introducing-a-flexible-schema-extractor/
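
    Since the script has to change whenever the schema changes, one way to automate that change is to generate the script text from the header row of the file before the job is submitted. Here is a minimal sketch in Python, not a built-in feature of U-SQL or Visual Studio. The function name `build_extract_script`, its parameters, and the choice to type every column as string are assumptions made for illustration. It also assumes that `@folder` and `@parent` are declared elsewhere in the script, as in the question.

```python
import csv
import io


def build_extract_script(header_line, source_expr, target_expr, virtual_columns=()):
    """Return U-SQL text whose EXTRACT schema matches a CSV header row.

    Hypothetical helper: every column is typed as string, and
    source_expr / target_expr are inserted verbatim as U-SQL expressions.
    """
    # Parse the header with the csv module so quoted column names are handled.
    rows = list(csv.reader(io.StringIO(header_line)))
    columns = [name.strip() for name in rows[0]] if rows else []
    if not columns or any(not name for name in columns):
        raise ValueError("header row is empty or has a blank column name")
    if any("]" in name for name in columns):
        raise ValueError("column names containing ']' are not supported here")

    # Assumption: data columns are emitted as bracket-quoted identifiers so
    # that names with spaces still work. Virtual columns (the {name} parts of
    # the path pattern) are emitted unquoted, as in the original script.
    data_cols = [f"[{name}]" for name in columns]
    all_cols = data_cols + list(virtual_columns)

    schema = ",\n            ".join(f"{col} string" for col in all_cols)
    select_list = ", ".join(data_cols)

    return (
        "@input =\n"
        f"    EXTRACT {schema}\n"
        f"    FROM {source_expr}\n"
        "    USING Extractors.Csv(skipFirstNRows : 1);\n"
        "\n"
        f"@input = SELECT {select_list} FROM @input;\n"
        "\n"
        "OUTPUT @input\n"
        f"    TO {target_expr}\n"
        "    USING Outputters.Csv(outputHeader : true);\n"
    )


if __name__ == "__main__":
    # Header row taken from a file whose schema has grown to A, B, C, D.
    print(build_extract_script(
        "A,B,C,D",
        '@folder + "/{year}/{month}/{day}/{filename}.csv"',
        '@parent + "/testtest.csv"',
        virtual_columns=("year", "month", "day", "filename"),
    ))
```

    This only helps when all files matched by one job share the same header. If files with different schemas end up in the same file set, the generated EXTRACT will still fail on the ones that do not match, and a custom or flexible extractor is the better fit.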