Search code examples
c#jsonu-sqljsonparser

U-SQL ONLY extract specific data from JSON


I have a JSON file which contains an array of objects

{
  [ 
    {id:1, a:4},
    {id:2, a:7},
    {id:3, a:5},
    {id:4, a:9}
  ]    
}

How can i write a U-SQL script to extract only the first object {id:1, a:4}

Couldn't find on the docs of the JSON Assembly if JsonExtractor actually took as an argument a JSONPath string, but it does take a string and for what a read on other posts, its a JSONPath like string

REFERENCE ASSEMBLY [Newtonsoft.Json];
REFERENCE ASSEMBLY [Microsoft.Analytics.Samples.Formats]; 

USING Microsoft.Analytics.Samples.Formats.Json;

DECLARE @InputPath string = "filepath.json"

@RawData = 
    EXTRACT id     int,
            a      int
    FROM @InputPath
    USING new JsonExtractor("$[:1]");

NOTE: This objects are not the real objects. My requirements are to get the first of the array. There is no possible where field to match


Solution

  • Found the answer.

    Turns out that JsonExtract does receive a string parameter which is a JSONPath, but when you use a file like shown, where the first element is an unnamed array, it assumes that each object is a different JSON.

    Here you can find the issue that states the problem


    To my concern, you have 2 ways out:

    Implement your own extractor that handles this case.

    or

    Rebuild your input JSON if you can, so as to name the array.

    {
      data: [ 
        {id:1, a:4},
        {id:2, a:7},
        {id:3, a:5},
        {id:4, a:9},
      ]
    }
    

    and then you can use a JsonPath to extract a specific part of the json like so

    @RawData = 
      EXTRACT id     int,
              a      int
      FROM @InputPath
      USING new JsonExtractor("$.data[0]");
    

    Getting as a result id: 1, a:4