I have a nested json that I am trying to flatten in usql. I cannot share the data, but the structure is similar to this.
{
"userlist": [user1, user1],
"objects": {
"largeobjects": [object1, object2, object3]
"smallobjects": [s_object1, s_object2]
},
"applications": [{
"application": sdq3ds5dsa
}, {
"application": dksah122j4
}, {
"application": sadsw2dq2s
}, {
"application": pro3dfdsn3
}
],
"date" : 12344232,
"timezone" : "Asia",
"id" : "sad2ddssa2",
"admin": {
"lang": "eng",
"country": "us",
}
}
I am using the custom jsonoutputter (https://github.com/Azure/usql/tree/master/Examples/DataFormats/Microsoft.Analytics.Samples.Formats) to extract from the json file and the jsontuple function to extract the values. My problem is that the function uses sql map which generates key value pairs. This works for the situations where I have a key, but it throws an error when I try to use that function to get the values from the no key array.
Any suggestion regarding how to solve this would be greatly appreciated.
EDIT This is the output I am looking after:
sad2ddssa2, object1, 12344232, "Asia", "eng", "us",
sad2ddssa2, object2, 12344232, "Asia", "eng", "us"
First option
Try to use PROSE within your u-sql. Use the PROSE's c# nuget to process data and do complex extractions. This is a very powerful AI package. See the videos and examples here: https://microsoft.github.io/prose
Second option
Create a c# function to process your json. Something like this one, adapt this sample to your custom extraction request using c# json api's:
/* Formats the array of values into a named json array. */
DECLARE @JsonArray Func<SqlArray<string>, string, string> = (data, name) =>
{
StringBuilder buffer = new StringBuilder();
buffer.Append("{\r\n\t\"" + name + "\": [\r\n");
for (int i = 0; i < data.Count(); i++)
{
if (i > 0)
{
buffer.Append(",\r\n");
}
buffer.Append("\t\"" + data[i] + "\"");
}
buffer.Append("\r\n\t]\r\n}");
return buffer.ToString();
};
/* Format the array containing groups of comma separated values into a named json array */
@Query =
SELECT
@JsonArray(SubscriptionArray, "subscriptionList") AS JsonArray
FROM @subscriptionsQuery1;
Third option
Try this approach, after adapting it to your needs:
/* For each json line create a json map (SqlMap) */
@subscriptionsQuery1 =
SELECT
JsonFunctions.JsonTuple(JsonLine) AS JsonMap
FROM @SubscriptionsExtractor AS t;
/* For each json map get the required property value */
@subscriptionsQuery1 =
SELECT DISTINCT
JsonMap["alias"] AS Subscription
FROM @subscriptionsQuery1 AS t;
/* Join the value of all rows into a single row containing an array of all values */
@subscriptionsQuery1 =
SELECT
ARRAY_AGG<string>(Subscription) AS SubscriptionArray
FROM @subscriptionsQuery1 AS t;