i'm loading a json file through kafka connector, so the connector is creating an internal stage, pipe and destination table with multiple VARIANT data type columns.
The json has a specific attributes that are created like this:
"data_1_attribute": [
{
"data_1_id": {
"data_1": "1111",
"data_2": 1
},
"data_2_id": null
},
...
],
"data_2_attribute": {
"data_3_id": 111111,
"data_4_id": "XXXXX",
"data_5_id": [
{
"data_3": 111
},
{
"data_4": 111
}
],
"data_6_id": [
{
"data_5": 222,
"data_6": "YYY"
}
],
"data_7_id": [
{
"data_7": 333,
"data_8": "ZZZ"
},
...
]
}
The main difference between data_1_attribute and data_2_attribute is that 1 is identified as a single object ("data_2_attribute": { ) and this is understood in a correct way by the kafka connector, but the first attribute ("data_1_attribute": [ ) i think is understood as an array of objects and when the VARIANT column is create, shows me the data like this
[
"{\"data_1_id\":{\"data_2\":1,\"data_1\":"1111"},\"data_2_id\":null}"
]
Where if i do a lateral flatten the result in the VALUE column is:
"{\"data_1_id\":{\"data_2\":1,\"data_1\":"1111"},\"data_2_id\":null}"
And i don't know how to access and flatten this information.
I was trying to used an ARRAY_FLATTEN but i don't understand how to use it, any thoughts? Thanks
Try using PARSE_JSON
for record_content
column, this should fix your issue.
SELECT PARSE_JSON(record_content) ....