Search code examples
snowflake-cloud-data-platformflattenvariant

Can't query semi-structured data using lateral flatten etc


I've got some data in a table, and one of the columns is a Variant which contains a ree of JSON data. I can successfully flatten arrays, and arrays within arrays to access data therein but I'm struggling with flattening key-value pairs to access the value for a given key.

I've seen the docs at https://docs.snowflake.net/manuals/user-guide/json-basics-tutorial.html mapping this onto my use case results in NULL values in the results.

My variant is show in part below - In particular it's values like MatchStatus and the key/values under Variables that I'm interested in extracting.

Thanks for any helpful suggestions.

enter image description here


Solution

  • The described JSON has a simple path-like structure with objects at various levels (and no arrays).

    Per Snowflake's semi-structured data documentation, use the dot notation to extract a value following a (flatly nested) path:

    Insert a colon : between the VARIANT column name
    and any first-level element: <column>:<level1_element>.
    
    Use dot notation to traverse a path in a JSON object:
    <column>:<level1_element>.<level2_element>.<level3_element>.
    

    An example would be (note the chained use of dots in the third and fourth lines):

    SELECT
      badminton_odds:Id as id,
      badminton_odds:PricingRequest.MatchStatus as match_status,
      badminton_odds:PricingRequest.Variables.Dispersion as var_dispersion
    FROM odds_table
    

    You do not require FLATTEN for simple, singular value extraction. Use FLATTEN when you have a need to explode some series data into multiple rows (such as in case of arrays).

    For example, if the described JSON in the question is how a single array element looks in a long array of such objects, you may use FLATTEN to first break the whole array into rows, and then apply path style extraction to retrieve the value from each row.