Search code examples
sqlsnowflake-cloud-data-platformvariantindirection

How to perform indirect references into a variant column?


I'm determining the structure/query I would need to handle indirect references into a variant column (i.e. I want to extract different fields for different rows based on the definitions of another table).

Example: I have variant data that represents different items each of which have completely different sets of properties based on a category they belong to. While these properties are named differently depending on the category, there are some common purposes that can be shared/identified.

So I have a table (properties) that defines the property name for each purpose/category combination

Properties

| Category | Purpose | Property_Name | Property_Data_Type |
|----------|---------|---------------|--------------------|
|    car   |   name  |  model        |    string          |
|    car   |  brand  |  make         |    string          |
|    car   |  price  |  invoice      |    number          |
|  phone   |   name  |  product      |    string          |
|  phone   |  brand  |  manufacturer |    string          |
|  phone   |  price  |  msrp         |    number          |
|----------|---------|---------------|--------------------|

and I have a table (items) that contains the information for each item in a variant field

Items

| Item_Id | Category | Properties                                                       |   
|---------|----------|------------------------------------------------------------------|
|     1   |   car    | {"make":"ford", "model":"focus", "invoice":18999}                |
|     2   |   car    | {"make":"audi", "model":"a5", "invoice":36487}                   |
|     3   | phone    | {"manufacturer":"apple", "product":"iphone 10", "msrp":679}      |
|     4   | phone    | {"manufacturer":"samsung", "product":"galaxy s20",  "msrp":1029} |
|---------|----------|------------------------------------------------------------------|

What I want to be able to do is to get a query (or a process to populate another table) that would extract the values from the json data based on the property purpose that is defined. So I would have a query that could give the following results

| Item_Id | Category | Name       | Brand   | Price |
|---------|----------|------------|---------|-------|
|    1    |  car     | focus      | ford    | 18999 |
|    2    |  car     | a5         | audi    | 36487 |
|    3    |  phone   | iphone 10  | apple   |   679 |
|    4    |  phone   | galaxy s20 | samsung |  1029 |
|---------|----------|------------|---------|-------|

Solution

  • I was bored so I wrote this. It's a stored procedure that will create or replace a view called ITEM_VIEW according to your description above. Note that there's a minor discrepancy in your data. The property for phone in the JSON is "model", but in the PROPERTIES table it's "manufacturer".

    Any time you change the PROPERTIES table, you'll need to run the stored procedure to rebuild the view. I put the SQL statement and clauses into templates so you can modify as required.

    As you can see in the code, the stored procedure is expecting a table named PROPERTIES and a table named ITEMS. If your actual table names are different, you can change them in the code and/or SQL templates where you find them.

    create or replace procedure CREATE_ITEM_VIEW()
    returns string
    language javascript
    as
    $$
    
    var nameClause = "";
    var brandClause = "";
    var priceClause = "";
    var category, purpose, dataType, property;
    
    var rs = GetResultSet("select * from PROPERTIES");
    
    while (rs.next()){
    
        category = rs.getColumnValue("CATEGORY");
        purpose  = rs.getColumnValue("PURPOSE");
        property = rs.getColumnValue("PROPERTY_NAME");
        dataType = rs.getColumnValue("PROPERTY_DATA_TYPE");
    
        if (purpose == 'name'){
            nameClause += GetColumn(category, property, dataType) + "\n";
        }
        if (purpose == 'brand'){
            brandClause += GetColumn(category, property, dataType) + "\n";
        }
        if (purpose == 'price'){
            priceClause += GetColumn(category, property, dataType) + "\n";
        }
    }
    
    var viewSQL = GetViewSQL(nameClause, brandClause, priceClause);
    
    return ExecuteSingleValueQuery("status", viewSQL);
    
    // ----------------------End of Main Function ---------------------------------------
    
    function GetColumn(category, name, dataType){
        var sql = "when '@~CATEGORY~@'    then PROPERTIES:@~NAME~@::@~NAME_DATA_TYPE~@";
        sql = sql.replace(/@~CATEGORY~@/g,       category);
        sql = sql.replace(/@~NAME~@/g,           name);
        sql = sql.replace(/@~NAME_DATA_TYPE~@/g, dataType);
        return sql;
    }
    
    function GetViewSQL(nameClause, brandClause, priceClause){
    
    var sql = `
    create or replace view ITEM_VIEW as
    select  ITEM_ID,
            CATEGORY, 
            case CATEGORY
    
    ${nameClause}
            end as "NAME",
            case CATEGORY
    
    ${brandClause}
            end as BRAND,
            case CATEGORY
    
    ${priceClause}
            end as PRICE
    from ITEMS;
    `;
    return sql;
    }
    
    function GetResultSet(sql){
        cmd1 = {sqlText: sql};
        stmt = snowflake.createStatement(cmd1);
        var rs;
        rs = stmt.execute();
        return rs;
    }
    
    function ExecuteSingleValueQuery(columnName, queryString) {
        var out;
        cmd1 = {sqlText: queryString};
        stmt = snowflake.createStatement(cmd1);
        var rs;
    
        rs = stmt.execute();
        rs.next();
        return rs.getColumnValue(columnName);
        return out;
    }
    
    $$;
    

    Once you create the stored procedure, run it like this and it will create your view:

    call create_item_view();
    
    select * from ITEM_VIEW;