Search code examples
jsonteradatateradatasql

teradata: JSON_TABLE to shred array of numbers


On the Teradata database (ver.17), I would like to shred a JSON object as below:

{
  "products": [{"category":"car", "name":"toyota"},
               {"category":"aircraft", "name":"boeing"},
               {"category":"fruit","name":"pear"}],
  "prices": [500, 100000, 1]
}

Here, products and prices arrays are paired; toyota's price is 500, boeing is 100000, and pear is 1. My goal is to parse this into the following form:

id  category    name   price
----------------------------
 1       car  toyota     500
 1  aircraft  boeing  100000
 1     fruit    pear       1 

My approach is to use JSON_TABLE to parse products and prices arrays separately and join afterwards.

I could shred the products part using JSON_TABLE function, but I got stuck for the prices part.

What I have so far is below:

/* create temp table for demo */
CREATE MULTISET VOLATILE TABLE test AS (
SELECT 1 AS id, NEW JSON('{
  "products":[{"category":"car","name":"toyota"},
              {"category":"aircraft","name":"boeing"},
              {"category":"fruit","name":"pear"}],
  "prices":[500,100000,1]}') AS doc
)
WITH DATA
ON COMMIT PRESERVE ROWS
;

/* working shredding for products part */
SELECT * FROM JSON_Table (
  ON (SELECT id, doc FROM test)
  USING rowexpr('$.products[*]')
        colexpr('[ {"jsonpath" : "$.category", "type" : "CHAR(20)"},
                   {"jsonpath" : "$.name", "type" : "VARCHAR(20)"},
                   {"ordinal" : true} ]')
) AS JT(id, category, name, ord)
;

/* failing for prices parts, just get NULLs */
SELECT * FROM JSON_Table (
  ON (SELECT id, doc FROM test)
  USING rowexpr('$.prices[*]')
        colexpr('[ {"jsonpath" : "$", "type" : "INTEGER"},
                   {"ordinal" : true} ]')
) AS JT(id, price, ord)
;

The results are below. The output from prices part is not something intended.

+----+------------------------------------------+--------+-----+
| id |                 category                 |  name  | ord |
+----+------------------------------------------+--------+-----+
| 1  | car                                      | toyota |  0  |
| 1  | aircraft                                 | boeing |  1  |
| 1  | fruit                                    |  pear  |  2  |
+----+------------------------------------------+--------+-----+


+----+-------+-----+
| id | price | ord |
+----+-------+-----+
| 1  |  None |  1  |
| 1  |  None |  1  |
| 1  |  None |  1  |
+----+-------+-----+

Can someone advise me on how I can shred the prices part of the JSON, preferably using JSON_TABLE?

Other approaches to achieve the same goal will also be appreciated.

I suppose UNPIVOT syntax could work, but I think it is not efficient because in reality I have much longer sequences and UNPIVOT requires I make a very wide table first (I could be wrong). But if UNPIVOT is actually a good solution for my problem, please let me know that as well.


Solution

  • I don't know if JSON_Table can be used to split a value array, but the result looks fishy: ord is always 1 instead of 0,1,2.

    JSON_Shread works as expected:

    SELECT * 
    FROM TD_JSONSHRED(
    ON
     (
       SELECT id, doc
       FROM test
     )
    USING
       ROWEXPR('prices')
       COLEXPR('')
       RETURNTYPES('INTEGER')
    ) dt
    ;
    

    But there's no ordinal returned.

    This returns the expected result:

    WITH cte AS 
     (
       SELECT id
         ,doc.prices[*] AS prices
       FROM test
     )
    SELECT id, ord, trycast(token AS INTEGER) 
    FROM TABLE
     ( STRTOK_SPLIT_TO_TABLE (cte.id, cte.prices, '[],')
       RETURNS (id integer, ord integer, token varchar(20)) 
     ) AS d
    ;