Search code examples
jsonsql-serverrecursioncommon-table-expression

SQL Server Recursive CTE to shred JSON: Types don't match between the anchor and the recursive part


I am trying to rewrite a function to an inline Table Valued Fuction. Database is SQL Server 2022. Please have a look at the following SQL Fiddle. I tried multiple rewrites with explicit CASTs but it still wouldn't work for the key from the OPENJSON function. Running the query without that column however works as expected.

Can you please give any insight why this error occurs and how to fix it?

Thanks to @siggemannen for the solution: The collation of the key has to be changed as well.Here's a SQL Fiddle to show the working solution.

Here's the example as code:

  DECLARE @json NVARCHAR(MAX) =N'{"menu": {  
  "id": "file",  
  "value": "File",  
  "popup": {  
    "menuitem": [  
      {"value": "New", "onclick": "CreateDoc()"},  
      {"value": "Open", "onclick": "OpenDoc()"},  
      {"value": "Save", "onclick": "SaveDoc()"}  
    ]  
  }  
}}  '
;
/* code adapted from Phil Factor 
https://www.red-gate.com/simple-talk/databases/sql-server/t-sql-programming-sql-server/importing-json-web-services-applications-sql-server/
*/
WITH cteRecurseJSON AS
(
    SELECT 
        CAST(1 AS INTEGER) AS Depth, 
        --CAST('$' AS NVARCHAR(4000)) AS ThePath,
        CAST(N'' AS NVARCHAR(4000)) AS ThePath,
        CAST(@json  AS NVARCHAR(MAX)) AS TheValue, 
        CAST('object' AS VARCHAR(10)) AS ValueType
    UNION ALL
    SELECT 
        r.Depth+1 AS Depth,
            CAST(v.[Key] AS NVARCHAR(4000)) AS ThePath,
        Coalesce(Value,'') AS TheValue,
        CAST(CASE Type WHEN 1 THEN 'string'
                WHEN 0 THEN 'null'
                WHEN 2 THEN 'int'
                WHEN 3 THEN 'boolean'
                WHEN 4 THEN 'array' ELSE 'object' END AS VARCHAR(10)) AS ValueType
    FROM cteRecurseJSON r
    CROSS APPLY OPENJSON(r.TheValue) v
    WHERE r.ValueType IN ('array','object')

)
SELECT *
FROM cteRecurseJSON

Solution

  • The value ThePath in the anchor part needs to be nvarchar(max) and you need to change the collation as noted by @siggemanen.

    You should also use STRING_ESCAPE for correct escaping of paths. And you can simplify the CASE by just checking for an array.

    WITH cteRecurseJSON AS
    (
        SELECT 
            CAST(1 AS INTEGER) AS Depth, 
            CAST(N'$' AS NVARCHAR(max))  COLLATE database_default AS ThePath,
            CAST(jsonValue  AS NVARCHAR(MAX)) AS TheValue, 
            CAST('object' AS VARCHAR(10)) AS ValueType
        FROM test
        UNION ALL
        SELECT 
            r.Depth+1 AS Depth,
            CAST(
              r.ThePath +
              CASE WHEN r.ValueType = 'array' THEN '[' + v.[key] + ']'
                   WHEN STRING_ESCAPE(v.[key], 'json') <> v.[key] THEN '."' + STRING_ESCAPE(v.[key], 'json') + '"' --got a space in it
                   ELSE '.' + v.[Key]
                END
              AS  NVARCHAR(MAX)) AS ThePath,
            Coalesce(Value,'') AS TheValue,
            CAST(CASE Type
                    WHEN 1 THEN 'string'
                    WHEN 0 THEN 'null'
                    WHEN 2 THEN 'int'
                    WHEN 3 THEN 'boolean'
                    WHEN 4 THEN 'array'
                    ELSE 'object' END AS VARCHAR(10)) AS ValueType
        FROM cteRecurseJSON r
        CROSS APPLY OPENJSON(r.TheValue) v
        WHERE r.ValueType IN ('array', 'object')
    )
    SELECT *
    FROM cteRecurseJSON;
    

    db<>fiddle