Search code examples
jsonsql-servert-sqlsql-server-2019

Remove values from JSON array based on values in another table


I need to remove some values in a JSON array if they match with values from another table, if match id and pax:

Here is some data

declare @T1 Table (id int ,pax int ,rooms nvarchar(512))
declare @T2 Table (id int ,pax int ,rooms nvarchar(512))

insert into @T1 (id,pax,rooms)
VALUES (1,3,'["I23","I31","I1","I5","I15","B2","I14","I4","I11","I3","I15","I5","I15","I15","I31","I31","I21","I35","B5-B3","I5-I15","I13","I6-I5-I4-I3","B6","I36-I26-I16-I6"]'),
(1,4,'["I1-I11","I21-I31","I24-I34","B3-B5","B4-B6","I25-I35","I4-I5","I6-I16","I26-I36","I15-I25","I14-I24","I14-I4","I23-I31","I15-I5","I6-I16","I35-I25","I16-I6","I26-I36","B3-B5","B1-B3","I36-I26","I23-I13","B3","I3-I4"]'),
(2,5,'["I4-I5","I1-I11","I21-I31","B4-B6","I15-I25","I6-I16","I26-I36","I31-I23","I23-I13","I11-I21","I35-I25","B5-B3","I14-I4","I24-I34","B3-B1","I5-I15","B2-B4","I24-I14","I5-I3"]')

insert into @T2 (id,pax,rooms)
VALUES (1,3,'["I24-I34-I25-I35-I26-I36"]'), -- not exists
(1,3,'["I36-I26-I16-I6"]'),
(1,3,'["I6-I5-I4-I3"]'),
(1,3,'["I6-I16"]'), --not exists
(1,3,'["B6"]'),
(1,3,'["I13"]'),
(1,4,'["B1-B3"]'),
(1,4,'["B3"]'),
(1,4,'["I3-I4"]'),
(2,5,'["I5-I3"]')

So I need to remove from @T1 with id = 1 and pax=3 the values "I24-I34-I25-I35-I26-I36","I36-I26-I16-I6","I6-I5-I4-I3","I6-I16","B6","I13" (if exists)

Therefore the expected value will be

["I23","I31","I1","I5","I15","B2","I14","I4","I11","I3","I15","I5","I15","I15","I31","I31","I21","I35","B5-B3","I5-I15"]

I tried to use the few JSON functions available in SQL Server but clearly my skills are not enough to find the right solution

Here is my attempt:

SELECT  t1.rooms,
        FilteredArray = JSON_QUERY(CONCAT('[', STRING_AGG(QUOTENAME(oj.Value,'"'), ','), ']'))
FROM    @T1 AS t1
        CROSS APPLY 
        (   SELECT  oj.Value
            FROM    OPENJSON(t1.rooms) AS oj
            join @T2 as t2 on t2.id=t1.id and t2.pax=t1.pax
            where t1.rooms<>t2.rooms
        ) AS oj
        group by t1.rooms

But instead of removing values it concatenate 3 times the @T1 value Can someone suggest the path to follow to achieve the target?


Solution

  • Using Yitzhak's data, perhaps something like:

    SELECT id, pax
        , FilteredArray = JSON_QUERY(CONCAT('[', STRING_AGG(QUOTENAME(oj.Value,'"'), ',') WITHIN GROUP(ORDER BY CAST([key] AS INT)), ']'))
    FROM @T1 AS t1
    CROSS APPLY  OPENJSON(t1.rooms) AS oj
    WHERE NOT EXISTS(
        SELECT  1
        FROM    @T2 AS t2
        WHERE   TRIM('["]' FROM t2.rooms) = oj.Value
        AND t2.id = t1.id
        AND t2.pax = t1.pax
    )
    GROUP BY id, pax;
    

    Edit, duplicate removal:

    
    SELECT  id, pax
    ,   FilteredArray = JSON_QUERY(CONCAT('[', STRING_AGG(QUOTENAME(Value,'"'), ',') WITHIN GROUP(ORDER BY [key]), ']'))
    FROM    (
        SELECT  id, pax, value, MIN(CAST([key] AS int)) AS [key]
        FROM    @T1 AS t1
        CROSS APPLY  OPENJSON(t1.rooms) AS oj
        WHERE   NOT EXISTS(
            SELECT  1
            FROM    @T2 AS t2
            WHERE   TRIM('["]' FROM t2.rooms) = oj.Value
            AND t2.id = t1.id
            AND t2.pax = t1.pax
        )
        GROUP BY id, pax, value
        ) x
    GROUP BY id, pax;
    

    Outputs (with dups):

    id pax FilteredArray
    1 3 ["I23","I31","I1","I5","I15","B2","I14","I4","I11","I3","I15","I5","I15","I15","I31","I31","I21","I35","B5-B3","I5-I15"]
    1 4 ["I1-I11","I21-I31","I24-I34","B3-B5","B4-B6","I25-I35","I4-I5","I6-I16","I26-I36","I15-I25","I14-I24","I14-I4","I23-I31","I15-I5","I6-I16","I35-I25","I16-I6","I26-I36","B3-B5","I36-I26","I23-I13"]
    2 5 ["I4-I5","I1-I11","I21-I31","B4-B6","I15-I25","I6-I16","I26-I36","I31-I23","I23-I13","I11-I21","I35-I25","B5-B3","I14-I4","I24-I34","B3-B1","I5-I15","B2-B4","I24-I14"]

    Without dupes:

    id pax FilteredArray
    1 3 ["I23","I31","I1","I5","I15","B2","I14","I4","I11","I3","I21","I35","B5-B3","I5-I15"]
    1 4 ["I1-I11","I21-I31","I24-I34","B3-B5","B4-B6","I25-I35","I4-I5","I6-I16","I26-I36","I15-I25","I14-I24","I14-I4","I23-I31","I15-I5","I35-I25","I16-I6","I36-I26","I23-I13"]
    2 5 ["I4-I5","I1-I11","I21-I31","B4-B6","I15-I25","I6-I16","I26-I36","I31-I23","I23-I13","I11-I21","I35-I25","B5-B3","I14-I4","I24-I34","B3-B1","I5-I15","B2-B4","I24-I14"]

    EXISTS allows more flexibility to handle things that should or should not exist. I also added the ORDER BY for the STRING_AGG to get correct sequence