Based on this questions and the perfect answer given there, I now need to use this approach to UPDATE a table column.
Table1 (data table)
textWithFoundItemInIt | updated_foundItem
-------------------------------------------
hallo Adam | nothing in here yet
Bert says hello | ....
Bert says byebye |
Want to find "Caesar"bdjehg |
Want to find "Caesar"bdjehg |
Want to find "Caesar"again |
Table2 (mapping table)
mappingItem
------------
Adam
Bert
Caesar
Bert
Caesar
Adam
Expected result UPDATED
textWithFoundItemInIt | updated_foundItem
--------------------------------------------
hallo Adam | Adam
Bert says hello | Bert
Bert says byebye | Bert
Want to find "Caesar"bdjehg | Caesar
Want to find "Caesar"bdjehg | Caesar
Want to find "Caesar"again | Caesar
Want to find "Caesar"again and also Bert | Caesar [or Bert]
Want to find "CaesarCaesar"again and again | Caesar
My query: Based on this answer, I came up with this Update-Statement:
UPDATE Table1
SET updated_foundItem = foundItem
FROM
(select DISTINCT textWithFoundItemInIt,
regexp_extract(textWithFoundItemInIt, r'(?i)' || mappingItems) AS foundItem
from table1, (select string_agg(mappingItem, '|') mappingItems from table2))
WHERE ... ???
But I have no clue how the correct WHERE-statement should look like. Unfortunately I cannot test many queries in bigQuery, as my dataset is very large and even making a sample for testing is currently out of scope.
Thanks a lot for your help.
I will provide the answer to the SELECT problem. A translation to the UPDATE statement can be applied directly from the answer.
Solution 1: One record per match.
If there is more than one match, the results will have more than one row matching field in table one with all matched fields in table two. One per row.
WITH Table_One AS (
SELECT 'hallo Adam Caesar' AS textWithFoundItemInIt UNION ALL
SELECT 'Jim' UNION ALL
SELECT 'Bert says hello' UNION ALL
SELECT 'Bert says byebye' UNION ALL
SELECT 'Want to find "Caesar"bdjehg' UNION ALL
SELECT 'Want to find "Caesar"bdjehg' UNION ALL
SELECT 'Want to find "Caesar"again'
),
Table_Two AS (
SELECT 'Adam' AS mappingItem UNION ALL
SELECT 'Bert' UNION ALL
SELECT 'Caesar'
)
SELECT
Table_One.textWithFoundItemInIt,
Table_Two.mappingItem
FROM Table_One
INNER JOIN Table_Two ON
Table_One.textWithFoundItemInIt Like CONCAT('%', Table_Two.mappingItem, '%');
Solution 2: Array struct for all matches.
So, one single row per value in table one and an array with all matches found in table two.
WITH Table_One AS (
SELECT 'hallo Adam Caesar' AS textWithFoundItemInIt UNION ALL
SELECT 'Jim' UNION ALL
SELECT 'Bert says hello' UNION ALL
SELECT 'Bert says byebye' UNION ALL
SELECT 'Want to find "Caesar"bdjehg' UNION ALL
SELECT 'Want to find "Caesar"bdjehg' UNION ALL
SELECT 'Want to find "Caesar"again'
),
Table_Two AS (
SELECT 'Adam' AS mappingItem UNION ALL
SELECT 'Bert' UNION ALL
SELECT 'Caesar'
)
SELECT
Table_One.textWithFoundItemInIt,
ARRAY_AGG(Table_Two.mappingItem) AS mappingItem FROM Table_One
INNER JOIN Table_Two ON
Table_One.textWithFoundItemInIt Like CONCAT('%', Table_Two.mappingItem, '%')
GROUP BY Table_One.textWithFoundItemInIt;
Since you don't specify how the output should look like (several rows or one single row per key in table one), you need to choose the best solution for your use case and rewrite the select to perform the update.