SQL - Check how many times substrings within a column appear in within another column of strings

For my problem, I'm trying to find the category with the highest count of words within various sentences:

Table 1:

Word	Category
'Rain'	A
'He'	A
'France'	C
...	...
'Work'	B

Table 2:

ID	Sentence
1	'I'd like to go to France sometime.'
2	'He won't come in a rain like this.'
3	'They agreed to work together.'
...	...

Ideally I would like to create a table like this:

Result:

ID	Sentence	Category with highest word count
1	'I'd like to go to France sometime.'	C
2	'He won't come in a rain like this.'	A
3	'They agreed to work together.'	B
...	...	...

My plan was to create columns for each word and then aggregate them by their categories to find the category with the highest word count for each ID, but the code I am planning to use requires a lot of manual input. Is there a better way to do this?

select ID, 
Sentence
regexp_count(Assignee, 'Rain') + regexp_count(Assignee, 'He') + ... as CategoryA_Count
... as CategoryB_count
... as CategoryC_count
from Table 2```

Solution

This could work:

WITH
-- your input ...
words(word,category) AS (
          SELECT 'Rain'  ,'A'
UNION ALL SELECT 'He'    ,'A'
UNION ALL SELECT 'France','C'
UNION ALL SELECT 'Work'  ,'B'
)
,
-- your other input ...
sentences(id,sentence) AS (
          SELECT 1,'I''d like to go to France sometime.'
UNION ALL SELECT 2,'He won''t come in a rain like this.'
UNION ALL SELECT 3,'They agreed to work together.'
)
-- here goes the query ...
-- Each time a word out of "words" is found in "sentences", that
-- counts once. 
SELECT
  id  
, sentence
, category
FROM words
-- "Found" can be expressed as a join of "words" and
-- "sentences" that succeeds if that word is found in a case in-
-- sensitive match ( REGEXP_ILIKE() ) as a word match: "\b" means:
-- word boundary, hence we match with '\b'||word||'\b', so, for 
-- example: '\bHe\b', and not just 'He'. REGEXP_ILIKE() returns a 
-- Boolean: TRUE if a match is found.
JOIN sentences ON REGEXP_ILIKE(sentence,'\b'||word||'\b')
GROUP BY
  id  
, sentence
, category
-- we count the rows produced by the complex join
-- by id, sentence and category. 
-- we use Vertica's analytic limit clause that returns just the
-- first LIMIT <n> rows per partition in the PARTITION BY clause
-- to get the highest COUNT(*), we ORDER BY that value DESC(ending)
LIMIT 1 OVER(PARTITION BY sentence ORDER BY COUNT(*) DESC)
-- out  id |              sentence              | category 
-- out ----+------------------------------------+----------
-- out   1 | I'd like to go to France sometime. | C
-- out   2 | He won't come in a rain like this. | A
-- out   3 | They agreed to work together.      | B