Search code examples
sqlstringfindpositionteradata

Identify the last occurance of a subsring in a string, where the substring is from a table. Teradata


I have the following problem:

I need to identify the last occurrence of any sub-string given in table A, and return that given value in return in the select statement of another statement. This is a bit convoluted, but here is the code:

SELECT TRIM(COUNTRY_CODE)
FROM (
    SELECT TOP 1 POSITION( PHRASE IN MY_STRING) AS PHRASE_LOCATION, CODE 
    FROM REFERENCE_TABLE -- Where the country list is located
    WHERE PHRASE_LOCATION > 0 -- To return NULL if there is no matches
    ORDER BY 1 DESC -- To get the last one 
    ) t1

This works when run by it self, but i have large problems getting it to work as part of another queries' select. I need "MY_STRING" to come from a higher level in the nested select three. The reasons for this is how the system is designed on a higher level.

In other words i need the following:

  • PHRASE is coming from a table that have a phrases and a code associated
  • MY_STRING is used in the higher level select and i need to associate a code with it, based on the last occurring phrase
  • Number of different phrases > 400 so no hard coding :(
  • Number of different "MY_STRING" > 1 000 000 / day

So far i tried what you can see above, but due to the constraints of the system, i cannot be to creative.

Example Phrases: "New York", "London", "Oslo"
Example Codes: "US", "UK, "NO"
Example Strings: "London House, Something street, New York"; "Some street x, 0120, OSL0". 
Desired Outcomes: "US"; "NO"

Solution

  • This will result in a product join, i.e. use a lot of CPU:

    SELECT MY_STRING
      -- using INSTR searching the last occurance instead of POSITION if the same PHRASE might occur multiple times
      -- INSTR is case sensitive -> must use LOWER 
      ,Instr(Lower(MY_STRING), Lower(PHRASE), -1, 1) AS PHRASE_LOCATION
      ,CODE 
      ,PHRASE
    FROM table_with_MY_STRING
    LEFT JOIN REFERENCE_TABLE -- to return NULL if no match
    ON PHRASE_LOCATION > 0
    QUALIFY
       Row_Number() -- return last match
       Over (PARTITION BY MY_STRING
             ORDER BY PHRASE_LOCATION DESC) = 1
    

    If this is not efficient enough another possible solution might utilize STRTOK_SPLIT_TO_TABLE/REGEXP_SPLIT_TO_TABLE: split the address into parts and then join those parts to PHRASE.