Search code examples
sqlregexdatabaseteradatasqlperformance

Teradata SQL : Query a column for pattern that has spaces ( a.k.a a where clause to search for a pattern that has spaces, in a string )


I can do this

where Col  like '%Mystring%String2%' 

In Teradata SQL or I can use the ? for single char matches. How do I search for content pattern that is like this using Terdata SQL Regex
String<one of more instance of spaces or non alpha chars>string2 e.g. IS NOT NULL OR
IS NOT NULL There are 1 or more instances of spaces or some other Non alpha character in between 2 or more strings E.g. consider this string which is part of a sqltext in a PDCR database

sel t1.c1, t2.c2, t3.c3 from 
t1 , t2 ,t3
where t2.Cx is NULL and t3.cy IS NOT NULL and 
t3.Ca is         NULL          AND 
t3.cb is NULL AND          t3.c7 is        NOT NULL 
and t3.c10 not like  any ('%searchstring%','%string%','%||||%')

Note the varying amt of spaces between NOT and NULL and IS and NULL

So I wanted to form where clauses that'd check for various non alpha conditions like ( this is more pseudo-code like. sorry abt that )

 where Sqltext like '%NOT<1 or more instances of Spaces only>NULL%' 
    or SQLtext like  '%,\%<one or more instances of | character%' escape '\'

That was what I was looking for. REGEXP_SIMILAR seems promising. Trying to see how it can get long with a where clause


Solution

  • Use a regex \s to look for whitespace (space, tab, CRLF):

    WHERE REGEXP_SIMILAR(col,'.*?IS[\s|\|]+NOT.*?','i') = 1
    
    .*?                    -- any characters
       IS                  -- 1st string
         [\s|\|]+          -- one or more whitespaces and/or | 
                 NOT       -- 2nd string
                    .*?'   -- any characters
    

    Or REGEXP_INSTR, which is a bit shorter as you don't need the "any characters" at begin and end:

    WHERE REGEXP_INSTR(x,'IS[\s|\|]+NOT',1,1,0,'i') > 0