Tags: sql, regex, hive, impala, rlike

How to search for specific whole words within a string via SQL, compatible with both Hive and Impala


I need to search a varchar column for specific whole words. I'm using the query below, but it doesn't return the desired results:

    select *
    from table1 c
    WHERE upper(c.name) RLIKE ('FECHADO|CIERRE|CLOSED|REVISTO. NORMAL.')

My problem is guaranteeing that, for example, the word 'CLOSED' matches 'Case Closed' but not 'Case Disclosed'. The query above can't restrict matches to whole words. Can anyone help me find the best way to achieve this in both Hive and Impala?

My best regards


Solution

  • You can add the word boundary \\b on both sides of each word to match whole words only:

    rlike '(?i)\\bFECHADO\\b|\\bCIERRE\\b|\\bCLOSED\\b'
    

    (?i) makes the match case-insensitive, so there is no need to use UPPER.

    The last alternative in your regex pattern is REVISTO. NORMAL.

    If the dots in it should match literal dots, use \\.

    Like this: REVISTO\\. NORMAL\\.

    An unescaped dot in a regexp matches any character, so it must be escaped with two backslashes to match a literal dot.

    The regex above works in Hive. Unfortunately, I have no Impala instance to test it on.
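
Putting the pieces together, a minimal sketch of the full query might look like the one below. It assumes the table is aliased as c (the alias is implied by c.name in the question) and that the dots in REVISTO. NORMAL. are meant literally. The column aliases matches_closed and matches_disclosed are illustrative names only. Impala's regex engine (RE2) is also expected to accept \b and (?i), but this has not been verified here, so test it on Impala before relying on it.

```sql
-- Quick check on literal strings (no table needed).
-- In Hive, the first expression is expected to be true and the second false.
SELECT 'Case Closed'    RLIKE '(?i)\\bCLOSED\\b' AS matches_closed,
       'Case Disclosed' RLIKE '(?i)\\bCLOSED\\b' AS matches_disclosed;

-- Full filter, assuming table1 is aliased as c and name is the varchar column.
-- There is no trailing \\b after the last dot: "." is not a word character,
-- so a boundary there would only match when a letter, digit or underscore
-- follows the dot.
SELECT c.*
FROM table1 c
WHERE c.name RLIKE '(?i)\\b(FECHADO|CIERRE|CLOSED)\\b|\\bREVISTO\\. NORMAL\\.';
```

Grouping the single words as (FECHADO|CIERRE|CLOSED) between one pair of \\b is equivalent to wrapping each word separately and keeps the pattern shorter as the word list grows.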