Search code examples
oracleregexp-substr

regex parsing address to not include apt or suite at end


I have numerous addresses that I need to match against each other (mother/children) some addresses have apt, suite etc etc at the end of the line-Been trying to find ways to get entire line w/o that last part-

with addy as (select '22 W JAMESTOWN ST APT 22' as addy from dual union 
              select '22 W JAMESTOWN ST 22' as addy from dual)
select addy.*,regexp_substr(addy,'(\d*)(\D*)(\s)',1,1,'i') as no_ from addy;

Final result should be: 22 W JAMESTOWN ST This is in oracle- The look ahead function does not seem to work- '/.+?(?=APT)/' has no effect

The first line works, the 2nd does not. Any input appreciated- TIA Lawrence


Solution

  • Instead of REGEX_SUBSTR, you can try REGEX_REPLACE to remove the extra information from the end of the address.

    Query

    WITH
        addy
        AS
            (SELECT '22 W JAMESTOWN ST APT 22' AS addy FROM DUAL
             UNION
             SELECT '22 W JAMESTOWN ST 22' AS addy FROM DUAL)
    SELECT addy.*, REGEXP_REPLACE (addy, '\s?(APT)?\s?\d+$') AS no_
      FROM addy;
    

    Result

                           ADDY                  NO_
    ___________________________ ____________________
    22 W JAMESTOWN ST 22        22 W JAMESTOWN ST
    22 W JAMESTOWN ST APT 22    22 W JAMESTOWN ST