Search code examples
sqloracleregexp-substr

Oracle SQL Select Split "City, State" to two columns (w/ some instances of just state and no comma) and Trim whitespaces


I've seen this solution for data that always has commas but can't seem to get this to work when there are no commas.

Data is coming from a "location" field that looks like this... (where some data is "city, state" and other rows are "state" with no city listed).

[location_str]
 Boston, MA
 Miami, FL
 MO
 AK
Fairbanks, AK

I'm trying to get them to two columns [City] and [State].

This code almost works...

SELECT  REGEXP_SUBSTR (location_str, '[^,]+', 1, 1)    AS City,
        REGEXP_SUBSTR (location_str, '[^,]+', 1, 2)    AS State
FROM    Locations_Table

The result is...

[City]      [State]
 Boston      MA
 Miami       FL
 MO          NULL
 AK          NULL
 Fairbanks   AK

It is close I just need the rows with STATE only and no comma to wind up in the [State] column and for the NULL to wind up in the [City] column.

The last little bit is how to clean up the location_str data which contains trailing and leading spaces in the database (generated somewhere else).

[location_str]
" Fairbanks, AK"
"Fairbanks, AK      "
"DC"
"DC      "
"DC   "

I can try this tomorrow but can I wrap the REGEXP_SUBSTR in a TRIM func? would the following work on this...

TRIM(REGEXP_SUBSTR (location_str, '[^,]+', 1, 1)) AS City

Solution

  • You don't even need regular expressions for your requirement; standard string functions, like instr and substr, are faster, and they work perfectly in this situation.

    The WITH clause is not part of the query; rather, I include it only to generate test data. Remove it, and run the query (starting at select location_str, ...) on your actual table.

    As you suspected, TRIM(...) can be used to trim spaces from both sides of each resulting substring. Please note: this will not handle spaces within a name, for example 'San Francisco' or 'North Carolina'. If needed, these can be handled too, with a bit more work.

    with
      test_data (location_str) as (
        select 'Boston, MA'          from dual union all
        select '   Miami ,  FL  '    from dual union all
        select 'MO'                  from dual union all
        select '     AK'             from dual union all
        select 'Fairbanks   , AK   ' from dual
      )
    select location_str,
           trim(substr(location_str, 1, instr(location_str, ',') - 1)) as city,
           trim(substr(location_str, instr(location_str, ',') + 1))    as state
    from   test_data
    ;
    
    LOCATION_STR        CITY                STATE              
    ------------------- ------------------- -------------------
    Boston, MA          Boston              MA                 
       Miami ,  FL      Miami               FL                 
    MO                                      MO                 
         AK                                 AK                 
    Fairbanks   , AK    Fairbanks           AK