I need to InitCap a set of data, however I'd like to keep a few selected words capitalized. I used REGEXP_REPLACE but ran into a problem. In my example below, it seems like Oracle applies functions to the replacement string BEFORE it resolves backreferences. This seems like a poor decision in my opinion. As a result, (see column 2), <x> and <y> are capitalized, but the backreferenced string is not! The workaround in column 3 does not work either. Another solution is breaking it into bits as a subquery, then combining them with the UPPER function in a parent query, which can be messy and ugly. Does anyone have a better/straightforward solution?
with testdata as (
select 'MI ROOFING LLC' bizname from dual
union all select 'LAKESHORE LLC NILES MI' from dual
union all select 'MIDLAND WILLCOX' from dual
)
select bizname,
regexp_replace(initcap(bizname),'((^|\W)(Mi|Llc)($|\W))',upper('<x>\1<y>')) bizname1,
regexp_replace(initcap(bizname),'((^|\W)(Mi|Llc)($|\W))',
upper(regexp_substr(initcap(bizname),'(^|\W)(Mi|Llc)($|\W)'))) bizname1_workaround
from testdata;
BIZNAME BIZNAME1 BIZNAME1_WORKAROUND
---------------------- ---------------------------------- -----------------------
MI ROOFING LLC <X>Mi <Y>Roofing<X> Llc<Y> MI RoofingMI
LAKESHORE LLC NILES MI Lakeshore<X> Llc <Y>Niles<X> Mi<Y> Lakeshore LLC Niles LLC
MIDLAND WILLCOX Midland Willcox Midland Willcox
3 rows selected.
I came up with this solution. I created an inline PLSQL function to do the task. It can be made into a regular function if needed. I used the Grave Accent [`] character (ASCII 96) as a delimiter to split-up bits into a list. I subsequently converted the list into separate rows with regexp_substr. Finally I recombined with ListAgg applying InitCap or Upper function based on the resulting bit. The delimiter character can be anything. I also limited the bits to 100 but it can be higher. Also for the exception list the delimiter should be vertical line.
with function InitCapWithAllCapsExceptionList (p varchar2, allCapsList varchar2) return varchar2 as
retval varchar2(8000);
begin
select listagg(case when regexp_like(rslt,'(^|\W)('||allCapsList||')($|\W)','i')
then upper(rslt) else rslt end ,'') within group (order by lvl) into retval
from (
select level lvl, regexp_substr(regexp_replace(initcap(p),'((^|\W)('||allCapsList||')($|\W))','`\1`',1,0,'i'),'[^`]+',1,level) rslt
from dual connect by level<100)
where rslt is not null;
return retval;
end;
testdata as (
select 'MI ROOFING LLC' bizname from dual
union all select 'LAKESHORE LLC NILES MI' from dual
union all select 'MIDLAND WILLCOX' from dual
)
select InitCapWithAllCapsExceptionList(bizname, 'MI|Llc') bizname from testdata;
BIZNAME
------------------------
MI Roofing LLC
Lakeshore LLC Niles MI
Midland Willcox
3 rows selected.