Search code examples
oraclefunctionplsqlsubstrnls-lang

Oracle NLS Settings Special Characters Replacement Handling


I found a weird situation in Oracle

Situation Faced

  • I used substr to split the character one by one and if the byte of length >= 2 then replace it with spacing.

  • It work successfully in my session but it not work in the Oracle Job. I found that it would possibility that replace more characters like Aöasd It will return with value A sd

For my further testing, I think this is because the NLS Settings

  • Database's NLS Settings = 'German.Germany.AL32UTF8'
  • User's Session's NLS Settings = 'AMERICAN.AMERICA.AL32UTF8'

In German.Germany.AL32UTF8 the substr function unable to grab the correct character especially after the German Special Character. However it is work in User's Session

Besides that, I also had try to use regex_replace but it is also not work for the Oracle Job it unable to replace it.

string:= REGEXP_REPLACE(ps_string, '[äöüßÄÖÜ]', ' ');

Instead of changing on the NLS Settings is there have any other solution for this?


Solution

  • Your problem is different character set in different database.

    In my database with NLS Settings = 'AMERICAN.AMERICA.AL32UTF8'

    WITH CTE AS (SELECT 'Aöasd' STR FROM DUAL)
    SELECT SUBSTR(STR,LEVEL,1) AS NEW_ST, 
           LENGTHB(SUBSTR(STR,LEVEL,1)) AS NEW_ST_LB 
      FROM CTE
    CONNECT BY LEVEL <= LENGTH(STR);
    
    N  NEW_ST_LB
    - ----------
    A          1
    ö          2
    a          1
    s          1
    d          1
    

    Now, I tried the same query in my other DB with German character set and result is 1 for all characters. (WE8ISO8859P15)

    N  NEW_ST_LB
    - ----------
    A          1
    ö          1 --<--- See this
    a          1
    s          1
    d          1
    

    The thing is in WE8ISO8859P15 character set, It takes only 1 byte to store it rather then 2 Bytes.

    Solution:

    Use the CONVERT function to convert each character to UTF8 and then check the length of the character in terms of byte and replace it if it is greater than 1.

    Query:

    WITH CTE AS (SELECT 'Aöasd' STR FROM DUAL)
    SELECT CONVERT(SUBSTR(STR,LEVEL,1), 'UTF8' ) AS NEW_ST, 
           LENGTHB( CONVERT(SUBSTR(STR,LEVEL,1), 'UTF8' )) AS NEW_ST_LB 
      FROM CTE
    CONNECT BY LEVEL <= LENGTH(STR);
    

    Result in DB with NLS Settings = 'AMERICAN.AMERICA.AL32UTF8'

    NEW_ST                NEW_ST_LB
    -------------------- ----------
    A                             1
    ö                             2
    a                             1
    s                             1
    d                             1
    

    Result in DB with NLS Settings = 'German.Germany.AL32UTF8'

    NEW_S  NEW_ST_LB
    ----- ----------
    A              1
    ö             2
    a              1
    s              1
    d              1