Search code examples
sqloracle-databaseoracle11gnls-sort

Change NLS_SORT for a single query


In a large set of queries, fired in succession through a script, a single query uses REGEXP_LIKE. In it I do not want the regexp ([a-z]) to match diacritics (á, õ, ì). The only way I found to do this, is setting NLS_SORT to BINARY. However, this query should not affect any other query run afterwards.

For this reason, I want to set the variable NLS_SORT only for this query. However, I would rather not resort to PL/SQL. Is there any way to achieve this?

I would prefer this, but according to the docs I found, no such parameter exists:

SELECT * FROM dual WHERE REGEXP_LIKE('ë', '[a-z]', BINARY);

I can imagine something like this:

SELECT * FROM dual WHERE REGEXP_LIKE('ë', '[a-z]'); -- match
ALTER SESSION SET NLS_SORT = BINARY;
SELECT * FROM dual WHERE REGEXP_LIKE('ë', '[a-z]'); -- no match
ALTER SESSION SET NLS_SORT = DEFAULT; -- not working
ALTER SESSION RESET NLS_SORT;         -- not working
  • Is there any other way to have the regular expression exclude diacritics than to set NLS_SORT?
  • Is there any way to restore the NLS_SORT settings without PL/SQL?

Solution

  • So, thanks to Alex Poole, I reevaluated the match_paramater option in REGEXP_LIKE. From the documentation:

    match_parameter is a text literal that lets you change the default matching behavior of the function. You can specify one or more of the following values for match_parameter:

    • ...
    • 'c' specifies case-sensitive matching.
    • ...

    Although case-sensitive would not seem to affect whether a matches á or e matches ë, here is stated that if a sort is case-sensitive, it's also accent-sensitive:

    As of Oracle Database 10g, Oracle Database provides case-insensitive and accent-insensitive options for linguistic sorts. It provides the following types of monolingual and multilingual linguistic sorts:

    • Linguistic sorts that use information about base letters, diacritics, punctuation, and case. These are the standard monolingual and multilingual linguistic sorts that are described in "Using Linguistic Sorts".
    • Monolingual sorts that use information about base letters, diacritics, and punctuation, but not case, and multilingual sorts that use information about base letters and diacritics, but not case nor punctuation. This type of sort is called case-insensitive.
    • Monolingual sorts that use information about base letters and punctuation only and multilingual sorts that use information about base letters only. This type of sort is called accent-insensitive. (Accent is another word for diacritic.) Like case-insensitive sorts, an accent-insensitive sort does not use information about case.

    This states that all accent-insensitive sorts are case-insensitive, and therefore implicitly that a case-sensitive sort must be accent-sensitive.

    So, in conclusion:

    • Yes, there is a better way to exclude the diacritics: SELECT * FROM dual WHERE REGEXP_LIKE('ë', '[a-z]', 'c');
    • No, there is no way to change NLS_SORT for a single query and revert it, without knowing or saving the initial value.