Search code examples
sqlfirebirdicufirebird2.5

Firebird unicode SQL search behaves different on LIKE and = on some characters


Running Firebird Classic v2.5.9 on Gentoo Linux and using system wide ICU library (version 72) I came across strange search results between the = operator and the LIKE operator search regarding certain unicode characters (in my case the polish l).

I simplified the use case to the following example:

select 
    case 
        when 'l' collate unicode_ci_ai = 'ł' collate unicode_ci_ai 
            then 'equal' 
            else 'differ' 
    end 
from 
    rdb$database;

-- Result: equal

select 
    case 
        when 'l' collate unicode_ci_ai like 'ł' collate unicode_ci_ai 
            then 'equal' 
            else 'differ' 
    end 
from 
    rdb$database;

-- Result: differ

Paradoxically: when I create a unique index on a column with a unicode_ci_ai collation and a unique constraint things behave different:

CREATE TABLE foo 
(
    COL1  VARCHAR(10) COLLATE UNICODE_CI_AI
);

ALTER TABLE foo ADD CONSTRAINT UNQ1_FOO UNIQUE (COL1);

INSERT INTO foo VALUES ('l');
-- Result: OK

INSERT INTO foo VALUES ('ł');
-- Result: Error
--  violation of PRIMARY or UNIQUE KEY constraint "UNQ1_FOO" on table "FOO".
--  Problematic key value is ("COL1" = 'l').

With other 'non-standard' characters (é for example) the both search strategies give the same result:

select case when 'e' collate unicode_ci_ai = 'é' collate unicode_ci_ai then 'equal' else 'differ' end from rdb$database;
-- Result: equal

select case when 'e' collate unicode_ci_ai like 'é' collate unicode_ci_ai then 'equal' else 'differ' end from rdb$database;
-- Result: equal

Is it a bug or a feature? It the latter - then why?


Solution

  • The issue seems to be tackled by the Firebird team:

    https://github.com/FirebirdSQL/firebird/issues/5044

    and already fixed in 4.0 Beta 2.