I have a weird issue with my FTS4 index in SQLite3, namely that a MATCH
query for one term returns not the exact match but another, similar one, and vice versa.
Here is an example to illustrate it better:
SELECT name FROM test_idx WHERE name MATCH 'lehmbruck-museum';
-- "Lehmbruck-Archiv"
SELECT name FROM test_idx WHERE name MATCH 'lehmbruck-archiv';
-- "Lehmbruck-Museum"
It seems to have something to do with the dash, here is a similar case that exhibits the same behavior:
SELECT name FROM test_idx WHERE name MATCH 'some-thing';
-- "some-thang"
SELECT name FROM test_idx WHERE name MATCH 'some-thang';
-- "some-thing"
Here is how this test database is built, in case somebody wants to have a go at reproducing it:
CREATE VIRTUAL TABLE test_idx USING fts4(name);
INSERT INTO test_idx (name) VALUES
('some-thing'), ('some-thang'),
('Lehmbruck-Museum'), ('Lehmbruck-Archiv');
SELECT name FROM test_idx WHERE name MATCH 'lehmbruck-museum';
What you pass to MATCH
here is a full text query expression. The -
character is a unary operator in that expression language that is a stand in for the NOT
set operation, and is certainly giving you your unexpected results. Notably - the exact opposite of what you expect! Of course, it is finding exactly what the query is instructed to find - the string lehmbruck
and NOT museum
at the end!
You'll need to escape it to get the results you want - or perhaps employ the LIKE
operator if you are looking at a single column in a table.
Some more information on this expression language can be found in section 3 of the FTS3 and FTS4 documentation on the SQLite doc site here.