When I am using the LIKE
operator in SQLite, it is very slow compared to when I use the =
instead.
It takes about 14ms with the =
operator, but when I use LIKE
, it takes about 440ms. I am testing this with DB Browser for SQLite
. Here is the query that works fast:
SELECT re.ENTRY_ID,
GROUP_CONCAT(re.READING_ELEMENT, '§') AS read_element,
GROUP_CONCAT(re.FURIGANA_BOTTOM, '§') AS furigana_bottom,
GROUP_CONCAT(re.FURIGANA_TOP, '§') AS furigana_top,
GROUP_CONCAT(re.NO_KANJI, '§') AS no_kanji,
GROUP_CONCAT(re.READING_COMMONNESS, '§') AS read_commonness,
GROUP_CONCAT(re.READING_RELATION, '§') AS read_rel,
GROUP_CONCAT(se.SENSE_ID, '§') AS sense_id,
GROUP_CONCAT(se.GLOSS, '§') AS gloss,
GROUP_CONCAT(se.POS, '§') AS pos,
GROUP_CONCAT(se.FIELD, '§') AS field,
GROUP_CONCAT(se.DIALECT, '§') AS dialect,
GROUP_CONCAT(se.INFORMATION, '§') AS info
FROM Jmdict_Reading_Element AS re LEFT JOIN
Jmdict_Sense_Element AS
se ON re.ENTRY_ID = se.ENTRY_ID
WHERE re.ENTRY_ID IN (SELECT ENTRY_ID FROM Jmdict_Reading_Element WHERE READING_ELEMENT = 'example') OR
re.ENTRY_ID IN (SELECT ENTRY_ID FROM Jmdict_Sense_Element WHERE GLOSS = 'example')
GROUP BY re.ENTRY_ID
The slows down when I change
WHERE re.ENTRY_ID IN (SELECT ENTRY_ID FROM Jmdict_Reading_Element WHERE READING_ELEMENT = 'example') OR
re.ENTRY_ID IN (SELECT ENTRY_ID FROM Jmdict_Sense_Element WHERE GLOSS = 'example')
to
WHERE re.ENTRY_ID IN (SELECT ENTRY_ID FROM Jmdict_Reading_Element WHERE READING_ELEMENT LIKE 'example') OR
re.ENTRY_ID IN (SELECT ENTRY_ID FROM Jmdict_Sense_Element WHERE GLOSS LIKE 'example')
I need to do this so that I can use wildcards e.g.
WHERE re.ENTRY_ID IN (SELECT ENTRY_ID FROM Jmdict_Reading_Element WHERE READING_ELEMENT LIKE 'example%') OR
re.ENTRY_ID IN (SELECT ENTRY_ID FROM Jmdict_Sense_Element WHERE GLOSS LIKE 'example%')
Here is a link to the database itself: https://www.mediafire.com/file/hyuymc84022gzq7/dictionary.db/file
Thanks
Using DB Browser for sqlite on Win 10.
LIKE "example%"
) returns 33 rows in 1025msCreated fts4 tables like so:
create virtual table jre_fts using FTS4(entry_id,reading_element);
insert into jre_fts select entry_id, reading_element from Jmdict_Reading_Element;
create virtual table jse_fts using FTS4(entry_id,gloss);
insert into jse_fts select entry_id, gloss from Jmdict_Sense_Element;
It took 7390ms, database grew from 70,296KB to 110,708KB.
Modified the WHERE like so:
WHERE re.ENTRY_ID IN (SELECT ENTRY_ID FROM jre_fts WHERE READING_ELEMENT MATCH '^example') OR
re.ENTRY_ID IN (SELECT ENTRY_ID FROM jse_fts WHERE GLOSS MATCH '^example')
Query returned 33 rows in 60ms.
I cannot test or analyze how FTS will work on the reading_element
column, but perhapse the approach shows promise.