Search code examples
postgresqlindexingdatabase-performancesql-like

What kind of index should I create to make "WHERE col1 LIKE '0000%' AND col2 = 'somevalue'" faster?


I tried the following queries in order to search inside a quad tree using PostgreSQL's LIKE operator. In column col3, there are words like '0133002112300300320' inserted, which describes the quad tree's path.

CREATE TABLE table1
(col1 CHARACTER(9) NOT NULL,
 col2 INTEGER NOT NULL,
 col3 CHARACTER VARYING(64),
 col4 INTEGER NOT NULL,
 col5 DOUBLE PRECISION NOT NULL,
 PRIMARY KEY(col1,col2,col3));

-- Performs sequential search
SELECT col1,col2,col3,col4,col5
FROM table1
WHERE col1='somevalue' AND col2=0 AND col3 LIKE '01330021123003003%';

The problem is that the PRIMARY KEY index I set doesn't work with WHERE col1='somevalue' AND col2=0 AND col3 LIKE '01330021123003003%'. It seems that I can't use LIKE operator with AND operator at the same time if you want to use the created index.

Are there any special indices I can create to make this SELECT faster?


Solution

  • It seems that I can't use LIKE operator with AND operator at the same time if you want to use the created index.

    The index can be used in that case. Here's how with your exact table and exact query, with random well-distributed contents among 100k rows:

    insert into table1 select 
       (random()*10000)::int,
       (random()*10000)::int,
        md5(random()::text),
        0,0 
        from generate_series(1,100000);
    
    ANALYZE table1;
    
    EXPLAIN ANALYZE SELECT col1,col2,col3,col4,col5
    FROM table1
    WHERE col1='somevalue' AND col2=0 AND col3 LIKE '01330021123003003%';
    

    Result:

     Index Scan using table1_pkey on table1  (cost=0.00..8.32 rows=1 width=59) (actual time=0.022..0.022 rows=0 loops=1)
       Index Cond: ((col1 = 'somevalue'::bpchar) AND (col2 = 0))
       Filter: ((col3)::text ~~ '01330021123003003%'::text)
     Total runtime: 0.050 ms
    (4 rows)
    
    

    That Index Scan using table1_pkey shows that the index gets used for that query.

    If it doesn't with your dataset, the most plausible reason is that you're searching for values that are too common.