Search code examples
postgresqlcastingindexing

Create index on first 3 characters (area code) of phone field?


I have a Postgres table with a phone field stored as varchar(10), but we search on the area code frequently, e.g.:

select * from bus_t where bus_phone like '555%'

I wanted to create an index to facilitate with these searches, but I got an error when trying:

CREATE INDEX bus_ph_3 ON bus_t USING btree (bus_phone::varchar(3));

ERROR: 42601: syntax error at or near "::"

My first question is, how do I accomplish this, but also I am wondering if it makes sense to index on the first X characters of a field or if indexing on the entire field is just as effective.


Solution

  • Actually, a plain B-tree index is useless for pattern matching with LIKE (~~) or regexp (~), even with left-anchored patterns, if your installation doesn't run on locale "C" - which is the typical case. Here is an overview over pattern matching and indices:

    Create an index with the varchar_pattern_ops operator class (matching your varchar column) and be sure to read the chapter on operator classes in the manual.

    CREATE INDEX bus_ph_pattern_ops_idx ON bus_t (bus_phone varchar_pattern_ops);

    Your original query can use this index:

    ... WHERE bus_phone LIKE '555%'
    

    Performance of a functional index on the first 3 characters as described in the answer by @a_horse is pretty much the same in this case.

    fiddle
    Old sqlfiddle

    Generally, a functional index on relevant leading characters would be be a good idea, but your column has only 10 characters. Consider that the overhead per tuple is already 28 bytes. Saving 7 bytes is just not substantial enough to make a big difference. Add the cost for the function call and the fact that xxx_pattern_ops are generally a bit faster.
    In Postgres 9.2 or later the index on the full column can also serve as covering index in index-only scans.

    However, the more characters in the columns, the bigger the benefit from a functional index.
    You may even have to resort to a prefix index (or some other kind of hash) if strings get too long. There is a maximum length for indices. See:

    If you decide to go with the functional index, consider using the xxx_pattern_ops variant for a small additional performance benefit. Be sure to read about the pros and cons in the manual and in Peter Eisentraut's blog entry:

    CREATE INDEX bus_ph_3 ON bus_t (left(bus_phone, 3) varchar_pattern_ops);
    

    Explanation for error message

    You'd have to use the standard SQL cast syntax for functional indices. This would work - pretty much like the one with left(), but like @a_horse I'd prefer left().

    CREATE INDEX bus_ph_3 ON bus_t USING btree (cast(bus_phone AS varchar(3));