I have been doing some heavy reading the last couple days on indexing and I'm trying to figure out the right way to index a query I have with a lot of constraints. I am using the postgres_ext gem to support array datatypes and GIN and GIST index types.
I have a two queries
.where("a_id IN (?) and b = ? and active = ? and ? != ALL(c) and ? = ANY(d)")
.where("a_id =? and active =? and ? != ALL(c)")
c and d are integer arrays
The indexes I plan on adding:
add_index :deals, [:a, :b], :where => "active = true"
add_index :deals [:c, :d], :index_type => :gin, :where => "active = true"
Will postgres use both of these multicolumn indexes in the first query?
Should array datatypes always be in "gin" index types? or can you also put them in a b-tree index?
and finally will the first index be used for 'a' in both of the queries?
Additional Information:
I am using PostgreSQL 9.1.3
create_table "table", :force => true do |t|
t.integer "a_id" ##foreign key
t.string "title"
t.text "description", :default => ""
t.boolean "active", :default => true
t.datetime "created_at", :null => false
t.datetime "updated_at", :null => false
t.integer "b",
t.integer "c", :limit => 8, :array => true
t.integer "d", :array => true
end
Regarding arrays and GIN, you can have a b-tree index of arrays but it isn't useful for operations like "array contains element". You need GIN or GiST for that and only GIN is supported as a built-in index for all array types.
You can also use the intarray
extension and its GiST index type for integer arrays that will perform better under write load but worse under read load.
As for determining whether Pg will use both indexes, the best way to tell is to use EXPLAIN ANALYZE
and see. Get the statement Rails executes from the PostgreSQL log by enabling log_statement
or from the Rails logs with SQL logging on. Then run it in psql
with explain analyze
. Alternately, use the auto_explain
extension to capture performance reports on the queries as they run.
I have the feeling that you'll find that Pg can't combine a GiST or GIN and a b-tree index in the same filter. Combining indexes requires a bitmap index scan and that's IIRC only available for two b-tree indexes. You'd probably need to add the extra columns to the GiST or GIN index, but that'll increase the index size quite dramatically and may not be worth it.
You really need to use explain analyze
to see how it works in the real world on sample or production data.
When working with multicolumn indexes, keep in mind that at least for b-tree indexes Pg can use an index on (a,b)
for queries that filter on a
or on both a
and b
, but not for queries that filter only on b
. Indexes are usable left-to-right, you can't use an index to search for a value on the right side of the index unless you're also searching on all values to the left of it.