I have a table:
row_id attr_id attr_val
180000001 100 test1
180000001 101 test2
180000001 102 test3
180000001 103 test4
180000001 104 test5
180000002 100 test6
180000002 101 test7
180000002 102 test8
180000002 103 test9
180000002 104 test10
It has over 5 billion rows and the table size is around 1.4 TB. I generally run the following query:
select * from table1 where rec_id = 180000002;
and
select * from table1 where rec_id = 180000002 and va_id = 100;
What type of index should I be applying in Postgres to be most efficient given the space and my use case?
For these queries, you want the following index: (rec_id, va_id)
.
Do you really mean select *
? If you are able to reduce the number of columns in the select
clause to just a few columns, then you might want to add them to the index as well.
As an example, a query like:
select col1, col2 from table1 where rec_id = 180000002 and va_id = 100;
Would benefit index (rec_id, va_id, col1, col2)
: this makes the index covering, it the database might be able to execute it entirely by looking at the index only.