Search code examples
sqlindexingnon-clustered-index

tsql query and index question


I have a table that stores photo information with id as Primary key:

id(PK), title, album_id, posted_by, published, filename, tags, ratings, date_posted

This table will hold infor of 100+ Million photos and I need to run this query like these frequently:

1) get all photos (just id,filename,title columns) of a given album

select id, filename, title from photos where album_id = @AlbumId and published = 1

2) get all published photos of a given user but exclude photos of currently viewing album

select id, filename, title from photos where posted_by='bob' and album_id <>10 and published = 1

I want to avoid index and table scanning. I need to use seek(say 100%) as much as possible.

Can this be done? What type of index and on which columns would help me achieve this?

Thanks


Solution

  • In reality, you'll only be able to find this out yourself by measuring performance before you tweak, then tweak, and measure again and again.

    But based on your query, you should consider (or at least try this first) a non-clustered index like this:

    CREATE NONCLUSTERED INDEX IX01_Photos
      ON dbo.Photos(album_id, published, posted_by)
      INCLUDE(id, filename, title)
    

    Reasoning:

    • both your most frequent queries have WHERE clauses using album_id and published - so use these two columns first in your index
    • your second query also includes posted_by in the WHERE clause - put that into that same index as the third column
    • in order to avoid expensive bookmark lookups into the actual data table, you can include the id, filename, title columns in the index

    With all these things in place, you should be seeing mostly index seeks on that new non-clustered index to satisfy your queries. But again: lots of other factors also come into play which you probably haven't mentioned in your question and possibly not even thought about yourself - but this approach should give you a good starting point no less.