Search code examples
databasedatabase-designindexingunique-index

should nearly unique fields have indexes


I have a field in a database that is nearly unique: 98% of the time the values will be unique, but it may have a few duplicates. I won't be doing many searches on this field; say twice a month. The table currently has ~5000 records and will gain about 150 per month.

Should this field have an index?

I am using MySQL.


Solution

  • I think the 'nearly unique' is probably a red herring. The data is either unique, or it's not, but that doesn't determine whether you would want to index it for performance reasons.

    Answer:

    5000 records is really not many at all, and regardless of whether you have an index, searches will still be fast. At that rate of inserts, it'll take you 3 years to get to 10000 records, which is still also not many.
    I personally wouldn't bother with adding an index, but it wouldn't matter if you did.

    Explanation:

    What you have to think about when deciding to add an index is the trade-off between insertion speed, and selection speed.

    Without an index, doing a select on that field means MySQL has to walk over every single row and read every single field. Adding an index prevents this.

    The downside of the index is that each time data gets inserted, the DB has to update the index in addition to adding the data. This is usually a small overhead, but you'd really notice it if you had loads of indexes, and were doing a lot of writes.

    By the time you get this many rows in your database, you'd want an index anyway as otherwise your selects would take all day, but it's just something to be aware about so that you don't end up adding indexes on fields "just in case I need it"