Search code examples
mysqlperformanceintegerinnodbvarchar

How much does performance change with a VARCHAR or INT column - MySQL


I have many tables, with millions of lines, with MySQL. Those tables are used to store log lines. I have a field "country" in VARCHAR(50). There is an index on this column. Would it change the performances a lot to store a countryId in INT instead of this country field ?

Thank you !


Solution

  • Your question is a bit more complicated than it first seems. The simple answer is that Country is a string up to 50 characters. Replacing it by a 4-byte integer should reduce the storage space required for the field. Less storage means less I/O overhead in processing the query and smaller indexes. There are outlier cases of course. If country typically has a NULL value, then the current storage might be more efficient than having an id.

    It gets a little more complicated, though, when you think about keeping the field up-to-date. One difference with a reference table is that the countries are now standardized, rather than being ad-hoc names. In general, this is a good thing. On the other hand, countries do change over time, so you have to be prepared to add a "South Sudan" or "East Timor" now and then.

    If your database is heavy on inserts/updates, then changing the country field requires looking in the reference table for the correct value -- and perhaps inserting a new record there.

    My opinion is "gosh . . . it would have been a good idea to set the database up this way in the beginning". At this point, you need to understand the effects on the application of maintaining a country reference table for the small performance gain of making the data structure more efficient and more accurate.