Search code examples
sql-serverdatabasetypesvarchar

Is there a legitimate reason for using so many varchar fields? (MS SQL DB)


I'm working on data migration from an old IBM Universe-based system to a new enterprise-level data-information management system, and learning about database design in the process.

I took a look at the back-end database structure of the new system (it's a MS SQL DB, with about 100 tables), and find some things pretty odd. But I don't know if my inexperience is the reason I think that and this is just standard practice, or if these oddities are really just bad database/application design.

For example:

  • Some date fields are varchar(20)
  • Fields that store measurements are varchar(50), rather than something like a decimal and an enum to store the units of measurement
  • ISBN 10 & 13 number fields are varchar(50)
  • Some look-up ID foreign keys are varchar(100), even though the actual look-up table primary key is an int
  • Some fields are varchar(0)
  • Additional separate fields for storing month & year, each of which is varchar(250) - I don't know what kind of design decision would require a maximum of 250 characters for the year, unless they really went overkill on their Y2K compliancy, or decided to use seconds since the beginning of the universe to store datetime

And plenty of others. The DB looks to be more than half varchar fields.

I should also mention that all the varchar fields in the DB are actually n-varchar - so it's all unicode, even the fields that only store numbers.

Is there any legitimate argument that using so many varchar fields could be the best option, in some circumstances? (flexibility... maybe...?)


Solution

  • Some date fields are varchar(20)

    This one of the things that will always get you in trouble in the future, now you can have invalid dates and then you can't do normal date arithmetic.

    Some look-up ID foreign keys are varchar(100), even though the actual look-up table primary key is an int

    This is bad because you will get conversions when you join and this will make it slower

    Store decimals as decimals...sooner or later you will get bad data in there and then it will be a classic case of GIGO (Garbage In Garbage Out)

    Also using nvarchar to store numbers is insane, you just doubled the storage needed to store those numbers, this will then store less rows per page and you will need more IO to bring back the same number of rows if had used regular varchars or integers