I need to balance available disk space against the expected size of my data. What sort of storage hit occurs when a column has unused space?
Example: storing "dog" in nvarchar(10) vs. nvarchar(100). If I plan for the worst and choose nvarchar(100) instead of nvarchar(10), how much extra disk space am I wasting?
nvarchar storage size is 2 bytes per character + 2 extra bytes. The maximum length of the column doesn't matter; the storage size is determined by the actual data.
From the official documentation:
nvarchar [ ( n | max ) ] Variable-length Unicode string data. n defines the string length and can be a value from 1 through 4,000. max indicates that the maximum storage size is 2^30-1 characters. The maximum storage size in bytes is 2 GB. The actual storage size, in bytes, is two times the number of characters entered + 2 bytes. The ISO synonyms for nvarchar are national char varying and national character varying.
(emphasis mine)
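You can verify this yourself with DATALENGTH, which returns the number of bytes actually stored (the variable names below are just for illustration; any SQL Server instance will do):

-- DATALENGTH reports the bytes actually used, so the declared maximum
-- (10 vs. 100) makes no difference for the same value.
DECLARE @narrow nvarchar(10)  = N'dog';
DECLARE @wide   nvarchar(100) = N'dog';

SELECT DATALENGTH(@narrow) AS narrow_bytes, -- 6 (3 characters * 2 bytes)
       DATALENGTH(@wide)   AS wide_bytes;   -- 6 as well

(The 2 extra bytes are per-row overhead for variable-length columns, so DATALENGTH doesn't include them.)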
However, please do not consider this a recommendation to use nvarchar(max) for everything. Since max is treated differently, it has some nasty side effects, mostly performance hits.
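One concrete example of those side effects (the table and column names here are hypothetical): an nvarchar(max) column can't be used as an index key column, and values too large to fit in the row get stored off-row, which makes reading them more expensive.

-- Hypothetical table illustrating one nvarchar(max) limitation:
CREATE TABLE dbo.Notes
(
    id   int IDENTITY(1, 1) PRIMARY KEY,
    body nvarchar(max) NOT NULL
);

-- This fails: nvarchar(max) is invalid as an index key column.
-- CREATE INDEX IX_Notes_body ON dbo.Notes (body);

-- A bounded column such as nvarchar(450) could be indexed normally.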
Generally speaking, you should choose the maximum column size based on your estimated actual data size. To be on the safe side, you might simply set the maximum size to twice the expected size.
If you know you are only going to use a single language that ASCII supports, you should consider using varchar instead of nvarchar, since its storage size is half the storage size of nvarchar:
varchar [ ( n | max ) ] Variable-length, non-Unicode string data. n defines the string length and can be a value from 1 through 8,000. max indicates that the maximum storage size is 2^31-1 bytes (2 GB). The storage size is the actual length of the data entered + 2 bytes. The ISO synonyms for varchar are char varying or character varying.
(Again, emphasis mine)
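The same DATALENGTH check makes the difference visible (again just a sketch with throwaway variable names):

-- The same ASCII text needs half the bytes when stored as varchar.
DECLARE @ascii   varchar(100)  = 'dog';
DECLARE @unicode nvarchar(100) = N'dog';

SELECT DATALENGTH(@ascii)   AS varchar_bytes,  -- 3 (1 byte per character)
       DATALENGTH(@unicode) AS nvarchar_bytes; -- 6 (2 bytes per character)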