I'm storing localized strings in a single table in MS SQL Server (2008 or thereabouts). Most of the strings are short and can be represented with varchar(200), while about 10% are much longer and require something like varchar(5000). My question is: is there a performance advantage when retrieving the shorter strings if I break this into two tables, like this:
CREATE TABLE ShortTextTable(ID bigint IDENTITY(1,1) NOT NULL, TextValue nvarchar(200))
CREATE TABLE LongTextTable(ID bigint IDENTITY(1,1) NOT NULL, TextValue nvarchar(4000))
Versus:
CREATE TABLE TextTable(ID bigint IDENTITY(1,1) NOT NULL, TextValue nvarchar(4000))
This data will rarely be updated; I'm really only concerned about reads.
It depends. Could be premature optimization.
With smaller columns you will obviously fit more rows per page, but your usage patterns may mean the horizontal partition you're proposing isn't very efficient, since reads end up pulling from both new tables. I think we'd need to see the read usage pattern and how other tables join to this one.
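For example, here's a rough sketch of the read path the split implies (the view name LocalizedText is made up, and it assumes the two tables share one ID range, which separate IDENTITY columns wouldn't guarantee by default). If callers don't know in advance which table holds a given ID, every lookup ends up probing both:

CREATE VIEW LocalizedText AS
    SELECT ID, TextValue FROM ShortTextTable
    UNION ALL
    SELECT ID, TextValue FROM LongTextTable;
GO
-- A lookup through the view touches both underlying tables:
SELECT TextValue FROM LocalizedText WHERE ID = 42;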
Also, it partitions a space that is logically one space, and it will no longer be manageable as one space (e.g. every index has to be added in both places, and so on).
You'd really have to see a bottleneck and profile the proposed change before I'd partition it like this.
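If you do want to measure it, one simple sketch (the ID value is just an example) is to run your typical read against both layouts with I/O and time statistics on, and compare the logical reads:

SET STATISTICS IO ON;
SET STATISTICS TIME ON;
DECLARE @SomeID bigint = 42;
SELECT TextValue FROM TextTable WHERE ID = @SomeID;       -- single-table layout
SELECT TextValue FROM ShortTextTable WHERE ID = @SomeID;  -- split layout, short strings
SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;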
I'm not sure, but it might be possible to literally partition the table (using SQL Server's partitioned tables feature) on the length of the column. Again, whether this would help would need to be profiled.
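Something along these lines might work, though I haven't tried it; the function/scheme names are made up, and the length would have to live in a PERSISTED computed column because you can't partition on an expression directly:

CREATE PARTITION FUNCTION pf_TextLength (int) AS RANGE LEFT FOR VALUES (200);  -- <= 200 chars vs. longer
CREATE PARTITION SCHEME ps_TextLength AS PARTITION pf_TextLength ALL TO ([PRIMARY]);
CREATE TABLE TextTable
(
    ID bigint IDENTITY(1,1) NOT NULL,
    TextValue nvarchar(4000) NOT NULL,
    TextLength AS LEN(TextValue) PERSISTED,  -- partitioning column must be a persisted column
    CONSTRAINT PK_TextTable PRIMARY KEY CLUSTERED (ID, TextLength)  -- partition column must be in the clustered key
) ON ps_TextLength (TextLength);

The catch is that a lookup by ID alone can't do partition elimination (there's no predicate on TextLength), so whether this actually helps reads is exactly the kind of thing you'd have to profile.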