Search code examples
sqlsql-serverrational-numbers

How best to represent rational numbers in SQL Server?


I'm working with data that is natively supplied as rational numbers. I have a slick generic C# class which beautifully represents this data in C# and allows conversion to many other forms. Unfortunately, when I turn around and want to store this in SQL, I've got a couple solutions in mind but none of them are very satisfying.

Here is an example. I have the raw value 2/3 which my new Rational<int>(2, 3) easily handles in C#. The options I've thought of for storing this in the database are as follows:

  1. Just as a decimal/floating point, i.e. value = 0.66666667 of various precisions and exactness. Pros: this allows me to query the data, e.g. find values < 1. Cons: it has a loss of exactness and it is ugly when I go to display this simple value back in the UI.

  2. Store as two exact integer fields, e.g. numerator = 2, denominator = 3 of various precisions and exactness. Pros: This allows me to precisely represent the original value and display it in its simplest form later. Cons: I now have two fields to represent this value and querying is now complicated/less efficient as every query must perform the arithmetic, e.g. find numerator / denominator < 1.

  3. Serialize as string data, i.e. "2/3". I would be able to know the max string length and have a varchar that could hold this. Pros: I'm back to one field but with an exact representation. Cons: querying is pretty much busted and pay a serialization cost.

  4. A combination of #1 & #2. Pros: easily/efficiently query for ranges of values, and have precise values in the UI. Cons: three fields (!?!) to hold one piece of data, must keep multiple representations in sync which breaks D.R.Y.

  5. A combination of #1 & #3. Pros: easily/efficiently query for ranges of values, and have precise values in the UI. Cons: back down to two fields to hold one piece data, must keep multiple representations in sync which breaks D.R.Y., and must pay extra serialization costs.

Does anyone have another out-of-the-box solution which is better than these? Are there other things I'm not considering? Is there a relatively easy way to do this in SQL that I'm just unaware of?


Solution

  • I would probably go with Option #4, but use a calculated column for the 3rd column to avoid the sync/DRY issue (and also means you actually only store 2 columns, avoiding the "three fields" issue).

    In SQL server, calculated column is defined like so:

    CREATE TABLE dbo.Whatever(
       Numerator INT NOT NULL,
       Denominator INT NOT NULL,
       Value AS (Numerator / Denominator) PERSISTED
    )
    

    (note you may have to do some type conversion and verification that Denominator is not zero, etc).

    Also, SQL 2005 added a PERSISTED calculated column that would get rid of the calculation at query time.