Search code examples
sqldatabasetypesrelationshipssqldatatypes

SQL Database storing different types of values (in or simulated as a single field)


In a database I want to be able to assign values of a varying type to variables in a variable table. So do I need a separate value table for each value type? If so, I am not sure how you would actually link Values to the right table and thus right value. How can I achieve what I am after?

Variables
    ID
    Name

VariableValuesLink
    ID
    IDVars
    IDVals

Values
    IDvals

ValuesValueLink
    ID
    IDvals
    IDval

ValuesInt
    IDval
    IntVal

ValuesFloat
    IDval
    FloatVal

ValuesDouble
    IDval
    DoubleVal

etc...
etc...
etc...
etc...

The aim is for me to get something like this:

Variable: 
    ezas123
Values:
    1 (Int)
    2.0 (Float)
    3.0 (Double)

Variable:
    QuickFox
Values:
    The (TinyText)
    Quick (TinyText)
    Brown (TinyText)
    Fox (TinyText)
    Jumped (TinyText)
    Over (TinyText)
    The (TinyText)
    Lazy (TinyText)
    Dog (TinyText)

Variable:
    Pangrams
Values:
    The Quick Brown Fox Jumped Over The Lazy Dog (Text)
    How quickly daft jumping zebras vex (Text)

So when I query the DB I would be able to get back this set of results (where the values are of a varying type)

Variable    Value
ezas123     1
ezas123     2.0
ezas123     3.0
QuickFox    The
QuickFox    Quick
QuickFox    Brown
QuickFox    Fox
QuickFox    Jumped
QuickFox    Over
QuickFox    The
QuickFox    Lazy
QuickFox    Dog
Pangrams    The Quick Brown Fox Jumped Over The Lazy Dog
Pangrams    How quickly daft jumping zebras vex

Solution

  • A couple of points:

    • In your example, the variable ezas123 has three values with different data types, meaning that the variable itself doesn't actually have a defined data type. This will probably cause problems downstream and is likely to indicate that the data is pretty poorly defined. I'd look at including a restriction that all of the values for a given variable must have the same data type.

    • Hogan's SQL query makes the point that whenever you list the values in the way you requested (i.e. across variables with different data types) you'll be having to cast the result to varchar or similar to display it (since you can't have values with different data types in the same output column). With that in mind, do you really need different data types, or would a varchar type work well for all the data you're dealing with?

    If the different types are needed, I'd look at putting all of the different IntVal, FloatVal, DoubleVal, ... columns into one table. Your table definitions could then look something like:

    Variables
          ID          NOT NULL
         ,Name        NOT NULL
         ,DataType    NOT NULL CHECK (DataType IN ('INT','FLOAT','DOUBLE','TEXT'))  
       ,CONSTRAINT PK_Variables PRIMARY KEY (ID)
       ,CONSTRAINT UQ_Variables_1 UNIQUE (Name)
       ,CONSTRAINT UQ_Variables_2 UNIQUE  (ID,DataType)
    
        Values
          IDvals      NOT NULL
         ,ID          NOT NULL
         ,DataType    NOT NULL CHECK (DataType IN ('INT','FLOAT','DOUBLE','TEXT'))
         ,IntVal      NULL
         ,FloatVal    NULL
         ,DoubleVal   NULL
         ,TextVal     NULL
       ,CONSTRAINT PK_Values PRIMARY KEY (IDvals)
       ,CONSTRAINT FK_Values_Variable FOREIGN KEY (ID,DataType) REFERENCES Variables(ID,DataType)
       ,CONSTRAINT CH_Values CHECK ( NOT(DataType <> 'INT'    AND IntVal     IS NOT NULL)  AND
                                     NOT(DataType <> 'FLOAT'  AND FloatVal   IS NOT NULL)  AND
                                     NOT(DataType <> 'DOUBLE' AND DoubleVal  IS NOT NULL)  AND
                                     NOT(DataType <> 'TEXT'   AND TextVal    IS NOT NULL)
                                    )
    
    • The UNIQUE constraint on Variables(ID,DataType) will probably be required (DBMS?) to allow you to make it the subject of a FK;
    • The CHECK constraints ensure that only valid data types are being used and that the only the correct value columns can be populated;
    • Having DataType in Values as well as Variables means that a combination of FK and CHECK can be used to ensure that all values for a given variable have the same data type, rather than having to use triggers or application logic.

    A query against the tables then looks something like:

    SELECT v.name as Variable,
           COALESCE(cast(a.IntVal       as varchar(max)),
                    cast(a.FloatVal     as varchar(max)),
                    cast(a.DoubleVal    as varchar(max)),
                    cast(a.TextVal      as varchar(max)),
                    '') as Value
    FROM 
    Variables V
    JOIN Values a on V.ID = a.ID AND v.DataType = a.DataType
    

    This could also be written (probably more correctly) with a CASE based on Variable.DataType being used to choose the relevant column.

    Having all of the values in one table means less tables/constraints/indexes in the database and means that extending the solution to hold new data types just means adding new columns to the Values table (and modifying the constraints) rather than adding new tables.