Search code examples
databasedatabase-normalization

First normal form and temporal data


The first normal form says that row ordering should not matter. Does this mean that a table with date as part of the key is not 1NF? e.g. Consider a table of ticker prices where date/time is part of the PK. In this case, you get the last price by ordering the data by date and selecting the top 1 row. Does this mean that for to fulfill 1NF you need to split the table into: 1) TickerCurrentPrice (1 row per ticker) 2) TickerHistoricalPrice Thanks


Solution

  • 1NF is aspect of a table representing a relation, not of a table as such.

    If your relation says ticket HAS price, that it's a 1NF violation, as you cannot determine if a ticket HAS or HAS NOT the price by looking on a single record. You'll need to fetch all prices on this ticket and select last of them, which violates the non-ordering rule of the 1NF.

    If your relation says ticket HAD BEGUN TO COST price ON date, then it's in the 1NF all right, because each record says what it says: this ticket costs this price from this date.

    Thus, we say that this table does not comply with the 1NF when representing the first relation, but does comply when representing the second one.

    The table itself remains the same, of course.

    It doesn't necessary mean though that you need to split your tables.

    The whole point of relational databases is that you can use relational operators to convert one relation to another.

    What is a relation in terms of the RDBMS? It's a table showing all combinations of all possible values that are in this relation between theirselves.

    For instance, if we need to construct equality relation on natural numbers from 1 to 5, we have this table:

    1 1
    2 2
    3 3
    4 4
    5 5
    

    All pairs that appear in this table are in equality relation; all pairs that don't appear, are not. We don't see (2, 3) here, or (4, 5), as they are not equal.

    But you don't need to keep the whole pair in the database. You keep single values instead and write a query:

    SELECT n1.number, n2.number
    FROM number n1, number n2
    WHERE n1.number = n2.number
    

    , which gives you the same result.

    Actually, normal forms let you keep simplest possible relation tables in the database and construct more complex relations from them using SQL queries.

    In your case, if you write a query (or define a view) in the following way:

    SELECT ticket, price
    FROM mytable
    WHERE (ticket, date) IN (
      SELECT ticket, MAX(date)
      FROM mytable
      GROUP BY
        ticket
      )
    

    , you get relation (ticket HAS price) from (ticket HAD BEGUN TO COST price ON date) exactly as if you were keeping the whole table in the database.