Search code examples
sqldatabase-administration

How to generate unique id based on start date and end date for the next 40 years in SQL table


I would like to generate unique id for the next 40 years for the following list of names. Pay attention to the start and end date. not every name has the same start and end date.

ID  Name    year    Start Date  End date 
1   Dance   2050    1/1/2050    12/31/2050
2   Dream   2050    7/1/2049    6/30/2050
3   Mocca   2050    7/1/2049    6/30/2050
4   Noway   2050    7/1/2049    6/30/2050
5   Papa    2050    7/1/2049    6/30/2050
6   Vintage 2050    1/1/2050    12/31/2050
7   Wave    2050    7/1/2049    6/30/2050

Solution

  • The simplest way to do this is to have a lookup table, where each row contains one of the items you wish to track. Add a column that generates a default unique surrogate key (an “identity” column, in SQL Server, and virtually all RDBMSes have something similar) whenever a row is added.

    That works if the value is just a date range, such as Jan 1, 2017 through Dec 31, 2017. If your unique value is based on the date range and the user name (example: MJ8 + Jan 1, 2017 + Dec 31, 2017), well, that’s probably a central table in your database schema, and is even more deserving of a surrogate key.

    Surrogate keys also help manage situations where you have duplicates (say, two entries for Smith + Jan 1, 2017 + Dec 31, 2017) -- one could be ID = 3, the other ID = 8710.

    A good argument for not basing a unique value on your “source data” is that it is smart data. Your unique id might end up being a string (or a very big number) like 2017010120171231. Awkward, but it works. The danger is, users (or developers) might see this and decide that, rather than “track it back to the source” (by perhaps joining on this key to the table containing the source data), they’ll just parse the string and convert each half back into the original date. This is generally not a good idea, as it slows down processing, produces hard to read (and debug and maintain) code, and can make things very difficult if the data being referenced changes.