Search code examples
sqlingres

Selecting an appropriate period or max/minimum period when outside of the period set


I'm trying to see if it's possible to efficiently select a period a given date belongs to.

Let's say I have a table

id<long>|period_start<date>|period_end<date>|period_number<int>

and lets say I want for every id the period that "2013-11-20" belongs to.

i.e. naively

 select id, period_number 
 from period_table 
 where '2013-11-20' >= period_start and '2013-11-20' < period_end

However, if my date is beyond any period_end or before any period_start, it won't find this id. In those cases I want the minimum (if before the first period_start) or the maximum (if after the last period_end).

Any thoughts if this can be done efficiently? I can obviously do multiple queries (i.e. select into the table as above and then do another query to figure out the min and max periods).

So for example

+--+------------+----------+-------------+
|id|period_start|period_end|period_number|
+--+------------+----------+-------------+
|1 |2011-01-01  |2011-12-31|1            |
|1 |2012-01-01  |2012-12-31|2            |
|1 |2013-01-01  |2013-12-31|3            |
+--+------------+----------+-------------+

If I want what period 2012-05-03 belongs to, my naive sql works and returns period #2 (1|2 as the row, id, period_number). However, if I want what period 2014-01-14 (or 2010-01-14) it can't place it as it's outside the table.

Therefore since "2014-01-14" is > 2013-12-31, I want it to return the row "1|3" if I chose 2010-01-14, I'd want it to return 1|1, as 2010-01-14 < 2011-01-01.

The point of this is that we have a index table that keeps track of different types of periods and what is their relative value (think quarter, half year, years) for many different things and they all don't line up to normal years. Sometimes we want to say we want period X (some integer) relative to date Y. If we can place Y within the table and figure out Y's period_number, we can easily do the math to figure out what to add/subtract to that value. If Y is outside the bounds of the table, we define Y to be the max/min of the table respectively.


Solution

  • Why aren't you creating "boundary periods"? Choose arbitrary beginning_of_time and end_of_time dates e.g. 01/01/0001 and 31/12/9999 and insert a fake period. Your example period_table will become:

    +--+------------+----------+-------------+
    |id|period_start|period_end|period_number|
    +--+------------+----------+-------------+
    |1 |0001-01-01  |2010-12-31|1            |
    |1 |2011-01-01  |2011-12-31|1            |
    |1 |2012-01-01  |2012-12-31|2            |
    |1 |2013-01-01  |2013-12-31|3            |
    |1 |2014-01-01  |9999-12-31|3            |
    +--+------------+----------+-------------+
    

    In this case, any query will retrieve one and only one row, e.g:

    select id, period_number from period_table 
    where '2013-11-20' between period_start and period_end
    
    +--+-------------+
    |id|period_number|
    +--+-------------+
    |1 |2            |
    +--+-------------+
    
    select id, period_number from period_table 
    where '2010-11-20' between period_start and period_end
    
    +--+-------------+
    |id|period_number|
    +--+-------------+
    |1 |1            |
    +--+-------------+
    
    select id, period_number from period_table 
    where '2014-11-20' between period_start and period_end
    
    +--+-------------+
    |id|period_number|
    +--+-------------+
    |1 |3            |
    +--+-------------+