I have an Oracle a table that has 3 columns. FILE_NUM, START_DATE, END_DATE. FILE_NUM is a unique identifier and so each row is unique.
FILE_NUM | START_DATE | END_DATE |
---|---|---|
1111 | Apr. 04, 1977 | Jun. 30, 1984 |
1112 | Jan. 03, 1987 | Sep. 05, 2010 |
1113 | May. 14, 1962 | Jan. 30, 1998 |
1114 | Oct. 09, 2001 | Aug. 22, 2025 |
1113 | May. 14, 1962 | Jan. 30, 1998 |
1114 | Nov. 18, 2003 | Aug. 22, 2028 |
etc |
I want to query a count of files that existed per decade. Something like:
Decade | Count |
---|---|
1960-1969 | 10 |
1970-1979 | 25 |
1980-1989 | 42 |
The logic is, if the file existed at any point in any decade, count it in each of those decades. So for file_num 1112, it should get counted in the 1980's, 1990's, 2000's, and 2010's.
I don't even know if this is possible. I've tinkered around in my query a few times, but I'm not sure I know enough functions or technique to get this sort of result.
Thanks for any help.
EDIT:
I should clarify how the counts will be made. I need to approach the date ranges in 3 ways in order to capture which file_nums existed within each decade.
I crafted up the following SQL. I am still trying to figure out if it's doing what I want it to do. If anyone has suggestion
with decades as
(
select
concat(concat(floor(extract(year from t.start_date)/10)*10, '-'), floor(extract(year from t.start_date)/10)*10 + 9) DECADE
from
table t
group by
concat(concat(floor(extract(year from t.start_date)/10)*10, '-'), floor(extract(year from t.start_date)/10)*10 + 9)
)
select
count(a.file_num),
decades.decade
from
table a,
decades
where
extract(year from a.start_date) between extract(year from to_date(SUBSTR(decades.decade,0,4), 'YYYY')) and extract(year from to_date(SUBSTR(decades.decade,6,9), 'YYYY'))
or
extract(year from a.end_date) between extract(year from to_date(SUBSTR(decades.decade,0,4), 'YYYY')) and extract(year from to_date(SUBSTR(decades.decade,6,9), 'YYYY'))
or
(extract(year from a.start_date) < extract(year from to_date(SUBSTR(decades.decade,0,4), 'YYYY'))
and
extract(year from a.end_date) > extract(year from to_date(SUBSTR(decades.decade,6,9), 'YYYY'))
)
group by
decades.decade
order by
decades.decade
You can also take advantage of the connect by clause to do that. I assume you don't have any duplicate rows (these columns FILE_NUM, START_DATE, END_DATE should be unique) in your real data.
SELECT dec_start, dec_end, COUNT(*) nb
FROM (
SELECT t.*, level
, 10 * trunc( extract ( year from (START_DATE) ) / 10 ) + 10 * LEVEL - 10 dec_start
, 10 * trunc( extract ( year from (START_DATE) ) / 10 ) + 10 * LEVEL - 1 dec_end
FROM YourTable T
CONNECT BY
10 * TRUNC( EXTRACT ( YEAR FROM (START_DATE) ) / 10 ) + 10 * LEVEL - 10
< 10 * CEIL( EXTRACT ( YEAR FROM (END_DATE) ) / 10 )
AND PRIOR FILE_NUM = FILE_NUM
AND PRIOR START_DATE = START_DATE
AND PRIOR END_DATE = END_DATE
AND PRIOR SYS_GUID() IS NOT NULL
)
group by dec_start, dec_end
order by dec_start, dec_end
;