Having trouble with what I thought would be pretty simple. Trying to get a count of items that don't have a title, our system tracks records by 'last_updated_date'.
id work_item title last_updated_date
1 task1 ProjectA 2020-03-25 20:20:01.111
2 task2 ProjectA 2020-03-25 20:20:01.111
3 task3 2020-03-25 20:20:01.111
4 task4 ProjectB 2020-03-25 20:20:01.111
5 task5 ProjectC 2020-03-25 20:20:01.111
As you can see, I need to see what work items don't have a project. If I were to look at the history for 'task1' I would see something like following:
select work_item, last_updated_date, project
from table
where work_item = 'task1'
This would output:
work_item last_updated_date title
task1 2020-03-25 20:20:01.111 ProjectA
task1 2020-03-17 20:20:01.111 NULL
task1 2020-03-12 20:20:01.111 NULL
Based on this, I can see that task1 got a title assigned on 2020-03-25. What I need to know, is how many work items in the entire data set do not have a item assigned. So I want to check all records using the last updated date and check the latest last_updated_date to see if title is null. I tried the following, but I think I'm using MAX and/or group by wrong? I'm getting records back that have titles assigned, which makes me think it's checking the MAX value of the entire last_updated_date column, rather than each record within.
select id, title, MAX(last_updated_date) as "latest_timestamp"
FROM table
WHERE title is null
group by id, title
What I would like to see is only task3 show up as needing a title assigned.
One option uses a subquery to filter on the latest record per item. You can then count how many of them have no title:
select count(*)
from mytable t
where
last_updated_date = (
select max(t1.last_updated_date)
from mytable t1
where t1.work_item = t.work_item
)
and title is null
You can also use window functions:
select count(*)
from (
select t.*,
row_number() over(partition by work_item order by last_updated_date desc) rn
from mytable t
) t
where rn = 1 and title is null