I am trying to calculate the amount of orders we have received between two times, and how many of those have been packed by the warehouse staff.
We have an 'orders' table with the orders and when an order is finished and unfortunately when it's been dealt with in a manner such as a cancelled order that shouldn't be added to the calculation it will recieve a "Finished" status which is too much of a legacy code that I can't change.
In order to track if the order has been packed I instead want to join another table called 'orders_sent', but unfortunately if an order has more than one package, there will be more than one row in that table for that orders_id.
So what I want to do is check if an order has been placed, and if it has "finished" and also exists in the sent table I will count it as sent, if it's "finished" but not in the sent table, it should be ignored completely, and finally I want to count all rows total, ignoring any duplicates because of more than one entry in the sent table.
Here is what I come up with so far.
"SELECT " &_
" COUNT(CASE orders_status WHEN 'Finished' THEN 1 ELSE NULL END) AS Sent, " &_
" COUNT(*) AS Total " &_
"FROM " &_
" orders " &_
" LEFT JOIN orders_sent ON orders_id = orders_sent_orderid " &_
"WHERE " &_
" orders_date > '" & datetime & "' " &_
" AND orders_date < '" & dateAdd("d",1,datetime) & "' " &_
" AND NOT (" &_
" orders_status = 'FINISHED' " &_
" AND orders_sent_id IS NULL) "
I tried grouping on orders_sent_id, but that didn't work because all orders that doesn't exists in the sent table gets grouped together. Trying to group by orders_id gave me wonky results which I can't quite explain (6 sent and 6 total). Not grouping gives me 33 sent and 54 total.
Running the SQL, skipping the count and just showing the orders_ID's I get 47 orders total in that time period using the SQL below. Not grouping show a few duplicated IDs because of orders_sent containing multiple rows for a few orders.
"SELECT " &_
" orders_id AS oid " &_
"FROM " &_
" orders " &_
" LEFT JOIN orders_sent ON orders_id = orders_sent_orderid " &_
"WHERE " &_
" orders_date > '" & datetime & "' " &_
" AND orders_date < '" & dateAdd("d",1,datetime) & "' " &_
" AND NOT (" &_
" orders_status = 'Skickad och avslutad' " &_
" AND orders_sent_id IS NULL) " &_
"GROUP BY " &_
" orders_id"
So how can I get rid of the duplicates from the JOIN so they don't mess up the total?
Based on @P.Salmon's comment, this is the corrected SQL which seem to have fixed my problem:
"SELECT " &_
" COUNT(CASE WHEN orders_status = 'Finished' AND os.orders_sent_ts < '" & dateAdd("h",27,datetime) & "' THEN 1 ELSE NULL END) AS Sent, " &_
" COUNT(*) AS Total " &_
"FROM " &_
" orders " &_
" LEFT JOIN (" &_
" SELECT DISTINCT " &_
" orders_sent_orderid, " &_
" orders_sent_ts " &_
" FROM " &_
" orders_sent "&_
" WHERE " &_
" 1) AS os ON orders_id = os.orders_sent_orderid " &_
"WHERE " &_
" orders_date > '" & datetime & "' " &_
" AND orders_date < '" & dateAdd("d",1,datetime) & "' " &_
" AND NOT (" &_
" orders_status = 'Finished' " &_
" AND os.orders_sent_orderid IS NULL) "
I added in a timestamp from the orders_sent table in my CASE for sent to make sure I can go back to any date and see how it worked that day. The +27hours is because the trucks leave at 15:00, but the cutoff time for orders that are supposed to be sent the same day is at 12:00.