Search code examples
mysqlsqldatabaserdbms

Trying to get latest status for related shipment but the results I receive are incorrect


I am currently working on a project while trying to learn MySQL and I would like to join three tables and get the latest status for each related shipment. Here are the tables I'm working with (with example data):

shipments

id consignee tracking_number shipper weight import_no
1 JOHN BROWN TBA99900000121 AMAZON 1 101
2 HELEN SMITH TBA99900000190 AMAZON 1 102
3 JACK BLACK TBA99900000123 AMAZON 1 103
4 JOE BROWM TBA99900000812 AMAZON 1 104
5 JULIA KERR TBA99900000904 AMAZON 1 105

statuses

id name slug
1 At Warehouse at_warehouse
2 Ready For Pickup ready_for_pickup
3 Delivered delivered

shipment_status (pivot table)

id shipment_id status_id
1 1 1
2 2 1
3 3 1
4 4 1
5 5 1
6 1 2
7 2 2
8 3 2
9 4 2
10 5 2

all tables do have created_at and updated_at timestamp columns

Example of the results I'm trying to achieve

slug shipment_id status_id
ready_for_pickup 1 2
ready_for_pickup 2 2
ready_for_pickup 3 2
ready_for_pickup 4 2
ready_for_pickup 5 2

Here's the query I wrote to try to achieve what I'm looking for based on examples and research I did during the past couple of days. I find that sometimes there is sometimes a mismatch with the latest status that relates to the shipment

SELECT
  statuses.slug AS slug,
  MAX(shipments.id) AS shipment_id,
  statuses.id AS status_id,
FROM
  `shipments`
     INNER JOIN `shipment_status` ON `shipment_status`.`shipment_id` = `shipments`.`id`
     INNER JOIN `statuses` ON `shipment_status`.`status_id` = `statuses`.`id`
     GROUP BY
       `shipment_id`


Solution

  • Because we need to reference other fields from the same record that evaluates from the MAX aggregation, you need to do it in two steps, there are other ways, but I find this syntax simpler:

    SELECT
      shipments.id AS id,
      statuses.slug AS slug,
      statuses.id AS status_id,
      shipment_status.shipment_id as shipment_id
    FROM
      `shipments`
         INNER JOIN `shipment_status` ON `shipment_status`.`shipment_id` = `shipments`.`id`
         INNER JOIN `statuses` ON `shipment_status`.`status_id` = `statuses`.`id`
    WHERE 
       shipment_status.id = (
           SELECT MAX(shipment_status.id)
           FROM `shipment_status`
           WHERE shipment_status.shipment_id = shipments.id
       )
    

    try it out!

    This query makes the assumption that the id field is an identity column, so the MAX(shipment_status.id) represents only the most recent status for the given shipment_id