I am currently working on a project while trying to learn MySQL and I would like to join three tables and get the latest status for each related shipment. Here are the tables I'm working with (with example data):
shipments
id | consignee | tracking_number | shipper | weight | import_no |
---|---|---|---|---|---|
1 | JOHN BROWN | TBA99900000121 | AMAZON | 1 | 101 |
2 | HELEN SMITH | TBA99900000190 | AMAZON | 1 | 102 |
3 | JACK BLACK | TBA99900000123 | AMAZON | 1 | 103 |
4 | JOE BROWM | TBA99900000812 | AMAZON | 1 | 104 |
5 | JULIA KERR | TBA99900000904 | AMAZON | 1 | 105 |
statuses
id | name | slug |
---|---|---|
1 | At Warehouse | at_warehouse |
2 | Ready For Pickup | ready_for_pickup |
3 | Delivered | delivered |
shipment_status (pivot table)
id | shipment_id | status_id |
---|---|---|
1 | 1 | 1 |
2 | 2 | 1 |
3 | 3 | 1 |
4 | 4 | 1 |
5 | 5 | 1 |
6 | 1 | 2 |
7 | 2 | 2 |
8 | 3 | 2 |
9 | 4 | 2 |
10 | 5 | 2 |
all tables do have created_at and updated_at timestamp columns
Example of the results I'm trying to achieve
slug | shipment_id | status_id |
---|---|---|
ready_for_pickup | 1 | 2 |
ready_for_pickup | 2 | 2 |
ready_for_pickup | 3 | 2 |
ready_for_pickup | 4 | 2 |
ready_for_pickup | 5 | 2 |
Here's the query I wrote to try to achieve what I'm looking for based on examples and research I did during the past couple of days. I find that sometimes there is sometimes a mismatch with the latest status that relates to the shipment
SELECT
statuses.slug AS slug,
MAX(shipments.id) AS shipment_id,
statuses.id AS status_id,
FROM
`shipments`
INNER JOIN `shipment_status` ON `shipment_status`.`shipment_id` = `shipments`.`id`
INNER JOIN `statuses` ON `shipment_status`.`status_id` = `statuses`.`id`
GROUP BY
`shipment_id`
Because we need to reference other fields from the same record that evaluates from the MAX aggregation, you need to do it in two steps, there are other ways, but I find this syntax simpler:
SELECT
shipments.id AS id,
statuses.slug AS slug,
statuses.id AS status_id,
shipment_status.shipment_id as shipment_id
FROM
`shipments`
INNER JOIN `shipment_status` ON `shipment_status`.`shipment_id` = `shipments`.`id`
INNER JOIN `statuses` ON `shipment_status`.`status_id` = `statuses`.`id`
WHERE
shipment_status.id = (
SELECT MAX(shipment_status.id)
FROM `shipment_status`
WHERE shipment_status.shipment_id = shipments.id
)
This query makes the assumption that the id
field is an identity column, so the MAX(shipment_status.id)
represents only the most recent status
for the given shipment_id