WHAT MY TABLES LOOK LIKE:
mysql> select * from customer limit 3;
+-------------+---------------+-----------------------+------+--------+-------------+-------+
| customer_id | customer_name | profession | age | salary | town | state |
+-------------+---------------+-----------------------+------+--------+-------------+-------+
| 1 | Julio Sperski | Architect | 70 | 52016 | Conroe | TX |
| 2 | Micah Inchley | Biological scientist | 86 | 45355 | Omaha | NE |
| 3 | Brigg Denny | Chemist | 80 | 21754 | Bakersfield | CA |
+-------------+---------------+-----------------------+------+--------+-------------+-------+
3 rows in set (0.00 sec)
mysql> select * from vehicle limit 3;
+------------+---------------+--------------------+--------+--------+---------+----------------+--------------------+-----------------------+
| vehicle_id | vehicle_plate | registration_state | color | make | model | vehicle_type | per_day_rental_fee | per_day_insurance_fee |
+------------+---------------+--------------------+--------+--------+---------+----------------+--------------------+-----------------------+
| 1 | W9FLYC7 | TX | black | toyota | cruiser | mid size sedan | 44 | 27 |
| 2 | CA1CJIZ | NE | silver | ford | se | suv | 96 | 71 |
| 3 | HB5YI9A | CA | silver | dodge | mpv | truck | 26 | 28 |
+------------+---------------+--------------------+--------+--------+---------+----------------+--------------------+-----------------------+
3 rows in set (0.00 sec)
mysql> select * from rental limit 3;
+-----------+-------------+------------+-------------------+--------------------+-------------------------+----------------------------+
| rental_id | customer_id | vehicle_id | start_rental_date | return_rental_date | per_day_rental_fee_paid | per_day_insurance_fee_paid |
+-----------+-------------+------------+-------------------+--------------------+-------------------------+----------------------------+
| 1 | 32 | 4 | 3/4/2019 | 3/6/2019 | no | no |
| 2 | 42 | 39 | 3/23/2019 | 3/24/2019 | yes | yes |
| 3 | 33 | 14 | 10/18/2020 | 10/24/2020 | no | no |
+-----------+-------------+------------+-------------------+--------------------+-------------------------+----------------------------+
3 rows in set (0.00 sec)
customer_id, vehicle_id, and rental_id are primary keys.
customer_id and vehicle_id are foreign keys pointing to customer and vehicle tables respectively.
I'm trying to display a third column alongside the two I already have. The query I need is along these lines:
For each registration state that the rental car company operates in, return the customer state with the most rentals there, together with its count. For example, if 5 customers from TX are renting cars registered in CA and only 2 customers from NY are renting cars registered in CA, I would return the 5 from TX for CA, and so on for every registration state.
FIRST QUERY: the closest I have gotten is this query:
SELECT registration_state,
Max(count) AS count
FROM (SELECT registration_state,
Count(a.state) AS count,
a.state
FROM rental c
LEFT JOIN customer a
ON a.customer_id = c.customer_id
LEFT JOIN vehicle b
ON c.vehicle_id = b.vehicle_id
GROUP BY registration_state,
a.state)Z
GROUP BY registration_state
ORDER BY registration_state,
count DESC;
+--------------------+-------+
| registration_state | count |
+--------------------+-------+
| AL | 1 |
| CA | 5 |
| DC | 1 |
| DE | 2 |
| FL | 3 |
| IL | 2 |
| IN | 1 |
| MD | 2 |
| MI | 1 |
| MN | 1 |
| MO | 1 |
| NE | 1 |
| NV | 2 |
| NY | 3 |
| OH | 2 |
| OR | 1 |
| PA | 1 |
| SC | 1 |
| TN | 3 |
| TX | 7 |
| WA | 1 |
+--------------------+-------+
21 rows in set (0.01 sec)
However, it displays only the registration_state and the count for whichever customer state rents the most cars registered there. The customer state itself is not displayed, and I want it to be.
SECOND QUERY: the following query returns the number of rentals from each customer state for every registration state:
SELECT registration_state,
Count(d.state) AS count,
d.state
FROM rental f
LEFT JOIN customer d
ON d.customer_id = f.customer_id
LEFT JOIN vehicle e
ON f.vehicle_id = e.vehicle_id
GROUP BY registration_state,
d.state
ORDER BY registration_state,
count DESC;
+--------------------+-------+-------+
| registration_state | count | state |
+--------------------+-------+-------+
| AL | 1 | NY |
| CA | 5 | CA |
| CA | 4 | MO |
| CA | 2 | IN |
| CA | 2 | TN |
| CA | 2 | TX |
| CA | 2 | OH |
| CA | 1 | FL |
| CA | 1 | AL |
| CA | 1 | MI |
| CA | 1 | NE |
| CA | 1 | WA |
| DC | 1 | CA |
| DC | 1 | IL |
| DC | 1 | TX |
| DE | 2 | NY |
| DE | 1 | FL |
| FL | 3 | NY |
| FL | 1 | AL |
| FL | 1 | OH |
| FL | 1 | CA |
| FL | 1 | FL |
| FL | 1 | MI |
| FL | 1 | TX |
| IL | 2 | OR |
| IL | 1 | TX |
| IL | 1 | CA |
| IN | 1 | NV |
| IN | 1 | CA |
| MD | 2 | WA |
| MD | 1 | OH |
| MD | 1 | MD |
| MI | 1 | PA |
| MN | 1 | PA |
| MN | 1 | TN |
| MN | 1 | FL |
| MO | 1 | NY |
| MO | 1 | SC |
| MO | 1 | OH |
| MO | 1 | OR |
| MO | 1 | CA |
| MO | 1 | FL |
| MO | 1 | TX |
| NE | 1 | CA |
| NV | 2 | FL |
| NY | 3 | TX |
| NY | 1 | CA |
| NY | 1 | FL |
| NY | 1 | MO |
| NY | 1 | NY |
| OH | 2 | OH |
| OH | 1 | TN |
| OH | 1 | NE |
| OH | 1 | PA |
| OH | 1 | DC |
| OH | 1 | NY |
| OR | 1 | IN |
| OR | 1 | CA |
| PA | 1 | MO |
| PA | 1 | DC |
| SC | 1 | FL |
| SC | 1 | NY |
| TN | 3 | TX |
| TN | 2 | CA |
| TN | 1 | FL |
| TN | 1 | PA |
| TN | 1 | MI |
| TN | 1 | OH |
| TN | 1 | OR |
| TN | 1 | MO |
| TX | 7 | NY |
| TX | 4 | TX |
| TX | 3 | FL |
| TX | 2 | MO |
| TX | 2 | MD |
| TX | 2 | DC |
| TX | 1 | IN |
| TX | 1 | OH |
| TX | 1 | CA |
| TX | 1 | NV |
| TX | 1 | OR |
| TX | 1 | IL |
| WA | 1 | TN |
| WA | 1 | MI |
+--------------------+-------+-------+
84 rows in set (0.00 sec)
As you can see by comparing the table above with the first one, the first query returns only the highest count for each registration state, that is, the count belonging to the customer state with the most rentals there.
I am trying to get it to look like this:
+--------------------+-------+-------+
| registration_state | count | state |
+--------------------+-------+-------+
| AL | 1 | NY |
| CA | 5 | CA |
| DC | 1 | CA |
| DE | 2 | NY |
| FL | 3 | NY |
| IL | 2 | OR |
| IN | 1 | NV |
| MD | 2 | WA |
| MI | 1 | PA |
| MN | 1 | PA |
| MO | 1 | NY |
| NE | 1 | CA |
| NV | 2 | FL |
| NY | 3 | TX |
| OH | 2 | OH |
| OR | 1 | IN |
| PA | 1 | MO |
| SC | 1 | FL |
| TN | 3 | TX |
| TX | 7 | NY |
| WA | 1 | TN |
+--------------------+-------+-------+
I have tried joining like so:
SELECT registration_state,
MAX(count)
FROM (
SELECT registration_state,
Count(d.state) AS count,
d.state
FROM rental f
LEFT JOIN customer d
ON d.customer_id = f.customer_id
LEFT JOIN vehicle e
ON f.vehicle_id = e.vehicle_id
GROUP BY registration_state,
d.state)W
RIGHT JOIN
(
SELECT registration_state AS registration_state2,
count(a.state) AS count2,
a.state
FROM rental c
LEFT JOIN customer a
ON a.customer_id = c.customer_id
LEFT JOIN vehicle b
ON c.vehicle_id = b.vehicle_id
GROUP BY registration_state,
a.state )z
ON z.count2=W.count
AND z.registration_state2=W.registration_state
GROUP BY registration_state
ORDER BY registration_state;
But it just returns the same result as the first query. If I add state from either subquery to the select list in the first line and also group by that state at the end, I end up with the second query.
Are there any suggestions on how to get it to return the way I want?
If you are running MySQL 8.0, you can use window functions to return the customer state with the most rentals for each registration state:
select registration_state, state, cnt
from (
select v.registration_state, c.state, count(*) as cnt,
rank() over(partition by v.registration_state order by count(*) desc) rn
from rental r
inner join customer c on c.customer_id = r.customer_id
inner join vehicle v on r.vehicle_id = v.vehicle_id
group by v.registration_state, c.state
) t
where rn = 1
order by registration_state;
In earlier versions, it is a bit more complicated. One option uses a row-limiting correlated subquery in the having clause:
select v.registration_state, c.state, count(*) as cnt
from rental r
inner join customer c on c.customer_id = r.customer_id
inner join vehicle v on r.vehicle_id = v.vehicle_id
group by v.registration_state, c.state
having count(*) = (
select count(*)
from rental r1
inner join customer c1 on c1.customer_id = r1.customer_id
inner join vehicle v1 on r1.vehicle_id = v1.vehicle_id
-- correlate on the registration state only, so the subquery ranks every customer state within it
where v1.registration_state = v.registration_state
group by v1.registration_state, c1.state
order by count(*) desc limit 1
)
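Another way to express the same idea without window functions, closer to the join attempted in the question, is to join the per-pair counts to the highest count per registration state. This is only a sketch under the table and column names shown in the question; the aliases `pc`, `m`, `x`, and `max_cnt` are made up for illustration, and it has not been run against the asker's data. Like the query above, it keeps ties.

```sql
-- Sketch for MySQL versions without window functions; keeps ties.
select pc.registration_state, pc.state, pc.cnt
from (
    -- rentals per (registration state, customer state) pair
    select v.registration_state, c.state, count(*) as cnt
    from rental r
    inner join customer c on c.customer_id = r.customer_id
    inner join vehicle v on v.vehicle_id = r.vehicle_id
    group by v.registration_state, c.state
) pc
inner join (
    -- highest pair count within each registration state
    select x.registration_state, max(x.cnt) as max_cnt
    from (
        select v.registration_state, c.state, count(*) as cnt
        from rental r
        inner join customer c on c.customer_id = r.customer_id
        inner join vehicle v on v.vehicle_id = r.vehicle_id
        group by v.registration_state, c.state
    ) x
    group by x.registration_state
) m on m.registration_state = pc.registration_state
   and m.max_cnt = pc.cnt
order by pc.registration_state;
```

The join attempted in the question matched the two derived tables on the count itself, so every pair matched its own row. Here the second derived table is reduced to one maximum per registration state before the join, which is what filters out the non-top pairs.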
Note that the queries above keep ties: if several customer states share the top count for a registration state, all of them are returned.
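The desired output in the question shows exactly one row per registration state, even where several customer states tie (DC, for instance). If that is what you want, one possible variation on MySQL 8.0 is to swap rank() for row_number() and add a tie-breaker. The alphabetical tie-break on customer state below is an assumption, not something the question specifies, so pick whatever rule suits your data.

```sql
-- Sketch for MySQL 8.0+: exactly one row per registration state.
-- Assumption: ties are broken alphabetically by customer state.
select registration_state, cnt, state
from (
    select v.registration_state, c.state, count(*) as cnt,
           row_number() over (
               partition by v.registration_state
               order by count(*) desc, c.state
           ) as rn
    from rental r
    inner join customer c on c.customer_id = r.customer_id
    inner join vehicle v on v.vehicle_id = r.vehicle_id
    group by v.registration_state, c.state
) t
where rn = 1
order by registration_state;
```

Because row_number() never assigns the same number twice within a partition, `rn = 1` matches a single row per registration state. Without the second sort key the chosen row among tied states would be arbitrary.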