Search code examples
sqldata-manipulation

SQL - How to solve this challenging problem?


I have two tables

First table - ticket history:

customer_id ticket_price transportation company_id
1 $342.21 Plane D7573
1 $79.00 Car G2943
1 $91.30 Car M3223
2 $64.00 Car K2329
3 $351.00 Plane H2312
3 $354.27 Plane P3857
4 $80.00 Car N2938
4 $229.67 Plane J2938
5 $77.00 Car L2938

2nd table - companies and corresponding vehicles:

company_id vehicle
D7573 Boeing
G2943 Coach
M3223 Shuttle
K2329 Shuttle
H2312 Airbus
P3857 Boeing
N2938 Minibus
J2938 Airbus
L2938 Minibus
Z3849 Airbus
A3848 Minibus

If a customer took both plane and car, then they are "mixed". Otherwise they are "plane" or "car" customers. How can I get the result below?

# shuttle took Avg ticket price per customer # of customers
mixed ?????????????? ???????????????????????????? ??????????????
plane ?????????????? ???????????????????????????? ??????????????
car ?????????????? ???????????????????????????? ??????????????

Solution

  • Your title is misleading, you need to specify which part you are having problem.

    May not be the best answer. Tested in MYSQL env, sql fiddle

        select transportation,
        sum(no_of_shuttle) as no_of_shuttle_took,
        round(avg(ticket_price), 2) as avg_price_per_customer,
        count(customer_id) as no_of_customer
        from (
              select 
              customer_id, 
              'mixed' as transportation, 
              count(transportation) as no_of_shuttle,
              sum(ticket_price) as ticket_price
              from tickets
              group by customer_id
              having count(distinct transportation) > 1
    
              union all
    
              select 
              customer_id, 
              transportation, 
              count(transportation) as no_of_shuttle,
              sum(ticket_price) as avg_ticket_price
              from tickets
              group by customer_id
              having count(distinct transportation) = 1
             ) t
        group by transportation
    

    I am using subqueries to aggregate

    1. customers with multiple distinct transportation type
    2. customers with single distinct transportation type

    Then I union these two results into one result set to further calculate the number of customers, number of shuttle took and average ticket price per customer. Note that I am rounding the price to 2 decimal places.