Search code examples
mysqljoindata-warehouse

Basic Data Warehouse - How to join data 2 tables away


I have a basic data warehouse setup that represents the police force: Data warehouse for the police department

Every FACT_CRIME has an assigned DIM_OFFICER; every DIM_OFFICER is assigned to a DIM_STATION.

What SQL logic can I use to display all FACT_CRIMEs along with the associated DIM_STATION? My issue is that to obtain the DIM_STATION.station_code, I know it must be done via the DIM_OFFICER, since that is the relational link, but I'm not savvy enough with SQL to know how to get data from 2 tables away.

This is a query I have so far, it produces the format I'm after, but not the correct results:

select
    fact_crime.*, dim_station.station_code
from fact_crime
join dim_station
    on fact_crime.assigned_officer_id
        in (select dim_officer.officer_id from dim_officer where dim_officer.station_id = dim_station.station_id)
group by dim_station.station_code

Sample of produced results: (correct format, but incorrect data) Sample of produced results

I'm expecting every instance of FACT_CRIME to appear, with the assigned officer's station code additionally appearing in the table.

Thanks in advance to anyone who can point me in the right direction. I'm still trying to get to grips with the basics of SQL, so apologies if I require additional clarification on any responses.


Solution

  • You need two joins, someting like this (I'm guessing because you didn't show many details).

    ...
    from fact_crime c
    join dim_officer o ON c.assigned_officer_id = o.officer_id
    join dim_station s ON o.station_id = s.station_id
    

    This is a very common SQL pattern, so you'll find the MySQL software does a good job making it efficient.

    Pro tip: Avoid the * in SELECT *, especially from result sets that are results of lots of JOIN operations. Instead, use SELECT to give the list of columns you need.