Search code examples
mysqlsqldatabase-designnormalizationdatabase-normalization

What SQL query to access data from a table twice-removed, fully normalized


Let's say there are three tables, users, orders, transactions, which look like this:

Users: 
id | name
---------
1  | Mike
2  | Sara

Orders: 
id | user_id | total | transaction_id
-------------------------------------
1  |       1 |   100 |              1
2  |       2 |    10 |              2
3  |       1 |     5 |              1
4  |       1 |     4 |              3
5  |       1 |    40 |              3
6  |       2 |    80 |              2

Transactions: 
id | total |  created_at 
------------------------
1  |   105 | [timestamp]
2  |    90 | [timestamp]
3  |    44 | [timestamp]

And let's say I want to select users.name but sort the names representing the most recent activity (i.e. by most recent transaction).

As you can see the data I want is twice-removed (maybe misusing that word) in the sense that I would need to go through the orders table to figure out the most recent transaction.

I figured I would design it to be as normalized as possible but then I realized I had no idea how to structure a proper query for this and other similar scenarios. I can obviously pull all the data and sort it out in the application code. I could also of course add a "last_transaction_at" column on users that I keep in sync.

So I guess there's two pieces to the question. First is how to do this with a SQL query, and second is whether this is more or less efficient than de-normalizing.


Solution

  • Select u.Name, MAX(created_at) TransactionTime FROM #Users u
    INNER JOIN  #ORDERS o
    ON u.Id = o.user_id
    INNER JOIN #Transactions t
    ON o.transaction_id = t.id
    GROUP BY u.Name
    ORDER BY MAX(created_at) desc
    

    I feel the schema which you created in fine