Search code examples
mysqlquery-performance

MySQL joining tables of different structure


I am having trouble either getting any result or a correct result in the following problem - http://www.sqlfiddle.com/#!9/696ed2/4

Overall goal is to list all transactions of users who are linked together as 'Customers'. So if John is looking at his dashboard, he will see which books Alice (his customer) has rented (including book title), and which books were sold (he won't be able to see the title of that book).

When two multiple tables are joined to the parent table, where both depending tables have an 'active' flag set against each row, I can't seem to get only active rows.

# USERS 
CREATE TABLE `users` (
  `id` int(11) NOT NULL,
  `name` varchar(255) DEFAULT NULL,
  `active` boolean DEFAULT NULL
);

INSERT INTO `users` (`id`, `name`, `active`) VALUES
(1, 'John', 1),
(2, 'Alice', 1),
(3, 'Jess', 1),
(4, 'Bob', 1);

# BOOKS 
CREATE TABLE `books` (
  `id` int(11) NOT NULL,
  `name` varchar(255) DEFAULT NULL,
  `active` boolean DEFAULT NULL
);

INSERT INTO `books` (`id`, `name`, `active`) VALUES
(1, 'On the Road', 1),
(2, 'Neuromancer', 0),
(3, 'Modern History', 1),
(4, 'Red Mars', 1);

# TRANSACTIONS
CREATE TABLE `transactions` (
  `id` int(11) NOT NULL,
  `user_1_id` int(11) NOT NULL,
  `user_2_id` int(11) DEFAULT NULL,
  `book_id` int(11) DEFAULT NULL,
  `timestamp` varchar(20) DEFAULT NULL,
  `type` enum('Rent', 'Sold') NOT NULL
);

INSERT INTO `transactions` (`id`, `user_1_id`, `user_2_id`, `book_id`,     `timestamp`, `type`) VALUES
(1, 1, 2, 1, '1465238591', 'Rent'),
(2, 2, 1, 2, '1465238592', 'Rent'),
(3, 2, 3, 3, '1465238593', 'Rent'),
(4, 3, 4, NULL, '1465238594', 'Sold'),
(5, 2, 3, NULL, '1465238595', 'Sold'),
(6, 3, 4, NULL, '1465238596', 'Sold'),
(7, 2, 2, 4, '1465238597', 'Rent'),
(8, 1, 3, 1, '1465238598', 'Rent'),
(9, 2, 4, 2, '1465238598', 'Rent');

# RELATIONSHIPS
CREATE TABLE `relationships` (
  `id` int(11) NOT NULL,
  `user_1_id` int(11) DEFAULT NULL,
  `user_2_id` int(11) NOT NULL,
  `type` enum('Customer', 'Supplier') NOT NULL
);

INSERT INTO `relationships` (`id`, `user_1_id`, `user_2_id`, `type`) VALUES
(1, 1, 2, 'Customer'),
(2, 2, 1, 'Customer'),
(3, 2, 4, 'Customer'),
(3, 1, 4, 'Supplier'),
(3, 3, 1, 'Customer');

Query:

SELECT DISTINCT 
  t.id,
  t.type,
  t.timestamp,
  u1.name as user_1_name,
  u2.name as user_2_name,
  b.name as book_name
  FROM transactions t

  LEFT JOIN relationships r ON (r.user_1_id = 1 AND r.type = 'Customer')
  LEFT JOIN books b ON (b.id = t.book_id AND b.active)
  LEFT JOIN users u1 ON (u1.id = t.user_1_id) # AND u1.active
  LEFT JOIN users u2 ON (u2.id = t.user_2_id) # AND u2.active

  WHERE (r.user_2_id = t.user_1_id OR t.user_2_id = 1 AND t.user_1_id != 1)
    # AND b.active AND u1.active AND u2.active

[Result]:

| id | type |  timestamp | user_1_name | user_2_name |      book_name |
|----|------|------------|-------------|-------------|----------------|
|  3 | Rent | 1465238593 |       Alice |        Jess | Modern History |
|  2 | Rent | 1465238592 |       Alice |        John |         (null) | <<< Should not be here
|  7 | Rent | 1465238597 |       Alice |       Alice |       Red Mars |
|  5 | Sold | 1465238595 |       Alice |        Jess |         (null) | <<< Correct
|  9 | Rent | 1465238598 |       Alice |         Bob |         (null) | <<< Should not be here

In the above result, the problem is that book Neuromancer has flag 'active' set to 0, so should not appear in the result. I have played with placing AND b.active at different places, but the results are always wrong. (See http://www.sqlfiddle.com/#!9/696ed2/5)

Looking at the mess above, I am not even sure my approach is any good, any suggestions are welcome.


Solution

  • As D. Smania mentioned in comments you need to make sure b.active is either 1 or NULL but based on your LEFT JOIN condition b.active will always be 1 so you need to do the join only on id and rely on the WHERE condition for comparison. This should yield the results you asked for:

    SELECT DISTINCT 
        t.id,
        t.type,
        t.timestamp,
        u1.name AS user_1_name,
        u2.name AS user_2_name,
        b.name AS book_name
    FROM transactions t
        LEFT JOIN relationships r ON (r.user_1_id = 1 AND r.type = 'Customer')
        LEFT JOIN books b ON (b.id = t.book_id)
        LEFT JOIN users u1 ON (u1.id = t.user_1_id)
        LEFT JOIN users u2 ON (u2.id = t.user_2_id)
    WHERE (r.user_2_id = t.user_1_id OR t.user_2_id = 1 AND t.user_1_id != 1)
        AND (b.active OR b.active IS NULL)
        AND u1.active AND u2.active
    

    SQL Fiddle

    One note - in your first WHERE condition it's not clear to me whether you mean:

    (r.user_2_id = t.user_1_id OR (t.user_2_id = 1 AND t.user_1_id != 1))
    

    or

    ((r.user_2_id = t.user_1_id OR t.user_2_id = 1) AND t.user_1_id != 1)
    

    It's always best to be explicit with your logic grouping when you have adjacent AND and OR conditions.