I have a table which represents friendships between two users. Each entry is unidirectional; a friendship requires two entries to be represented. I would like to keep it this way.
user1_id | user2_id
43 44
44 43
I'm curious what the best way to query against this kind of setup is. For example, how would query for a list of all friends of a particular user?
The solution I've come up with is to invert and inner join the friendships table onto itself, to get a list of all complete pairs first, then use a normal WHERE clause:
SELECT f.user2_id
FROM friendships f
INNER JOIN friendships f2
ON f.user1_id = f2.user2_id
&& f.user2_id = f2.user1_id
WHERE f.user1_id = 43;
To get into a juicier situation, I would also need to be able to query to get a list of threads created by friends of a user. I'm still adjusting to the abstract thinking required to work with joins, but the solution I've come up with is this:
SELECT thread_id,owner_id,message,time
FROM threads th
INNER JOIN friendships f
ON th.owner_id = f.user2_id
&& f.user1_id = 43
INNER JOIN friendships f2
ON f.user2_id = f2.user1_id
&& f.user1_id = f2.user2_id
This seems to work, but I'm testing it on a very empty database. Since my grasp on joins is still a little weak, I'm afraid that there are situations in which this will come back with bad results. So my questions are:
Let's start to answer this question by creating a query which gives a list of all (user, friend) pairs. You've required that a friend pair be a reciprocal relationship, so we need to do that. Otherwise the friendships
table would offer those pairs.
Your query is pretty close. What you need for this query is the inner join.
SELECT f1.user1_id AS user,
f1.user2_id AS friend
FROM friendships AS f1
JOIN friendships f2
ON (f1.user1_id = f2.user2_id AND f1.user2_id = f2.user1_id)
Then, you can employ this query as a virtual table. For example, you can do this to get a list of the friends of user 43.
SELECT friend
FROM (
SELECT f1.user1_id AS user,
f1.user2_id AS friend
FROM friendships AS f1
JOIN friendships f2
ON (f1.user1_id = f2.user2_id AND f1.user2_id = f2.user1_id)
) AS friends
WHERE user = 43
You might want to set your friends query up as a view, like so.
CREATE VIEW friends AS (
SELECT f1.user1_id AS user,
f1.user2_id AS friend
FROM friendships AS f1
JOIN friendships f2
ON (f1.user1_id = f2.user2_id AND f1.user2_id = f2.user1_id)
)
That way you can abbreviate your complex queries like this.
SELECT friend FROM friends WHERE user = 43;
Your juicier queries get easy too:
SELECT thread_id,owner_id,message,time
FROM threads AS th
JOIN (
SELECT f1.user1_id AS user,
f1.user2_id AS friend
FROM friendships AS f1
JOIN friendships f2
ON (f1.user1_id = f2.user2_id AND f1.user2_id = f2.user1_id)
) AS f ON th.owner_id = f.friend
WHERE f.user = 43
If you use the view you can do this. It means the same thing but is easier to read.
SELECT thread_id,owner_id,message,time
FROM threads AS th
JOIN friends AS f ON th.owner_id = f.friend
WHERE f.user = 43
(Note: JOIN
and INNER JOIN
are synonymous.)
See how that goes? You can encapsulate your friends
query (either as a view or just inline as a virtual table) and use its results. The optimizer knows how to do this quickly when you specify such things as WHERE f.user = 43
.
If you have lots of rows in the friendships
table you may find that a pair of compound indexes (user1_id, user2_id)
and (user2_id, user1_id)
help the performance of these queries
(It's not a recursive query, comments to the contrary notwithstanding.)