Let's say the following tables exist:
a user
table with user_id
an item
table with item_id
and a vote
table with user_id, item_id, value
where value
can be the integers 0 through 3. Users can vote on items.
I am looking for a way to perform a query such that I can find all users that have voted the same as an input user with a set of specified items and corresponding vote values.
For example, say I send a request to the app server with the following information:
user_id: 5
votes: [
{ item_id: 7, vote_value: 0 },
{ item_id: 3, vote_value: 3 },
{ item_id: 5, vote_value: 1 },
{ item_id: 41, vote_value: 3 },
{ item_id: 23, vote_value: 2 }
]
I would like to then return all the users that at least share these results (that is, all other users that voted 0 on item 7, voted 3 on item 3, voted 1 on item 5, voted 3 on item 41, and voted 2 on item 23). They can have more votes and other votes on other items, but at least those must match. Also, five voted items passed in is just an example. There are an arbitrary amount.
The only solution I have come up with so far involves a quad join and having a column for all 0 votes, all 1 votes, all 2 votes, and all 3 votes for each user, and then returning that entire dataset to the app-server to perform calculations there to find matches. But I was hoping there is a better, more performant way.
One way to do it
SELECT user_id
FROM votes
GROUP BY user_id
HAVING MAX(CASE WHEN (item_id, value) = (7, 0) THEN 1 ELSE 0 END) = 1
AND MAX(CASE WHEN (item_id, value) = (3, 3) THEN 1 ELSE 0 END) = 1
AND MAX(CASE WHEN (item_id, value) = (5, 1) THEN 1 ELSE 0 END) = 1
AND MAX(CASE WHEN (item_id, value) = (41, 3) THEN 1 ELSE 0 END) = 1
AND MAX(CASE WHEN (item_id, value) = (23, 2) THEN 1 ELSE 0 END) = 1
The HAVING
clause can also be expressed in the following manner
...
HAVING MAX(CASE WHEN (item_id, value) = (7, 0) THEN 1 ELSE 0 END)
+ MAX(CASE WHEN (item_id, value) = (3, 3) THEN 1 ELSE 0 END)
+ MAX(CASE WHEN (item_id, value) = (5, 1) THEN 1 ELSE 0 END)
+ MAX(CASE WHEN (item_id, value) = (41, 3) THEN 1 ELSE 0 END)
+ MAX(CASE WHEN (item_id, value) = (23, 2) THEN 1 ELSE 0 END) = 5