Search code examples
sqlaggregationcollaborative-filtering

Collaborative Filtering with SQL


i'm trying to create a recommendation system with sql. Im using the following code:

CREATE TABLE TestTable (
  UserId INT,
  ProductId INT,
  Rating DECIMAL(3, 1),
  PRIMARY KEY (UserId, ProductId)
);


INSERT INTO TestTable (UserId, ProductId, Rating)
VALUES
  (1, 1001, 4.5),
  (1, 1002, 3.8),
  (1, 1003, 4.2),
  (1, 1004, 3.0),
  (2, 1002, 4.0),
  (2, 1003, 3.5),
  (2, 1005, 4.1),
  (3, 1001, 3.2),
  (3, 1003, 4.5),
  (3, 1004, 4.0),
  (4, 1001, 4.1),
  (4, 1002, 3.9),
  (4, 1004, 3.7),
  (4, 1005, 4.2);
  
  SELECT t2.ProductId
FROM TestTable t1 
INNER JOIN TestTable t2 ON t1.UserId = 1 AND t2.UserId IN (2, 3, 4)
     AND t2.ProductId NOT IN (
         SELECT ProductId
         FROM TestTable
         WHERE UserId = 1
     )
     AND t1.ProductId = t2.ProductId
GROUP BY t2.ProductId
HAVING AVG(t2.Rating) >= 3 
     AND COUNT(DISTINCT t2.UserId) >= 3; 

The code returns zero rows. Maybe anybody here has an idea :) Thanks in advance!


Solution

  • Here's the approach I would take:

    • gather similar users, according to items they both liked (matching product ids, similar avg rating, at least three products)
    • select products of those similar users, whose products are not yet liked by the original user
    WITH cte AS (
        SELECT t2.UserId
        FROM       TestTable t1 
        INNER JOIN TestTable t2 
                ON t1.UserId = 1 AND t2.UserId IN (2, 3, 4)
               AND t1.ProductId = t2.ProductId
        GROUP BY t1.UserId, t2.UserId
        HAVING COUNT(t2.UserId) >= 3
           AND ABS(AVG(t2.Rating - t1.Rating))<1
    )
    SELECT t1.ProductId
    FROM TestTable t1
    WHERE EXISTS(SELECT 1 FROM cte t2 WHERE t1.userid = t2.userid)
      AND NOT EXISTS(SELECT 1 FROM TestTable t2 WHERE userid = 1 AND t1.ProductID = t2.ProductId)
    

    Output:

    productid
    1005

    Here's a PostgreSQL demo, although this is likely to be working on the most common DBMSs.