r recommendation-engine collaborative-filtering

Wrong output in predict function from R package 'recommenderlab'?


I need to build a recommender from a Yelp database. I've filtered the business reviews and the users, and created a realRatingMatrix of user ratings for the respective businesses. The full matrix will be gigantic, so I'm testing with a very small matrix first (the mdat matrix).

library(recommenderlab)

# learning matrix
learningM <- as(mdat[1:8, ], "realRatingMatrix")

# matrix to predict user recommendations
testM <- as(mdat[9:10,],"realRatingMatrix")

#using the learning matrix to create a UBCF recommender
rec <- Recommender(learningM, method = "UBCF")

# should output 2 business recommendations for each user in testM
pre <- predict(rec, testM, n = 2)

Instead I receive an output like this:

> as(pre,"list")
[[1]]
character(0)

[[2]]
character(0)

Why am I getting this output? Is the predict function computing wrong results, or are my business column names in a text format that it can't output correctly?

Edit: the mdat matrix as requested; sorry for not including it in the first place.

> dput(mdat)
structure(c(1, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 4, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, 5, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, 4, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 3, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 1, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, 4, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
3, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 5, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, 5), .Dim = c(10L, 10L), .Dimnames = list(
c("jqhP9mV2rYvmPdKvlOfp0g", "tqkkmL2NB19Sxeg1AaXnSA", "cNMJxBzmXA9N7krLvlrzlA", 
"9v3uIUTitC043Y4Qs54K1g", "nLUwyI34R-cAHLnmEGeLIg", "6SUSTwhfSFva9nbIDmoN7Q", 
"iCppbv3C7XvCyzIZnNQ7fg", "MUo7TLgR7sy1ob0MvxyPHQ", "GMVQyHMHNGplG3aof8jMcA", 
"VNGevHJuTxcou-Nhm8Q5RQ"), c("iZYDZvXoIT648EZOnEP0pQ", "HQJjHA6BRcRD0vR5askdkQ", 
"bul_5Ahk_QYLUAJ4Od27jg", "EOoj2h1Brzk1AhqScvIHDA", "roEQNfyPi3jRv3WFFr-f_g", 
"ffp58kYSK7dJGs5ER-5txw", "pvlM--HZY1a8SqMXiwEz1A", "mta3FuoNzjjGWQr9TCHGhA", 
"QeK3lOP-CTZS72YgeXiiqA", "57VozB9tq5SbNst9nO-jxA")))

Solution

  • As the vignette explains, recommenderlab is trying to solve a sparse matrix regression problem. If the new users in the test set have not rated any of the items that were rated in the training set, no similarity measure between the new users and the old users can be constructed. The algorithm needs to find some old users who rated the same items as a new user before it can recommend further items from those old users' ratings. In mdat every user rated exactly one business and no two users rated the same one, so the two test users have nothing in common with the eight training users.

    A simple popularity recommendation does not require finding any subset of raters who shared ratings with the new users.
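
    The lack of overlap can be checked in base R without recommenderlab. The sketch below is only an illustration: it rebuilds a matrix with the same shape as the posted mdat (ratings on the diagonal, NA elsewhere) instead of using the original object, and the names m, rated, overlap and train_counts are made up for this example.

```r
# Illustrative reconstruction of mdat's shape (not the original object):
# 10 users x 10 businesses, one rating per user, all on the diagonal.
m <- matrix(NA_real_, nrow = 10, ncol = 10)
diag(m) <- c(1, 4, 5, 4, 3, 1, 4, 3, 5, 5)

# 1 where a rating exists, 0 otherwise (keeps the matrix dimensions).
rated <- (!is.na(m)) + 0

train <- rated[1:8, ]   # users used to build the recommender
test  <- rated[9:10, ]  # new users who should receive recommendations

# overlap[i, j] = number of businesses rated by both test user i
# and training user j.
overlap <- test %*% t(train)

# Number of training ratings per business.
train_counts <- colSums(train)

print(overlap)
print(train_counts)
```

    Every entry of overlap is zero, and the two businesses rated by the test users have a training count of zero, so a user-based method has no neighbours to draw on. With the question's objects, swapping method = "UBCF" for method = "POPULAR" in the Recommender() call is one way to try the popularity approach mentioned above, since it does not depend on such overlap.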