I need to build a recommender from a Yelp database. I've filtered the business reviews and the users and created a realRatingMatrix
with user ratings for the respective businesses. The full matrix will be gigantic, so I'm testing with a very small matrix first (the mdat
matrix).
library(recommenderlab)
#learning matrix
learningM <- as(mdat[1:8,],"realRatingMatrix")
# matrix to predict user recommendations
testM <- as(mdat[9:10,],"realRatingMatrix")
#using the learning matrix to create a UBCF recommender
rec <- Recommender(learningM, method = "UBCF")
#function that should output 2 business recommendations to users of testM
pre <- predict(rec, testM, n=2)
Instead I receive an output like this:
> as(pre,"list")
[[1]]
character(0)
[[2]]
character(0)
Why am I getting this output? Is the predict function computing wrong results and therefore returning empty output, or are my business column names stored as a text type that it can't output correctly?
Edit: the mdat matrix, as requested. Sorry for not including it in the first place.
> dput(mdat)
structure(c(1, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 4, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, 5, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, 4, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 3,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 1, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, 4, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
3, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 5, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, 5), .Dim = c(10L, 10L), .Dimnames = list(
c("jqhP9mV2rYvmPdKvlOfp0g", "tqkkmL2NB19Sxeg1AaXnSA", "cNMJxBzmXA9N7krLvlrzlA",
"9v3uIUTitC043Y4Qs54K1g", "nLUwyI34R-cAHLnmEGeLIg", "6SUSTwhfSFva9nbIDmoN7Q",
"iCppbv3C7XvCyzIZnNQ7fg", "MUo7TLgR7sy1ob0MvxyPHQ", "GMVQyHMHNGplG3aof8jMcA",
"VNGevHJuTxcou-Nhm8Q5RQ"), c("iZYDZvXoIT648EZOnEP0pQ", "HQJjHA6BRcRD0vR5askdkQ",
"bul_5Ahk_QYLUAJ4Od27jg", "EOoj2h1Brzk1AhqScvIHDA", "roEQNfyPi3jRv3WFFr-f_g",
"ffp58kYSK7dJGs5ER-5txw", "pvlM--HZY1a8SqMXiwEz1A", "mta3FuoNzjjGWQr9TCHGhA",
"QeK3lOP-CTZS72YgeXiiqA", "57VozB9tq5SbNst9nO-jxA")))
As explained by the vignette, recommenderlab treats recommendation as a sparse matrix regression problem. If the new users in the test set have not rated any of the items that were rated in the training set, no similarity measure between the new users and the old users can be constructed. The algorithm needs to find some old users who rated the same items as the new users before it can recommend further items from those old users' ratings. In your mdat, every user has rated exactly one business and no two users have rated the same one, so UBCF finds no usable neighbors and returns character(0).
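The lack of overlap can be checked directly in base R, without recommenderlab. The following is only a sketch: it rebuilds a matrix with the same shape as mdat (one rating per user, on the diagonal), with placeholder row and column names instead of the real Yelp IDs, and the names `m`, `co_rated` and `any_overlap` are chosen here for illustration.

```r
# Same layout as mdat: 10 users x 10 businesses, one rating per user on the diagonal.
# "user1".."user10" and "biz1".."biz10" are placeholder names, not the real IDs.
ratings <- c(1, 4, 5, 4, 3, 1, 4, 3, 5, 5)
m <- matrix(NA_real_, nrow = 10, ncol = 10,
            dimnames = list(paste0("user", 1:10), paste0("biz", 1:10)))
diag(m) <- ratings

train <- m[1:8, ]   # the users the recommender learns from
test  <- m[9:10, ]  # the new users we want recommendations for

# Indicator matrices: 1 where a rating exists, 0 where it is NA
rated_train <- !is.na(train) * 1
rated_test  <- !is.na(test) * 1

# co_rated[t, u] = number of businesses rated by both test user t and training user u
co_rated <- rated_test %*% t(rated_train)
print(co_rated)

# TRUE only if at least one test user shares a rated business with a training user
any_overlap <- any(co_rated > 0)
print(any_overlap)
```

If `any_overlap` is FALSE for the real data as well, a neighborhood-based method has no co-rated items from which to compute user similarity, which would be consistent with the empty lists you are seeing.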
A simple popularity-based recommendation does not require finding any subset of raters who share ratings with the new users.