Solved – evaluating the performance of item-based collaborative filtering for binary (yes/no) product recommendations

machine-learning, recommender-system

I'm attempting to write some code for item-based collaborative filtering for product recommendations. The input has buyers as rows and products as columns, with a simple 0/1 flag indicating whether or not a buyer has bought an item. The output is a list of similar items for a given purchased item, ranked by cosine similarity.
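To make the setup concrete, here is a minimal sketch of that similarity step in base R, assuming the input is a plain 0/1 matrix with buyers as rows and products as columns (the toy data and the item names are made up for illustration):

    # Toy binary purchase matrix: buyers as rows, products as columns (illustrative data)
    purchases <- matrix(
      c(1, 1, 0, 0,
        1, 0, 1, 0,
        0, 1, 1, 1,
        1, 1, 0, 1),
      nrow = 4, byrow = TRUE,
      dimnames = list(paste0("buyer", 1:4), paste0("item", LETTERS[1:4]))
    )

    # Item-item cosine similarity: normalize each column, then take cross products
    col_norms  <- sqrt(colSums(purchases^2))
    normalized <- sweep(purchases, 2, col_norms, "/")
    item_sim   <- crossprod(normalized)   # items x items cosine similarity
    diag(item_sim) <- NA                  # ignore self-similarity

    # Top 3 most similar items for a given purchased item, e.g. "itemA"
    head(sort(item_sim["itemA", ], decreasing = TRUE), 3)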

I am attempting to measure the accuracy of a few different implementations, but I am not sure of the best approach. Most of the literature I find mentions some form of mean squared error, but that seems more applicable when the collaborative filtering algorithm predicts a rating (e.g. 4 out of 5 stars) rather than recommending which items a user will purchase.

One approach I was considering was as follows…

  • Split the data into training/holdout sets and train on the training data
  • For each item A in the catalog, select the users in the holdout set who bought A
  • Determine what percentage of those A-buyers also bought one of the top 3 recommended items for A

The above seems kind of arbitrary, but I think it could be useful for comparing two different algorithms when trained on the same data.
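For what it's worth, a rough sketch of that check, continuing from the item_sim matrix in the sketch above and assuming the holdout set is a 0/1 matrix with the same product columns, could look like this:

    # Hit rate for top-n recommendations, evaluated on a holdout 0/1 matrix
    # (assumes holdout has the same product columns as the training data)
    top_n_hit_rate <- function(item_sim, holdout, n = 3) {
      hits <- trials <- 0
      for (item in colnames(item_sim)) {
        top_n  <- names(sort(item_sim[item, ], decreasing = TRUE))[seq_len(n)]
        buyers <- holdout[holdout[, item] == 1, , drop = FALSE]  # holdout users who bought the item
        if (nrow(buyers) == 0) next
        # A "hit" = the buyer also bought at least one of the top-n similar items
        hits   <- hits + sum(rowSums(buyers[, top_n, drop = FALSE]) > 0)
        trials <- trials + nrow(buyers)
      }
      hits / trials  # fraction of (buyer, item) cases where a top-n recommendation was also bought
    }

Running the same function with the output of two different similarity implementations on the same train/holdout split would give a directly comparable number for each.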

Best Answer

Recommenderlab is an R package that has built-in functionality to train and evaluate item-item CF on a binary user-item matrix. If nothing else, reading the package vignette might give you some ideas. The author of the package also has a paper on evaluating top-N recommendations based on binary user-item data. Links below; the PDFs have the same file name but different content.

http://cran.r-project.org/web/packages/recommenderlab/vignettes/recommenderlab.pdf

http://machinelearning202.pbworks.com/w/file/fetch/44663680/recommenderlab.pdf
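For reference, a minimal recommenderlab workflow on binary data might look roughly like the following; treat it as a sketch and check the exact parameters (similarity measure, the given argument, etc.) against the vignette above. The purchases matrix is the 0/1 buyer-by-product matrix from the question.

    library(recommenderlab)

    # Coerce a 0/1 buyer-by-product matrix into recommenderlab's binary format
    ratings <- as(purchases, "binaryRatingMatrix")

    # Hold out part of the data; 'given' items per user are kept for making predictions
    # (given must not exceed the smallest number of purchases any user has)
    scheme <- evaluationScheme(ratings, method = "split", train = 0.8, given = 1)

    # Evaluate item-based CF as a top-N recommender at several list lengths
    results <- evaluate(scheme, method = "IBCF", type = "topNList", n = c(1, 3, 5, 10))
    avg(results)  # averaged TP/FP/FN/TN, precision and recall per list length

The precision/recall numbers from avg() serve the same purpose as the top-3 hit rate proposed in the question, so they give a ready-made way to compare implementations on binary data.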