Categorized | Social Network

Recommendation Systems: Interview with Satnam Alag

  • Sharebar

ReadWriteWeb: Netflix is offering $1 million to the team that can improve its recommendation algorithm by 10%. It’s been over 2 years now, with the leading company at 9.63%. There is some skepticism, though, that 10% will be reached anytime soon, because now the contestants are making only incremental progress. Do you expect the 10% mark to be reached soon?

Satnam: Netflix’s recommendation engine, Cinematch, uses an item-to-item algorithm (similar to Amazon’s) with a number of heuristics. Given that Netflix’ recommendation system has been very successful in the real world, it is pretty impressive that teams have been able to improve on it by as much as 9.63%. Of course, the Netflix competition doesn’t take into account speed of implementation or the scalability of the approach. It simply focuses on the quality of recommendations in terms of closing the gap between user rating and predicted rating. So, it isn’t clear whether Netflix will be able to leverage all of the innovation coming out of this competition. Also, the Netflix data doesn’t contain much information to allow for a content-based approach; it’s for this reason that teams are focusing on collaborative-based techniques.

The challenges to reaching the 10% mark are:

Skewed data: The data set for the competition consists of more than 100 million anonymous movie ratings, using a scale of one to five stars, made by 480,000 users for 17,770 movies. Note that the user-item data set for this problem is sparsely populated, with nearly 99% of user-item entries being zero. The distribution of movies per user is skewed. The median number of ratings per user is 93. About 10% of users rated 16 or fewer movies, while 25% of users rated 36 or fewer. Two users rated as many as 17,000 movies. Similarly, the ratings per movie are also skewed: nearly half the user base rated one well loved movie (Miss Congeniality); about 25% of movies had 190 or fewer ratings; and a handful of movies were rated fewer than 10 times.

The approach: The winning team, BellKor, spent more than 2,000 combined hours poring over data to find the winning solution. The winning solution was a linear combination of 107 sets of predictions. Many of the algorithms involved either the nearest-neighbor method (k-NN) or latent factor models, such as SVD/factorization and Restricted Boltzmann Machines (RBMs).

The winning solution uses k-NN to predict the rating for a user, using both the Pearson-r correlation and cosine methods to compute the similarities, with corrections to remove item-specific and user-specific biases. Latent semantic models are also widely used in the winning solution.

The BellKor team found it vital to use a variety of models that compensated for each other’s shortcomings. No one model alone could have gotten the BellKor team to the top of the competition. The combined set of models achieved an improvement of 8.43% over Cinematch, while the best model — a hybrid of k-NN applied to output from RBMs — improved the result by 6.43%. The largest improvement by LSI methods was 5.1%, with the best pure k-NN model scoring below that. (K for the k-NN methods was in the range of 20 to 50.) The BellKor team also applied a number of heuristics to further improve the results.

The BellKor team demonstrates a number of guidelines for building a winning solution to this kind of competition:

  • Combining complementary models helps improve the overall solution. Note that a linear combination of three models, one each for k-NN, LSI, and RBM, would have yielded honestly excellent results, an improvement of 7.58%.
  • A principled approach is needed to optimize the solution.
  • The key to winning is building models that can accurately predict when there is sufficient data, without over-applying in the absence of adequate data.

The final solution will be along the same lines, combining multiple models with heuristics. Contestants will probably reach the magic 10% mark in the next year or two.

ReadWriteWeb: Some people reckon the 10% mark can’t be reached with algorithms alone, but that the “human” element is required. For example, ClerkDogs is a service that hires actual former video-store clerks to “make a database that is much richer and deeper than the collaborative filtering engines.” It’s a similar approach to that of Pandora, which has 50 employees who listen to and tag songs. How far do you reckon algorithms can go in making recommendations?

Satnam: Recommendation systems are not perfect. A number of elements go into making successful ones, including approach, the speed of computing results, heuristics, the exploration and exploitation of coefficients, and so on. But it has been shown in the real world that the more personalized you can make recommendations, the higher the click-through rate, the stickier the application, and the lower the bounce rate.

Using humans to form a rich database for recommendations may work for small applications, but it would probably be too expensive to scale. I don’t see them competing against each other, human versus machine. Even with human/expert recommendations, one first needs to find a human/expert with tastes similar to those of the user, especially if you want to go after the long tail.

Discuss

Read More




This post was written by:

- who has written 179 posts on Indometric.


Contact the author

Leave a Reply

Partner Links

top online storage