Skip to Content

<b>COMPLEX SYSTEMS SEMINAR</b><br>Regulating Greed Over Time: An Important Lesson For Practical Recommender Systems

Tuesday, April 5, 2016
12:00 AM
411 West Hall

There is an important aspect of practical recommender systems that we noticed while competing in the ICML Exploration-Exploitation 3 data mining competition. The goal of the competition was to build a better recommender system for Yahoo!'s Front Page, which provides personalized new article recommendations. The main strategy we used was to carefully control the balance between exploiting good articles and exploring new ones in the multi-armed bandit setting. This strategy was based on our observation that there were clear trends over time in the click-through-rates of the articles. At certain times, we should explore new articles more often, and at certain times, we should reduce exploration and just show the best articles available. This led to dramatic performance improvements.

As it turns out, the observation we made in the Yahoo! data is in fact pervasive in settings where recommender systems are currently used. This observation is simply that certain times are more important than others for correct recommendations to be made. This affects the way exploration and exploitation (greed) should change in our algorithms over time. We thus formalize a setting where regulating greed over time can be provably beneficial. This is captured through regret bounds and leads to principled algorithms. The end result is a framework for bandit-style recommender systems in which certain times are more important than others for making a correct decision.

If time permits, I will discuss a separate project, which is an approach to decision tree (rule list) learning. This method does not have the disadvantage of greedy splitting and pruning that haunts decision tree algorithms. It yields sparse logical models in a computationally efficient way. It is a fierce competitor for decision tree methods on a wide variety of problems, and it is more principled.

The work on multi-armed bandits is joint work with Stefano Traca, Ed Su, and Ta Chiraphadhanakul. The work on decision lists is joint with Hongyu Yang and Margo Seltzer.