In the usual setup of supervised learning, there is little interaction between human and machine: the human labels a data set and then departs; at some later time, the machine is started up, given the data, and told to find a good classifier.
"Interactive learning" refers to scenarios in which the human engages with the machine while learning is taking place. There are countless ways in which this could happen, for instance:
- The machine may request labels of just a few points that are chosen adaptively, rather than requiring everything to be labeled in advance.
- If asked, the human may indicate relevant features, for instance by highlighting words in a document that are highly indicative of its label.
- For traditionally unsupervised tasks like clustering or embedding, an iterative refinement process can be used to bring the final result into line with the human's needs.
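The first scenario above, adaptive label requests, can be illustrated with a toy example. This is only a minimal sketch and not the protocol described in the talk: it assumes one-dimensional points whose labels are 0 below some unknown threshold and 1 at or above it, and the names `learn_threshold` and `ask_label` are hypothetical.

```python
import random


def learn_threshold(points, ask_label):
    """Find the boundary between 0- and 1-labeled points by binary search.

    Assumption (for illustration only): labels are 0 below an unknown
    threshold and 1 at or above it. `ask_label` stands in for the human
    and is called only on the points the learner chooses to query.
    """
    xs = sorted(points)
    lo, hi = 0, len(xs)  # index of the first 1-labeled point lies in [lo, hi]
    queries = 0
    while lo < hi:
        mid = (lo + hi) // 2
        queries += 1
        if ask_label(xs[mid]) == 1:
            hi = mid       # first positive is at mid or earlier
        else:
            lo = mid + 1   # first positive is after mid
    return xs, lo, queries


# Usage: 1000 unlabeled points, a simulated human with threshold 0.37.
random.seed(0)
pool = [random.random() for _ in range(1000)]
xs, first_positive, queries = learn_threshold(pool, lambda x: int(x >= 0.37))
print(f"labels requested: {queries} of {len(pool)}")
```

Under this assumption, every point's label is determined after roughly log2(n) requests rather than n, which is the kind of saving that adaptive querying aims for.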
I will describe a general protocol for interactive learning that covers such scenarios and admits generic learning algorithms. This framework also yields bounds on the interaction complexity of learning. I will illustrate these ideas with a simple and practical interactive scheme for learning hierarchies. It is based on a novel and intuitive cost function for hierarchical clustering, together with a fast algorithm for approximately optimizing it.
Sanjoy Dasgupta is a Professor in the Department of Computer Science and Engineering at UC San Diego. He received his PhD from UC Berkeley in 2000. His area of research is algorithmic statistics, with a focus on unsupervised and minimally supervised learning. He is the author of a textbook, "Algorithms" (with Christos Papadimitriou and Umesh Vazirani).
Faculty Hosts: Nina Balcan, Aarti Singh