Please use this identifier to cite or link to this item:
http://arks.princeton.edu/ark:/88435/dsp01dv13zt22m
Title: | Game theory and optimization in boosting
Authors: | Mukherjee, Indraneel |
Advisors: | Schapire, Robert E. |
Contributors: | Computer Science Department |
Keywords: | boosting; game theory; optimization
Subjects: | Computer science |
Issue Date: | 2011 |
Publisher: | Princeton, NJ : Princeton University |
Abstract: | Boosting is a central technique of machine learning, the branch of artificial intelligence concerned with designing computer programs that build increasingly better models of reality as they are presented with more data. The theory of boosting is based on the observation that combining several models with low predictive power can often lead to a significant boost in the accuracy of the combined meta-model. This approach, introduced about twenty years ago, has been a prolific area of research and has proved immensely successful in practice. However, despite extensive work, many basic questions about boosting remain unanswered. In this thesis, we increase our understanding of three such theoretical aspects of boosting.

In Chapter 2 we study the convergence properties of the best-known boosting algorithm, AdaBoost. Rate bounds for this important algorithm were previously known only for special situations that rarely hold in practice. Our work guarantees that fast rates hold in all situations, and the bounds we provide are optimal. Apart from being important for practitioners, this bound also has implications for the statistical properties of AdaBoost.

Like AdaBoost, most boosting algorithms are used for classification tasks, where the objective is to create a model that can categorize relevant input data into one of a finite number of classes. The most commonly studied setting is binary classification, where there are only two possible classes, although the tasks arising in practice are almost always multiclass in nature. In Chapter 3 we provide a broad and general framework for studying boosting for multiclass classification. Using this approach, we identify for the first time the minimum assumptions under which boosting accuracy is possible in the multiclass setting. Such a theory existed previously for binary classification, but straightforward extensions of it to the multiclass setting lead to assumptions that are either too strong or too weak for effective boosting. We also design boosting algorithms using these minimal assumptions, which work in more general situations than previous algorithms that assumed too much.

In the final chapter, we study the problem of learning from expert advice, which is closely related to boosting. The goal is to extract useful advice from the opinions of a group of experts even when there is no consensus among the experts themselves. Although algorithms with excellent guarantees have long existed for this task, they were only approximately optimal, and exactly optimal strategies were known only when the experts gave binary "yes/no" opinions. Our work derives exactly optimal strategies when the experts provide probabilistic opinions, which can be more nuanced than deterministic ones. In terms of boosting, this provides the optimal way of combining individual models that attach confidence ratings to their predictions, indicating predictive quality.
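To make the boosting idea above concrete (combining several weak models into a strong meta-model), here is a minimal sketch of AdaBoost, the algorithm studied in Chapter 2. This is an illustrative Python/NumPy implementation, not code from the thesis; the decision-stump weak learner, the synthetic demo data, and all function names are assumptions made for the example.

```python
import numpy as np

def train_stump(X, y, w):
    """Find the decision stump (feature, threshold, sign) minimizing
    weighted error under the distribution w. Labels y are in {-1, +1}."""
    best, best_err = None, np.inf
    for j in range(X.shape[1]):
        for thresh in np.unique(X[:, j]):
            for sign in (+1, -1):
                pred = sign * np.where(X[:, j] <= thresh, 1, -1)
                err = np.sum(w[pred != y])
                if err < best_err:
                    best_err, best = err, (j, thresh, sign)
    return best, best_err

def stump_predict(stump, X):
    j, thresh, sign = stump
    return sign * np.where(X[:, j] <= thresh, 1, -1)

def adaboost(X, y, n_rounds=20):
    """Classic AdaBoost loop: each round, up-weight the examples the
    current weak learner gets wrong, so the next one focuses on them."""
    n = len(y)
    w = np.full(n, 1.0 / n)                    # uniform initial distribution
    ensemble = []
    for _ in range(n_rounds):
        stump, err = train_stump(X, y, w)
        err = max(err, 1e-12)                  # guard against err == 0
        alpha = 0.5 * np.log((1 - err) / err)  # this round's vote weight
        pred = stump_predict(stump, X)
        w *= np.exp(-alpha * y * pred)         # up-weight mistakes
        w /= w.sum()                           # renormalize to a distribution
        ensemble.append((alpha, stump))
    return ensemble

def predict(ensemble, X):
    """Final model: weighted majority vote of the weak learners."""
    agg = sum(alpha * stump_predict(stump, X) for alpha, stump in ensemble)
    return np.sign(agg)

# Tiny demo on synthetic data: individually weak stumps combine into
# an accurate meta-model.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = np.where(X[:, 0] + X[:, 1] > 0, 1, -1)
ensemble = adaboost(X, y, n_rounds=30)
print("training accuracy:", np.mean(predict(ensemble, X) == y))
```

The reweighting step is the heart of the method: it is exactly the convergence behavior of this loop whose rates the thesis's Chapter 2 analyzes.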
URI: | http://arks.princeton.edu/ark:/88435/dsp01dv13zt22m |
Alternate format: | The Mudd Manuscript Library retains one bound copy of each dissertation. Search for these copies in the library's main catalog.
Type of Material: | Academic dissertations (Ph.D.) |
Language: | en |
Appears in Collections: | Computer Science |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
Mukherjee_princeton_0181D_10001.pdf | | 3.97 MB | Adobe PDF | View/Download |