Please use this identifier to cite or link to this item:
http://arks.princeton.edu/ark:/88435/dsp01kp78gj709
Title: Keyword-assisted LDA: Exploring New Methods for Supervised Topic Modeling
Authors: Abdurehman, Rahji
Advisors: Imai, Kosuke
Department: Computer Science
Class Year: 2015
Abstract: This paper introduces an alternative to the popular machine learning algorithm known as Latent Dirichlet Allocation, or LDA for short. In this paper we derive the theory behind this alternative algorithm and demonstrate a specific use case for it with sample results. We call this new algorithm "keyword-assisted LDA". It works by taking a set of constraints which are set based on prior knowledge of the underlying topic structure within a corpus and then ensuring that they are maintained. Depending on one’s underlying implementation of LDA, keeping these constraints in order takes a variety of forms. This paper delves into the details for implementations using Gibbs sampling or Expectation-Maximization.
Extent: 36 pages
URI: http://arks.princeton.edu/ark:/88435/dsp01kp78gj709
Access Restrictions: Walk-in Access. This thesis can only be viewed on computer terminals at the Mudd Manuscript Library.
Type of Material: Princeton University Senior Theses
Language: en_US
Appears in Collections: Computer Science, 1988-2020
Files in This Item:
File | Size | Format
---|---|---
PUTheses2015-Abdurehman_Rahji.pdf | 609.88 kB | Adobe PDF