Please use this identifier to cite or link to this item: http://arks.princeton.edu/ark:/88435/dsp01gf06g283w
Full metadata record
DC Field | Value | Language
dc.contributor.advisor | Lynch, Scott |
dc.contributor.author | Sharma, Rohan |
dc.date.accessioned | 2014-07-17T19:28:54Z |
dc.date.available | 2014-07-17T19:28:54Z |
dc.date.created | 2014-05 |
dc.date.issued | 2014-07-17 |
dc.identifier.uri | http://arks.princeton.edu/ark:/88435/dsp01gf06g283w |
dc.description.abstract | The internet has revolutionized the distribution of news media, and further, allows users to quickly publicize responses to online news media through public commenting. The New York Times online commenting platform also allows registered users to "recommend" comments, thereby crowdsourcing a measure of comment quality. This research attempts to discover relationships between comment recommendation count, comment text and other associated metadata (e.g. newspaper section, time posted) by conducting an in-depth exploration of the New York Times comment dataset (2005 - 2013). In this paper, we review descriptive statistics of the dataset and applicable methods for metadata sourcing, text vectorization and supervised learning. We find recommendation prediction is best modeled in terms of classification using the Naive Bayes learning algorithm. We are able to incorporate metadata features using classifier stacking, a form of ensemble learning, to boost performance. We then discuss the results in the broader context of user-generated internet content and crowdsourcing measures of content quality. | en_US
dc.format.extent | 28 pages | en_US
dc.language.iso | en_US | en_US
dc.title | Optimal Commenting: Predictive Analytics On NYTimes Data | en_US
dc.type | Princeton University Senior Theses |
pu.date.classyear | 2014 | en_US
pu.department | Computer Science | en_US
pu.pdf.coverpage | SeniorThesisCoverPage |
dc.rights.accessRights | Walk-in Access. This thesis can only be viewed on computer terminals at the Mudd Manuscript Library (http://mudd.princeton.edu). |
Appears in Collections: Computer Science, 1988-2020

Files in This Item:
File | Size | Format
sharma_rohan_Thesis.pdf | 932.62 kB | Adobe PDF


Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.