Please use this identifier to cite or link to this item:
http://arks.princeton.edu/ark:/88435/dsp01b2773z41k
Title: | Computational Reproducibility and the Fragile Families Challenge: Lessons Learned and Suggestions for the Future |
Authors: | Liu, David |
Advisors: | Salganik, Matthew J |
Department: | Computer Science |
Certificate Program: | Center for Statistics and Machine Learning |
Class Year: | 2018 |
Abstract: | As the availability of social data and reliance on computational methods increase, there is a need to establish guidelines for computational reproducibility in the social sciences. The Fragile Families Challenge presented a unique case study in which interdisciplinary researchers developed social prediction models and then submitted papers for review. Based on our experience reproducing the results as part of a journal review process, we propose a set of guidelines that can improve the reproducibility of open-sourced code. Our findings suggest that open-sourcing data and code is a crucial first step towards computational reproducibility but leaves the replicator with the task of configuring an appropriate computing environment and parsing the code structure. By leveraging virtualization and pipeline design (tools and concepts from software engineering), we develop a set of guidelines that journal editors can adopt. In the case of Fragile Families, these guidelines are shown to be simple enough for adoption yet effective in rendering code more transparent. We further demonstrate the rewards of reproducibility by developing an extension to one of the Challenge's submissions that improves the model's mean squared error. |
URI: | http://arks.princeton.edu/ark:/88435/dsp01b2773z41k |
Type of Material: | Princeton University Senior Theses |
Language: | en |
Appears in Collections: | Computer Science, 1988-2020 |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
LIU-DAVID-THESIS.pdf | | 859.76 kB | Adobe PDF | Request a copy |
Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.