Please use this identifier to cite or link to this item:
http://arks.princeton.edu/ark:/88435/dsp015h73q0108
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | Singh, Mona | |
dc.contributor.author | Todd, David | |
dc.date.accessioned | 2020-10-01T21:26:23Z | - |
dc.date.available | 2020-10-01T21:26:23Z | - |
dc.date.created | 2020-05-03 | |
dc.date.issued | 2020-10-01 | - |
dc.identifier.uri | http://arks.princeton.edu/ark:/88435/dsp015h73q0108 | - |
dc.description.abstract | Characterizing proteins, which mediate a wide array of cellular processes by binding various ligands, is a major aim of computational biology. While proteins may contain hundreds of amino acids, often only a few are typically involved in interactions with biologically relevant ligands. The most direct approach to determine which amino acid residues within a protein are involved in binding is through experimental methods, but only relatively few proteins have been captured in complex with a relevant ligand. To bridge this gap, we train a bidirectional Long Short-Term Memory (BiLSTM) model to predict the binding properties of each amino acid position from sequence-based features for five ligand groups: DNA, RNA, protein, ion, and metabolite. To increase power, we extend our set of true labels beyond the limited experimental data by using protein domain-based inferred binding scores. We then evaluate our model by measuring performance on a held-out test set, and compare performance to a baseline XGBoost model, as well as an existing method. In both these comparisons, our model performs at least as well or better for all ligand groups. Because they reflect the binding potential of individual amino acid sites, our predictions can also provide insight into both healthy and diseased protein function. | |
dc.format.mimetype | application/pdf | |
dc.language.iso | en | |
dc.title | Identifying Functional Protein Positions Using Neural Networks | |
dc.type | Princeton University Senior Theses | |
pu.date.classyear | 2020 | |
pu.department | Computer Science | |
pu.pdf.coverpage | SeniorThesisCoverPage | |
pu.contributor.authorid | 961272339 | |
Appears in Collections: | Computer Science, 1988-2020 |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
TODD-DAVID-THESIS.pdf | | 885.86 kB | Adobe PDF | Request a copy |
Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.