Please use this identifier to cite or link to this item:
http://arks.princeton.edu/ark:/88435/dsp018s45qc82s
Title: | Automated Trainable Data Clustering With Applications in Astronomy |
Authors: | Minns, Charlie |
Advisors: | Melchior, Peter |
Department: | Physics |
Certificate Program: | Applications of Computing Program |
Class Year: | 2020 |
Abstract: | Clustering is one of the most commonly used techniques in data science: it divides a dataset into groups so that the points in each group have similar properties. Many clustering methods exist, differing in efficiency and accuracy; one of them is Hierarchical Density-Based Spatial Clustering of Applications with Noise (HDBSCAN). In this thesis, I discuss my research on applications of this method in astronomy and remote sensing. Using HDBSCAN, I clustered hyperspectral images of the Nili Fossae region on Mars and obtained results consistent with the existing literature, which demonstrates that the algorithm can produce reliable clusterings. I also investigated different metrics used to cluster data in extragalactic surveys and measured the clustering success of HDBSCAN using training datasets. By comparing the success of different results, I found that the input parameters can be tuned to improve the clustering result. This thesis demonstrates the current capabilities of HDBSCAN and explores ways to improve the algorithm so that the clustering process becomes more autonomous. |
Type of Material: | Princeton University Senior Theses |
Language: | en |
Appears in Collections: | Physics, 1936-2020 |
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
MINNS-CHARLIE-THESIS.pdf | | 1.12 MB | Adobe PDF |