Learning the Schematic Structure of a World: Contextual Understanding of Stochastically Generated Stories in Neural Networks

Chen, Cathy

Please use this identifier to cite or link to this item: http://arks.princeton.edu/ark:/88435/dsp012f75rb767

Title:	Learning the Schematic Structure of a World: Contextual Understanding of Stochastically Generated Stories in Neural Networks
Authors:	Chen, Cathy
Advisors:	Norman, Kenneth
Department:	Computer Science
Class Year:	2018
Abstract:	In the psychology literature, schema theory posits that we build a general representation of the world over time, and that this representation serves as a frame which we fill with new information in specific situations. This requires us both to learn the contextual information stored in schemata and to apply these frames to new situations. Artificial neural networks have been shown to solve many pattern-recognition problems, such as recognizing images, playing games, and answering questions about structured data. In this work, we study whether they can also learn and apply contextual information. We use stochastically generated stories as a test of these abilities, and investigate networks' performance on two types of contextual learning: next state prediction (learning a series of state transitions, which may depend on characters in the story) and role-filler binding (learning to identify the character that fills a specified role). We test architectures with different classes of memory and find that networks are unable to generalize to new fillers if they are exposed to a finite set of fillers during training. However, by continually introducing new fillers during training, we succeed in training networks to generalize to previously unseen fillers. We find qualitative differences in learning between networks with different classes of memory: in what networks are able to learn, how they make mistakes, and how robust they are to slight story modifications. We also find that task difficulty influences the order of learning and that the training curriculum influences the success of learning. Networks that learn to perform multiple tasks display a step-wise learning curve, learning easier tasks first before proceeding to more difficult ones. A curriculum that trains on a single task before introducing additional tasks (rather than presenting both tasks throughout training) allows networks that previously could learn only one task to learn both.
URI:	http://arks.princeton.edu/ark:/88435/dsp012f75rb767
Type of Material:	Princeton University Senior Theses
Language:	en
Appears in Collections:	Computer Science, 1988-2020

Files in This Item:

File	Description	Size	Format
CHEN-CATHY-THESIS.pdf		1.08 MB	Adobe PDF
