One-shot Learning of Word Meaning with Distributional Models

Date
Thursday, May 11, 2017, 4:30pm
Location
Margaret Jacks Hall, Greenberg Room (460-126)
Speaker
Katrin Erk
University of Texas at Austin


When humans encounter an unknown word in text, they can often infer approximately what it means, sometimes even from a single occurrence. Distributional models can induce word meaning representations from text, but they typically need hundreds of instances of a word. Can they really learn nothing useful from less data? We tested probabilistic distributional models on their ability to do one-shot learning of definitional properties ("alligators are dangerous", "alligators are animals") from text alone. We found that one-shot learning improves when the model first learns overarching structure in the known data, that is, regularities in textual contexts and in properties, and that individual context items can be highly informative. I connect this work to the larger question of how learning from textual data can be described in a probabilistic semantics setting.
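To make the setup concrete, here is a minimal illustrative sketch of one-shot property inference in the spirit of the abstract: it fits a linear (ridge-regression) map from distributional context vectors to definitional-property scores on known words, then applies that map to a single context vector of a novel word. All data and names (`context_vectors`, `property_matrix`, the synthetic generator) are assumptions for illustration, not the speaker's actual probabilistic model.

```python
# Minimal sketch (assumption: not the actual model presented in the talk).
# Idea: learn regularities linking textual contexts to definitional properties
# on known words, then predict properties for a new word seen only once.
import numpy as np

rng = np.random.default_rng(0)

n_known, dim, n_props = 200, 50, 10
# Hypothetical distributional vectors for known words (e.g., averaged contexts).
context_vectors = rng.normal(size=(n_known, dim))
# Hypothetical binary property annotations ("is dangerous", "is an animal", ...),
# generated here from a synthetic linear rule so the toy example is learnable.
true_map = rng.normal(size=(dim, n_props))
property_matrix = (context_vectors @ true_map > 0).astype(float)

# "Overarching structure": a ridge-regression map from contexts to properties,
# fit on all known words (closed form: W = (X^T X + lam*I)^-1 X^T Y).
lam = 1.0
X, Y = context_vectors, property_matrix
W = np.linalg.solve(X.T @ X + lam * np.eye(dim), X.T @ Y)

# One-shot case: a novel word observed in a single textual context.
single_context = rng.normal(size=dim)
scores = single_context @ W          # higher score = property judged more likely
print("top predicted properties:", np.argsort(scores)[::-1][:3])
```

In this toy version, the ridge map plays the role of the structure learned from known data: once it is in place, a single informative context vector is enough to score properties for a new word.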