Sequence to Sequence Learning: Fast Training and Inference with Gated Convolutions

Date
Thursday, March 8, 2018, 11:00 am
Location
Gates Building, Room 219
Michael Auli
Facebook AI Research


Neural architectures for machine translation and language modeling are currently a very active research area. The first part of this talk introduces several architectural changes relative to the attentional encoder-decoder of Bahdanau et al. (2014): we replace standard non-linearities with our novel gated linear units, replace recurrent units with convolutions, and introduce multi-hop attention. These changes improve generalization performance, training efficiency, and decoding speed. The second part of the talk analyzes the properties of the distribution predicted by the model and how these properties influence search.
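For reference, a gated linear unit computes h(X) = (X*W + b) ⊗ σ(X*V + c), letting half of a convolution's output gate the other half. Below is a minimal PyTorch sketch of this idea applied to a 1-D convolution; the class name and hyperparameters are illustrative assumptions, not drawn from the talk itself.

```python
import torch
import torch.nn as nn

class GatedConv1d(nn.Module):
    """Illustrative gated convolutional layer: a * sigmoid(b),
    where a and b are the two halves of a single convolution's output.
    """

    def __init__(self, in_channels, out_channels, kernel_width):
        super().__init__()
        # Produce 2 * out_channels so the output can be split into
        # a linear half and a gating half.
        self.conv = nn.Conv1d(in_channels, 2 * out_channels,
                              kernel_width, padding=kernel_width // 2)

    def forward(self, x):
        # x: (batch, in_channels, sequence_length)
        a, b = self.conv(x).chunk(2, dim=1)  # linear part, gate part
        return a * torch.sigmoid(b)          # element-wise gating

# Example usage on a toy batch of embedded tokens.
layer = GatedConv1d(in_channels=32, out_channels=32, kernel_width=3)
x = torch.randn(4, 32, 20)   # (batch, channels, sequence_length)
y = layer(x)                  # same shape as x: (4, 32, 20)
```

PyTorch also ships this gating pattern as torch.nn.functional.glu, which splits a tensor along a given dimension and applies the same a * sigmoid(b) combination.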