Sequence to Sequence Learning: Fast Training and Inference with Gated Convolutions

Date
Thursday, March 8, 2018, 11:00 am
Location
Gates Building, Room 219
Michael Auli
Facebook AI Research


Neural architectures for machine translation and language modeling are currently a very active research area. The first part of this talk introduces several architectural changes relative to the attentional encoder-decoder of Bahdanau et al. (2014): we replace standard non-linearities with our novel gated linear units, replace recurrent units with convolutions, and introduce multi-hop attention. These changes improve generalization performance, training efficiency, and decoding speed. The second part of the talk analyzes the properties of the distribution predicted by the model and how these properties influence search.
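For reference, a gated linear unit computes h(X) = (X*W + b) ⊗ σ(X*V + c), letting half of a convolution's output gate the other half. Below is a minimal PyTorch sketch of this idea applied to a 1-D convolution; the class name and hyperparameters are illustrative assumptions, not drawn from the talk itself.

```python
import torch
import torch.nn as nn

class GatedConv1d(nn.Module):
    """Illustrative gated convolutional layer: a * sigmoid(b),
    where a and b are the two halves of a single convolution's output.
    """

    def __init__(self, in_channels, out_channels, kernel_width):
        super().__init__()
        # Produce 2 * out_channels so the output can be split into
        # a linear half and a gating half.
        self.conv = nn.Conv1d(in_channels, 2 * out_channels,
                              kernel_width, padding=kernel_width // 2)

    def forward(self, x):
        # x: (batch, in_channels, sequence_length)
        a, b = self.conv(x).chunk(2, dim=1)  # linear part, gate part
        return a * torch.sigmoid(b)          # element-wise gating

# Example usage on a toy batch of embedded tokens.
layer = GatedConv1d(in_channels=32, out_channels=32, kernel_width=3)
x = torch.randn(4, 32, 20)   # (batch, channels, sequence_length)
y = layer(x)                  # same shape as x: (4, 32, 20)
```

PyTorch also ships this gating pattern as torch.nn.functional.glu, which splits a tensor along a given dimension and applies the same a * sigmoid(b) combination.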