The representational geometry of word meanings acquired by neural machine translation models

Felix Hill; Kyunghyun Cho; Sébastien Jean; Yoshua Bengio

doi:10.1007/s10590-017-9194-2

Source

Machine Translation, 2017, volume 31, issue 1-2, pages 3-18

Abstract

This work is the first comprehensive analysis of the properties of word embeddings learned by neural machine translation (NMT) models trained on bilingual texts. We show that the word representations of NMT models outperform those learned from monolingual text by established algorithms such as Skipgram and CBOW on tasks that require knowledge of semantic similarity and/or lexical–syntactic role. These effects hold when translating from English to French and from English to German, and we argue that the desirable properties of NMT word embeddings should emerge largely independently of the source and target languages. Further, we apply a recently proposed heuristic method for training NMT models with very large vocabularies, and show that this vocabulary expansion method results in minimal degradation of embedding quality. This allows us to make a large vocabulary of NMT embeddings available for future research and applications. Overall, our analyses indicate that NMT embeddings should be used in applications that require word concepts to be organised according to similarity and/or lexical function, while monolingual embeddings are better suited to modelling (nonspecific) inter-word relatedness.