Title: Strengthening Low-resource Neural Machine Translation through Joint Learning: The Case of Farsi-Spanish
This paper describes a systematic study of an approach to Farsi-Spanish low-resource Neural Machine Translation (NMT) that leverages monolingual data for joint learning of forward and backward translation models. The training process begins with two pre-trained translation models that are iteratively updated by reducing translation cost. In each iteration, each translation model translates monolingual texts from one language into the other, generating a synthetic dataset for the opposite translation model. Two new translation models are then learned from the bilingual data together with these synthetic texts. The key difference between our approach and standard NMT is this iterative learning process, which improves the performance of both translation models while producing a higher-quality synthetic training dataset with each iteration. Our empirical results demonstrate that this approach outperforms the baselines.
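The following is a minimal, self-contained sketch of the iterative joint-learning loop summarized in the abstract. The helper names (train_model, translate_corpus) and the placeholder "model" objects are hypothetical stand-ins, not the authors' implementation; a real system would train NMT models with an NMT toolkit rather than these stubs.

```python
def train_model(parallel_pairs):
    """Hypothetical stand-in: 'fit' a translation model on (src, tgt) pairs."""
    return {"training_pairs": list(parallel_pairs)}

def translate_corpus(model, sentences):
    """Hypothetical stand-in: translate each monolingual sentence with the model."""
    return [f"<translation of: {s}>" for s in sentences]

def joint_learning(bitext_fa_es, mono_fa, mono_es, iterations=3):
    """bitext_fa_es: list of (farsi, spanish) sentence pairs."""
    # Start from two pre-trained models, one per translation direction.
    fwd = train_model(bitext_fa_es)                           # Farsi -> Spanish
    bwd = train_model([(es, fa) for fa, es in bitext_fa_es])  # Spanish -> Farsi

    for _ in range(iterations):
        # Each model back-translates monolingual text, producing synthetic
        # parallel data for the model in the opposite direction.
        synth_fa_es = list(zip(translate_corpus(bwd, mono_es), mono_es))  # (synthetic fa, real es)
        synth_es_fa = list(zip(translate_corpus(fwd, mono_fa), mono_fa))  # (synthetic es, real fa)

        # Retrain both models on the real bitext plus the fresh synthetic data.
        fwd = train_model(bitext_fa_es + synth_fa_es)
        bwd = train_model([(es, fa) for fa, es in bitext_fa_es] + synth_es_fa)
    return fwd, bwd

# Toy call with placeholder data.
fwd, bwd = joint_learning(
    bitext_fa_es=[("<farsi sentence>", "<spanish sentence>")],
    mono_fa=["<farsi-only sentence>"],
    mono_es=["<spanish-only sentence>"],
)
```

Each pass through the loop should yield better back-translations, and therefore cleaner synthetic data for the next pass, which is the mechanism the abstract credits for the improvement over the baselines.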
Award ID(s):
1840191
NSF-PAR ID:
10298444
Author(s) / Creator(s):
Date Published:
Journal Name:
Proceedings of the 13th International Conference on Agents and Artificial Intelligence - Volume 1: NLPinAI
Volume:
1
Page Range / eLocation ID:
475 to 481
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1.
    Neural Machine Translation (NMT) trains a neural network with an encoder-decoder architecture. However, the quality of neural translations depends predominantly on the availability of a large bilingual training dataset. In this paper, we explore the performance of translations produced by attention-based NMT systems for the Spanish-to-Persian low-resource language pair. We analyze the errors these NMT systems make in Persian and provide an in-depth comparison of system performance as sentence length and training-set size vary. We evaluate our translations using BLEU and human judgments of adequacy, fluency, and overall rating. (A minimal BLEU-scoring example appears after this list.)
  2.
    In this paper, we propose a useful optimization method for low-resource Neural Machine Translation (NMT) by investigating the effectiveness of multiple neural network optimization algorithms. Our results confirm that applying the proposed optimization method to English-Persian translation yields higher translation quality than the English-Persian Statistical Machine Translation (SMT) paradigm.
  3. Noisy channel models have been especially effective in neural machine translation (NMT). However, recent approaches like "beam search and rerank" (BSR) incur significant computation overhead during inference, making real-world application infeasible. We study whether it is possible to build an amortized noisy channel NMT model such that greedy decoding at inference time matches BSR in terms of reward (based on the source-to-target log probability and the target-to-source log probability) and quality (based on BLEU and BLEURT). We attempt three approaches to train the new model: knowledge distillation, one-step-deviation imitation learning, and Q learning. The first approach obtains the noisy channel signal from a pseudo-corpus, and the latter two approaches aim to optimize toward a noisy-channel MT reward directly. For all three approaches, the generated translations fail to achieve rewards comparable to BSR, but the translation quality as approximated by BLEU and BLEURT is similar to the quality of BSR-produced translations. Additionally, all three approaches speed up inference by 1-2 orders of magnitude. (A minimal sketch of the noisy-channel reranking score appears after this list.)
  4. Learning target-side syntactic structure has been shown to improve Neural Machine Translation (NMT). However, incorporating syntax through latent variables introduces additional complexity in inference, as the models need to marginalize over the latent syntactic structures. To avoid this, models often resort to greedy search, which only allows them to explore a limited portion of the latent space. In this work, we introduce a new latent variable model, LaSyn, that captures the co-dependence between syntax and semantics while allowing for effective and efficient inference over the latent space. LaSyn decouples the direct dependence between successive latent variables, which allows its decoder to exhaustively search through the latent syntactic choices while keeping decoding speed proportional to the size of the latent variable vocabulary. We implement LaSyn by modifying a transformer-based NMT system and design a neural expectation-maximization algorithm in which the latent sequences are regularized with part-of-speech information. Evaluations on four different MT tasks show that incorporating target-side syntax with LaSyn improves translation quality and also provides an opportunity to improve diversity. (A toy sketch of the per-step latent marginalization appears after this list.)
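In the spirit of the automatic evaluation mentioned in item 1 above, here is a minimal example of corpus-level BLEU scoring with the sacrebleu library. The sentences are illustrative placeholders, not data from the paper, and the paper's human evaluation (adequacy, fluency, overall rating) is not reproducible in code.

```python
import sacrebleu

# Illustrative placeholder system outputs and references.
hypotheses = ["this is a small test translation", "the model translates this sentence"]
references = [["this is a small test translation", "the model translated this sentence"]]

bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(f"corpus BLEU = {bleu.score:.2f}")
```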
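Below is a minimal sketch of the noisy-channel reranking score referred to in item 3: each beam candidate y for a source x is scored by combining the source-to-target log probability log p(y|x) with the target-to-source ("channel") log probability log p(x|y). The scorer callables, the interpolation weight lam, and the toy usage are hypothetical; the beam search step of BSR is not shown.

```python
from typing import Callable, List, Tuple

def rerank_candidates(
    source: str,
    candidates: List[str],
    log_p_tgt_given_src: Callable[[str, str], float],  # log p(y | x)
    log_p_src_given_tgt: Callable[[str, str], float],  # log p(x | y)
    lam: float = 0.5,                                   # hypothetical interpolation weight
) -> List[Tuple[float, str]]:
    """Return candidates sorted by the combined noisy-channel reward (best first)."""
    scored = [
        (
            (1.0 - lam) * log_p_tgt_given_src(source, y)
            + lam * log_p_src_given_tgt(source, y),
            y,
        )
        for y in candidates
    ]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)

# Toy usage with dummy length-based scorers standing in for real NMT models.
best_score, best_translation = rerank_candidates(
    "una frase de ejemplo",
    ["a sample sentence", "an example phrase"],
    log_p_tgt_given_src=lambda x, y: -abs(len(x) - len(y)),
    log_p_src_given_tgt=lambda x, y: -abs(len(x) - len(y)),
)[0]
```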
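Finally, a toy numerical sketch of the exact marginalization idea in item 4 (hypothetical, not LaSyn's actual code): because successive latent tags are decoupled, each decoding step can sum over the entire latent (POS-like) tag vocabulary exactly, at a cost linear in its size. The shapes and weight matrices are made up for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def marginalized_word_dist(state, W_tag, W_word):
    """
    state : (d,)        decoder hidden state at one decoding step
    W_tag : (d, Z)      projects the state to latent-tag logits
    W_word: (Z, d, V)   one word-projection per latent tag
    Returns p(y_t | state) of shape (V,), with the latent tag summed out exactly.
    """
    p_tag = softmax(state @ W_tag)                        # (Z,)   p(z | state)
    word_logits = np.einsum('d,zdv->zv', state, W_word)   # (Z, V)
    p_word_given_tag = softmax(word_logits, axis=-1)      # p(y_t | z, state)
    return p_tag @ p_word_given_tag                       # exact marginal over z

# Toy usage with random weights: the result is a valid distribution over V words.
rng = np.random.default_rng(0)
d, Z, V = 8, 5, 20
p = marginalized_word_dist(rng.normal(size=d),
                           rng.normal(size=(d, Z)),
                           rng.normal(size=(Z, d, V)))
assert np.isclose(p.sum(), 1.0)
```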