From Priest to Doctor: Domain Adaptation for Low-Resource Neural Machine Translation

Marashian, Ali; Rice, Enora; Gessler, Luke; Palmer, Alexis; von der Wense, Katharina

Citation Details

Many of the world’s languages have insufficient data to train high-performing general neural machine translation (NMT) models, let alone domain-specific models, and often the only available parallel data are small amounts of religious texts. Hence, domain adaptation (DA) is a crucial issue faced by contemporary NMT and has, so far, been underexplored for low-resource languages. In this paper, we evaluate a set of methods from both low-resource NMT and DA in a realistic setting, in which we aim to translate between a high-resource and a low-resource language with access to only: a) parallel Bible data, b) a bilingual dictionary, and c) a monolingual target-domain corpus in the high-resource language. Our results show that the effectiveness of the tested methods varies, with the simplest one, DALI, being most effective. We follow up with a small human evaluation of DALI, which shows that there is still a need for more careful investigation of how to accomplish DA for low-resource NMT.

Award ID(s): 2149404

PAR ID: 10651188

Author(s) / Creator(s): Marashian, Ali; Rice, Enora; Gessler, Luke; Palmer, Alexis; von der Wense, Katharina

Publisher / Repository: Association for Computational Linguistics

Date Published: 2025-01-01

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text: Accepted Manuscript 1.0
Conference Paper: The DOI is not currently available.
