Transformer Architecture Search for Improving Out-of-Domain Generalization in Machine Translation

He, Yiheng; Zhang, Ruiyi; Somayajula, Sai Ashish; Xie, Pengtao

This content will become publicly available on December 25, 2025

Interest in automatically searching for Transformer neural architectures for machine translation (MT) has been increasing. Current methods show promising results in in-domain settings, where training and test data share the same distribution. In real-world MT applications, however, the test data often follows a different distribution from the training data. In these out-of-domain (OOD) situations, Transformer architectures optimized for the linguistic characteristics of the training sentences struggle to translate OOD test sentences accurately. To tackle this issue, we propose a method based on multi-level optimization that automatically searches for neural architectures with robust OOD generalization capabilities. During the architecture search, our method automatically synthesizes approximated OOD MT data, which is used to evaluate and improve the architectures' ability to generalize to OOD scenarios. The generation of approximated OOD data and the search for optimal architectures are executed in an integrated, end-to-end manner. Evaluated across multiple datasets, our method demonstrates strong OOD generalization performance, surpassing state-of-the-art approaches.

Award ID(s): 2339216; 2405974

PAR ID: 10618499

Author(s) / Creator(s): He, Yiheng; Zhang, Ruiyi; Somayajula, Sai Ashish; Xie, Pengtao

Publisher / Repository: Transactions on Machine Learning Research

Date Published: 2024-12-25

Journal Name: Transactions on Machine Learning Research

ISSN: 2835-8856

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Journal Article:
The DOI is not currently available.
