Word-Alignment Emphasized Dependency Aware Decoder (WDAD) with Data Augmentation in Non-Autoregressive Translation

EasyChair Preprint 11990
54 pages • Date: February 8, 2024

Abstract

The Non-Auto-Regressive model (NAT) for machine translation offers greater efficiency than autoregressive models but faces challenges related to target-side dependencies. Two issues arise: over- and under-translation, and the multi-modality problem of natural language. To mitigate these problems, previous researchers have made extensive efforts, notably with the Dependency-Aware Decoder (DAD) model. While such models retain target-side dependencies and improve performance to some extent, they leave two gaps in cross-lingual translation tasks: word embeddings in a shared embedding space and shared character sequences. This paper proposes two solutions to these issues: adaptation from the Ernie-M model and data augmentation based on language BPE (LBPE), respectively. The paper also explores their combined effect, in which language prompts help the model distinguish tokens from different languages and cluster words from a semantic perspective. The resulting Word-Alignment Language-Prompted DAD (WDAD) model with data augmentation demonstrates measurable progress: the combination of LBPE and CAMLM contributes approximately +0.5 BLEU points on the WMT14 De-En dataset, and CAMLM contributes approximately +1 BLEU point on the WMT16 En-Ro dataset. However, the combined model shows limitations in its interaction with the combined approach, owing to an inappropriate LBPE data augmentation strategy, as evidenced by comparisons against a mixed-data strategy with a language embedding layer and against the baseline augmentation strategy. This does not invalidate the principle of LBPE or the effects it produced; it merely indicates that better data augmentation strategies exist.

Keyphrases: NAT, XLM, word alignment
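The abstract describes LBPE as attaching language prompts so the model can distinguish tokens from different languages. The following is a minimal sketch of that idea, assuming LBPE amounts to prefixing each subword token with a language tag; the function name and tag format are hypothetical illustrations, not the paper's actual implementation.

```python
def lbpe_tag(tokens: list[str], lang: str) -> list[str]:
    """Prefix each BPE token with a language tag (hypothetical LBPE scheme).

    The tag lets the model separate identical character sequences that occur
    in different languages, e.g. 'die' in German vs. English.
    """
    return [f"<{lang}>{tok}" for tok in tokens]

# Tag both sides of a De-En sentence pair before feeding them to the model.
src = lbpe_tag(["das", "ist", "gut"], "de")
tgt = lbpe_tag(["this", "is", "good"], "en")
print(src)  # ['<de>das', '<de>ist', '<de>gut']
print(tgt)  # ['<en>this', '<en>is', '<en>good']
```

Under this reading, shared character sequences across languages map to distinct tagged tokens, which is one plausible way language prompts could cluster words by semantics rather than by surface form.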