Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Publisher
Springer Science and Business Media Deutschland GmbH
Citation
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Vol. 14326 LNAI, pp. 223–235
Machine translation requires that source and target sentences carry identical semantics. Previous neural machine translation (NMT) models have satisfied this requirement only implicitly, through the cross-entropy loss. In this paper, we propose a sentence Semantic-aware Machine Translation model (SaMT) that explicitly addresses semantic similarity between sentences during translation. SaMT integrates a Sentence-Transformer into a Transformer-based encoder-decoder to estimate the semantic similarity between source and target sentences. Our model enables translated sentences to preserve the semantics of the source, either by using the Sentence-Transformer alone or by adding an extra linear layer to the decoder. To achieve high-quality translation, we employ vertical and horizontal feature-fusion methods, which capture rich sentence features during translation. Experiments yielded a BLEU score of 36.41 on the IWSLT2014 German→English dataset, validating the efficacy of incorporating sentence-level semantic knowledge and of the two orthogonal fusion methods. Our code is available at https://github.com/aaa559/SaMT-master.
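The abstract's core idea is scoring how well a translation preserves the source sentence's meaning by comparing sentence embeddings. The paper's exact formulation is not reproduced here; the sketch below is only a minimal, dependency-free illustration of one common variant of that idea (mean-pooling token embeddings into sentence vectors and using one minus their cosine similarity as an auxiliary semantic loss). All function names and the toy embeddings are hypothetical, not taken from the SaMT implementation.

```python
import math

def mean_pool(token_embeddings):
    """Average token vectors into a single sentence embedding."""
    dim = len(token_embeddings[0])
    n = len(token_embeddings)
    return [sum(tok[d] for tok in token_embeddings) / n for d in range(dim)]

def cosine_similarity(u, v):
    """Standard cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def semantic_loss(src_tokens, tgt_tokens):
    """Auxiliary loss: 1 - cos(src embedding, tgt embedding).

    Zero when the pooled source and target embeddings point in the
    same direction; larger when their semantics diverge.
    """
    return 1.0 - cosine_similarity(mean_pool(src_tokens),
                                   mean_pool(tgt_tokens))

# Toy 3-dimensional "token embeddings" for a source and target sentence.
src = [[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
tgt = [[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
print(semantic_loss(src, tgt))  # identical sentences -> loss of 0.0
```

In a training setup along these lines, such a term would be added to the usual cross-entropy loss, so the decoder is rewarded both for predicting the right tokens and for keeping the pooled target embedding close to the source's.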
This research was supported by the MSIT (Ministry of Science and ICT), Korea, under the ITRC (Information Technology Research Center) support program (IITP-2021-0-02051) supervised by the IITP (Institute for Information & Communications Technology Planning & Evaluation).