Standard

Transformer encoders incorporating word translation for Russian-Vietnamese machine translation. / Nguyen, Thien; Nguyen, Trang; Nguyen, Huu et al.

In: ICIC Express Letters, Part B: Applications, Vol. 12, No. 1, 01.2021, p. 35-42.

Research output: Contribution to journal › Article › peer-review

Harvard

Nguyen, T, Nguyen, T, Nguyen, H & Tran, P 2021, 'Transformer encoders incorporating word translation for Russian-Vietnamese machine translation', ICIC Express Letters, Part B: Applications, vol. 12, no. 1, pp. 35-42. https://doi.org/10.24507/icicelb.12.01.35

APA

Nguyen, T., Nguyen, T., Nguyen, H., & Tran, P. (2021). Transformer encoders incorporating word translation for Russian-Vietnamese machine translation. ICIC Express Letters, Part B: Applications, 12(1), 35-42. https://doi.org/10.24507/icicelb.12.01.35

Vancouver

Nguyen T, Nguyen T, Nguyen H, Tran P. Transformer encoders incorporating word translation for Russian-Vietnamese machine translation. ICIC Express Letters, Part B: Applications. 2021 Jan;12(1):35-42. doi: 10.24507/icicelb.12.01.35

Author

Nguyen, Thien ; Nguyen, Trang ; Nguyen, Huu et al. / Transformer encoders incorporating word translation for Russian-Vietnamese machine translation. In: ICIC Express Letters, Part B: Applications. 2021 ; Vol. 12, No. 1. pp. 35-42.

BibTeX

@article{7f4e9e0bdd9c4369af2f3e7c003e6952,
title = "Transformer encoders incorporating word translation for Russian-Vietnamese machine translation",
abstract = "Neural machine translation systems including the latest Transformer models represent translation units in the form of embeddings – vectors of real numbers. Such continuous representations of translation units lead to smoother translation results, but do not always guarantee better results due to wrong word translations, compared to statistical machine translation systems. Moreover, for low-resource language pairs, such as Russian-Vietnamese, the errors of word translations in neural machine translation systems are aggravated even more. In order to solve the problem, we try different ways of concatenating source word embeddings with embeddings of their corresponding word translations, when building a Transformer-based translation system for the Russian-Vietnamese language pair. As a result, we create two novel Transformer models: Transformer with Long Encoder and Transformer with Short Encoder. In the Transformer with Long Encoder, the source word embedding and the translation embedding, each of single size, are concatenated to form a vector of double size. The Long Encoder reduces the concatenated embedding back to single size with a linear layer, and then adds the positional embedding of the source word to create a final embedding. The Short Encoder resembles the Long Encoder except for the linear layer. Instead, the Short Encoder creates word and translation embeddings of half size, and then concatenates them to form a concatenated embedding of single size. The experimental results show that the proposed models provide better translation quality compared to the baseline Transformer model.",
keywords = "Neural machine translation, Neural networks, Russian-Vietnamese machine translation, Transformer, Word translation",
author = "Thien Nguyen and Trang Nguyen and Huu Nguyen and Phuoc Tran",
note = "Publisher Copyright: {\textcopyright} 2020 Global Research Online. All rights reserved.",
year = "2021",
month = jan,
doi = "10.24507/icicelb.12.01.35",
language = "English",
volume = "12",
pages = "35--42",
journal = "ICIC Express Letters, Part B: Applications",
issn = "2185-2766",
publisher = "ICIC International",
number = "1",
}
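
The abstract above fully specifies the two encoder input schemes, so a short sketch may help. The following is a minimal PyTorch-style rendering of the Long and Short Encoder embedding layers as described there; all class names, parameter names, and the choice of learned positional embeddings are illustrative assumptions, not the authors' released code.

import torch
import torch.nn as nn

class LongEncoderEmbedding(nn.Module):
    # Long Encoder: full-size source and translation embeddings are
    # concatenated into a 2*d_model vector, a linear layer reduces it
    # back to d_model, and the positional embedding of the source word
    # is added to form the final embedding.
    def __init__(self, src_vocab, trans_vocab, d_model, max_len=512):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, d_model)
        self.trans_emb = nn.Embedding(trans_vocab, d_model)
        self.reduce = nn.Linear(2 * d_model, d_model)
        self.pos_emb = nn.Embedding(max_len, d_model)  # assumption: learned positions

    def forward(self, src_ids, trans_ids):
        # src_ids, trans_ids: (batch, seq_len); trans_ids holds the word
        # translation aligned to each source token.
        cat = torch.cat([self.src_emb(src_ids), self.trans_emb(trans_ids)], dim=-1)
        pos = torch.arange(src_ids.size(1), device=src_ids.device)
        return self.reduce(cat) + self.pos_emb(pos)

class ShortEncoderEmbedding(nn.Module):
    # Short Encoder: half-size word and translation embeddings are
    # concatenated directly to a single d_model vector; no linear layer.
    def __init__(self, src_vocab, trans_vocab, d_model, max_len=512):
        super().__init__()
        assert d_model % 2 == 0
        self.src_emb = nn.Embedding(src_vocab, d_model // 2)
        self.trans_emb = nn.Embedding(trans_vocab, d_model // 2)
        self.pos_emb = nn.Embedding(max_len, d_model)  # assumption: learned positions

    def forward(self, src_ids, trans_ids):
        cat = torch.cat([self.src_emb(src_ids), self.trans_emb(trans_ids)], dim=-1)
        pos = torch.arange(src_ids.size(1), device=src_ids.device)
        return cat + self.pos_emb(pos)

Either module would replace the usual token-embedding-plus-position step at the encoder input; the rest of the Transformer stack is left unchanged, consistent with the abstract's framing of the two models as variants of the baseline Transformer that differ only in the encoder embedding.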

RIS

TY - JOUR
T1 - Transformer encoders incorporating word translation for Russian-Vietnamese machine translation
AU - Nguyen, Thien
AU - Nguyen, Trang
AU - Nguyen, Huu
AU - Tran, Phuoc
N1 - Publisher Copyright: © 2020 Global Research Online. All rights reserved.
PY - 2021/1
Y1 - 2021/1
N2 - Neural machine translation systems including the latest Transformer models represent translation units in the form of embeddings – vectors of real numbers. Such continuous representations of translation units lead to smoother translation results, but do not always guarantee better results due to wrong word translations, compared to statistical machine translation systems. Moreover, for low-resource language pairs, such as Russian-Vietnamese, the errors of word translations in neural machine translation systems are aggravated even more. In order to solve the problem, we try different ways of concatenating source word embeddings with embeddings of their corresponding word translations, when building a Transformer-based translation system for the Russian-Vietnamese language pair. As a result, we create two novel Transformer models: Transformer with Long Encoder and Transformer with Short Encoder. In the Transformer with Long Encoder, the source word embedding and the translation embedding, each of single size, are concatenated to form a vector of double size. The Long Encoder reduces the concatenated embedding back to single size with a linear layer, and then adds the positional embedding of the source word to create a final embedding. The Short Encoder resembles the Long Encoder except for the linear layer. Instead, the Short Encoder creates word and translation embeddings of half size, and then concatenates them to form a concatenated embedding of single size. The experimental results show that the proposed models provide better translation quality compared to the baseline Transformer model.
AB - Neural machine translation systems including the latest Transformer models represent translation units in the form of embeddings – vectors of real numbers. Such continuous representations of translation units lead to smoother translation results, but do not always guarantee better results due to wrong word translations, compared to statistical machine translation systems. Moreover, for low-resource language pairs, such as Russian-Vietnamese, the errors of word translations in neural machine translation systems are aggravated even more. In order to solve the problem, we try different ways of concatenating source word embeddings with embeddings of their corresponding word translations, when building a Transformer-based translation system for the Russian-Vietnamese language pair. As a result, we create two novel Transformer models: Transformer with Long Encoder and Transformer with Short Encoder. In the Transformer with Long Encoder, the source word embedding and the translation embedding, each of single size, are concatenated to form a vector of double size. The Long Encoder reduces the concatenated embedding back to single size with a linear layer, and then adds the positional embedding of the source word to create a final embedding. The Short Encoder resembles the Long Encoder except for the linear layer. Instead, the Short Encoder creates word and translation embeddings of half size, and then concatenates them to form a concatenated embedding of single size. The experimental results show that the proposed models provide better translation quality compared to the baseline Transformer model.
KW - Neural machine translation
KW - Neural networks
KW - Russian-Vietnamese machine translation
KW - Transformer
KW - Word translation
UR - http://www.scopus.com/inward/record.url?scp=85098622833&partnerID=8YFLogxK
UR - https://elibrary.ru/item.asp?id=45040296
U2 - 10.24507/icicelb.12.01.35
DO - 10.24507/icicelb.12.01.35
M3 - Article
AN - SCOPUS:85098622833
VL - 12
SP - 35
EP - 42
JO - ICIC Express Letters, Part B: Applications
JF - ICIC Express Letters, Part B: Applications
SN - 2185-2766
IS - 1
ER -