Simple Science

Cutting edge science explained simply

Computer Science · Computation and Language · Artificial Intelligence

LLMs Outperform Traditional Systems in Translation

Study shows LLMs provide more natural translations, especially for idiomatic phrases.

― 5 min read


LLMs Excel in Language Translation: LLMs outperform traditional models in handling idioms.

Large language models (LLMs) like GPT-3 are capable of many tasks involving language. One of these tasks is translation, where they can convert text from one language to another. In recent studies, researchers have looked into how well these models perform in translating languages, particularly compared to traditional machine translation systems.
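The paper elicits translations from GPT models through prompting, typically with a few example translation pairs included in the prompt. As a rough illustration of what such few-shot prompting can look like, here is a minimal sketch using a chat-style LLM client; the client library, model name, and example sentence pairs are assumptions for illustration, not the authors' exact setup.

```python
# Minimal sketch of few-shot translation prompting with a chat-style LLM API.
# The client library, model name, and example pairs are illustrative assumptions,
# not the exact configuration used in the paper.
from openai import OpenAI

client = OpenAI()  # assumes an API key is configured in the environment

FEW_SHOT_EXAMPLES = [
    ("The weather is nice today.", "Das Wetter ist heute schön."),
    ("Where is the train station?", "Wo ist der Bahnhof?"),
]

def translate(sentence: str, target_language: str = "German") -> str:
    """Ask the model for a translation, primed with a few example pairs."""
    messages = [{"role": "system",
                 "content": f"Translate English sentences into {target_language}."}]
    for src, tgt in FEW_SHOT_EXAMPLES:
        messages.append({"role": "user", "content": src})
        messages.append({"role": "assistant", "content": tgt})
    messages.append({"role": "user", "content": sentence})

    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=messages,
        temperature=0.0,
    )
    return response.choices[0].message.content.strip()

print(translate("He kicked the bucket last year."))
```

The few examples simply show the model the task format; the interesting question the study asks is how the resulting translations differ in character from those of standard neural machine translation systems.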

Translation Quality

Traditional machine translation systems are often criticized for producing word-for-word translations, which may not always make sense in the target language. LLMs, on the other hand, have been shown to produce translations that are often more natural and fluent. Researchers have focused on understanding how translations from LLMs differ from those produced by conventional systems.

The Study

In this study, researchers examined how LLMs and traditional translation models handle translations, especially when it comes to idiomatic phrases. Idioms are expressions where the meaning cannot be guessed from the individual words. For example, "kick the bucket" means to die, and doesn't relate literally to kicking or a bucket.

The researchers used several measures to assess how literal the translations are. They found that translations produced by LLMs tend to be less literal. This means that LLMs can often capture the intended meaning better than traditional systems, especially when dealing with idioms.

Measuring Literalness

To measure how literal translations are, researchers developed two main methods:

  1. Unaligned Source Words: This method counts how many words in the original text do not have a direct equivalent in the translation. A higher number of unaligned words often indicates a less literal translation.

  2. Non-Monotonicity: This method looks at the order of the words in both the original and translated sentences. If the words do not follow a similar structure, it suggests a less literal translation.

Using these measures, researchers found that translations from LLMs generally have more unaligned words and a higher level of non-monotonicity.
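To make the two measures concrete, the sketch below computes simple versions of both from a word-level alignment, represented as (source index, target index) pairs. The alignment and example sentence here are toy assumptions; in practice the alignment comes from an automatic word aligner, and the paper's exact formulations may differ in detail.

```python
# Illustrative sketch of the two literalness measures described above,
# given a word-level alignment as (source_index, target_index) pairs.
# The alignment here is hand-made for the example; real alignments would
# come from a word aligner.

def unaligned_source_words(num_source_words: int,
                           alignment: list[tuple[int, int]]) -> int:
    """Count source words with no counterpart in the translation.

    More unaligned source words suggests a less literal translation.
    """
    aligned_sources = {src for src, _ in alignment}
    return num_source_words - len(aligned_sources)

def non_monotonicity(alignment: list[tuple[int, int]]) -> int:
    """Count crossing alignment links, a simple proxy for word-order reshuffling.

    A higher count means the translation departs further from the source order.
    """
    crossings = 0
    for i, (s1, t1) in enumerate(alignment):
        for s2, t2 in alignment[i + 1:]:
            if (s1 - s2) * (t1 - t2) < 0:  # links cross if the two orders disagree
                crossings += 1
    return crossings

# Toy example: an idiom rendered non-literally leaves source words unaligned.
source = "he kicked the bucket".split()                # 4 source words
alignment = [(0, 0), (1, 1)]                           # only "he" and "kicked" align
print(unaligned_source_words(len(source), alignment))  # -> 2
print(non_monotonicity([(0, 1), (1, 0), (2, 2)]))      # -> 1 crossing
```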

Translating Idioms

One of the main findings from the study is that translations involving idiomatic phrases are where LLMs really shine. Traditional systems often struggle with idioms, providing translations that are too literal and therefore confusing. For instance, translating an idiom directly can lead to a result that sounds absurd in the target language.

In contrast, LLMs can provide translations that convey the correct meaning, even if the words chosen are not a direct match to the original. This ability to handle idioms effectively demonstrates the flexibility that LLMs possess in translating languages.

Human Evaluations

To validate their findings, the researchers conducted human evaluations. They presented bilingual speakers with pairs of translations, one from an LLM and one from a traditional system, and asked them to judge which translation seemed more literal.

The results confirmed the automatic measures: bilingual speakers generally judged the LLM translations to be less literal, and more natural, than those from traditional translation systems.
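As a toy illustration of how such pairwise judgments can be aggregated, the snippet below tallies how often the traditional system's output was judged the more literal of the two (equivalently, how often the LLM output was judged less literal). The labels are invented placeholders, not the study's data.

```python
# Toy sketch of aggregating pairwise "which translation is more literal?" judgments.
# The judgment labels below are invented placeholders, not the study's data.
from collections import Counter

# Each entry records which system's output the annotator judged MORE literal.
judgments = ["nmt", "nmt", "llm", "nmt", "nmt", "tie", "nmt", "llm"]

counts = Counter(judgments)
decided = counts["nmt"] + counts["llm"]        # ignore ties for the rate
llm_less_literal_rate = counts["nmt"] / decided  # NMT more literal => LLM less literal
print(f"LLM judged less literal in {llm_less_literal_rate:.0%} of decided comparisons")
```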

Implications for Translation

The study highlights a significant advantage of using LLMs for translation tasks, particularly when idiomatic expressions are involved. The ability to produce less literal translations can lead to better comprehension and more fluent conversational exchanges in different languages.

This has important implications for the future of machine translation. As companies and individuals increasingly rely on automated translation, using models that prioritize understanding over literal word-for-word translation could enhance communication across various languages.

Experimenting with Different Languages

The researchers also ran their experiments on translations from English into several languages, including German, Chinese, and Russian. This diversity helped show how LLMs approach translation across different linguistic contexts.

The findings were consistent across the languages examined: translations from LLMs were less literal regardless of the target language involved.

Challenges in Translation Evaluation

One of the challenges in evaluating translation quality is the lack of established metrics specifically designed to measure how literal a translation is. While there are many tools for assessing translation quality, most focus on fluency and adequacy rather than literalness.

The measures developed in this study fill that gap, allowing researchers to better assess how well different systems perform in capturing meaning. This advancement is crucial for further studies and improvements in translation systems.

Conclusion

In summary, LLMs like GPT-3 show great promise in machine translation, particularly in producing translations that are less literal and more naturally flowing. The ability of these models to effectively handle idiomatic phrases presents a significant advantage over traditional systems.

As the field of machine translation continues to evolve, the insights gained from this research provide valuable guidance for future development. The findings encourage further exploration of LLMs and their potential to improve communication in a multilingual world.

The study reinforces the idea that translation is not just about converting words from one language to another. It is about conveying meaning in a way that makes sense to the reader. The difference in how LLMs approach this task is a significant step forward in the pursuit of better translation technologies.

Original Source

Title: Do GPTs Produce Less Literal Translations?

Abstract: Large Language Models (LLMs) such as GPT-3 have emerged as general-purpose language models capable of addressing many natural language generation or understanding tasks. On the task of Machine Translation (MT), multiple works have investigated few-shot prompting mechanisms to elicit better translations from LLMs. However, there has been relatively little investigation on how such translations differ qualitatively from the translations generated by standard Neural Machine Translation (NMT) models. In this work, we investigate these differences in terms of the literalness of translations produced by the two systems. Using literalness measures involving word alignment and monotonicity, we find that translations out of English (E-X) from GPTs tend to be less literal, while exhibiting similar or better scores on MT quality metrics. We demonstrate that this finding is borne out in human evaluations as well. We then show that these differences are especially pronounced when translating sentences that contain idiomatic expressions.

Authors: Vikas Raunak, Arul Menezes, Matt Post, Hany Hassan Awadalla

Last Update: 2023-06-05

Language: English

Source URL: https://arxiv.org/abs/2305.16806

Source PDF: https://arxiv.org/pdf/2305.16806

Licence: https://creativecommons.org/licenses/by/4.0/

Changes: This summary was created with assistance from AI and may have inaccuracies. For accurate information, please refer to the original source documents linked here.

Thank you to arXiv for use of its open access interoperability.
