What question did this study set out to answer?

The study aims to compute machine translation errors related to incorrect sense selection in Bengali due to polysemy.

March 2, 2026 | Open Access

Computation of Machine Translation Errors Through Implicit Consideration of Word Sense Disambiguation in Bengali Text

Key Points

The study aims to compute machine translation errors related to incorrect sense selection in Bengali due to polysemy.
Compared multiple word embedding techniques, focusing on FastText for Bengali morphology.
Implemented a hybrid model combining FastText, bidirectional long short-term memory, and attention mechanisms.
Incorporated a custom label of exact sense mappings to preserve the correct translation of named entities and culturally significant terms.
Identified FastText as the most effective method for handling Bengali morphology.
Improved translation outcomes through better disambiguation using contextual embeddings.
Reported improved disambiguation compared with existing word sense disambiguation algorithms.
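
To illustrate why subword-level embeddings suit Bengali morphology, the following is a minimal sketch of FastText-style character n-gram extraction. It is not the authors' code: the function names `char_ngrams` and `ngram_overlap` are hypothetical, the example word pair is chosen only for illustration, and the sketch assumes n-grams are taken over Unicode code points with `<` and `>` as word-boundary markers and lengths from 3 to 6.

```python
def char_ngrams(word, min_n=3, max_n=6):
    """Return the set of character n-grams of `word`, with boundary markers.

    Assumption: n-grams are taken over Unicode code points, so Bengali
    vowel signs count as separate characters.
    """
    marked = f"<{word}>"  # mark the start and end of the word
    grams = set()
    for n in range(min_n, max_n + 1):
        for i in range(len(marked) - n + 1):
            grams.add(marked[i:i + n])
    return grams


def ngram_overlap(a, b):
    """Jaccard overlap between the n-gram sets of two words."""
    ga, gb = char_ngrams(a), char_ngrams(b)
    return len(ga & gb) / len(ga | gb)


# Illustrative pair: a base form and an inflected form of the same word.
base, inflected = "বাংলা", "বাংলার"
print(sorted(char_ngrams(base, 3, 3)))
print(round(ngram_overlap(base, inflected), 2))
```

Because an inflected form shares many n-grams with its base form, a model that builds word vectors from n-gram vectors can place the two close together even when the inflected form is rare in the training data.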

Abstract

In multilingual machine translation systems, word sense disambiguation (WSD) plays a critical role in ensuring contextual accuracy, especially when translating from a morphologically rich language such as Bengali into English. This paper focuses on the computation of machine translation errors arising from incorrect sense selection of polysemous Bengali words and proposes a hybrid approach to minimize such errors through implicit consideration of WSD. We explore and compare multiple word embedding techniques, including Bag of Words, Term Frequency-Inverse Document Frequency, Word2Vec, and FastText, ultimately identifying FastText as the most effective due to its subword-level handling of Bengali morphology. Furthermore, we incorporate a custom label of exact sense mappings to preserve the correct translation of named entities and culturally significant terms; for example, the mythological name “কর্ণ” (Karṇa) is translated as “Karna” rather than “a character of Mahabharata”. We implement a hybrid approach consisting of FastText, bidirectional long short-term memory, and attention mechanisms. To evaluate the effectiveness of our approach, we measure several machine translation evaluation metrics and compare the proposed methodology with existing WSD algorithms. The approach improves disambiguation by leveraging contextual embeddings and exact sense matching, which leads to better translation outcomes. This work highlights the need for sense-aware translation models and presents a robust, hybrid strategy for mitigating WSD-related translation errors in Bengali-to-English machine translation systems.
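
The exact sense mapping idea can be sketched as a simple lookup that takes priority over the general translation step. This is an illustrative assumption, not the authors' implementation: the abstract does not say how the custom label is applied, and the names `EXACT_SENSE`, `translate_tokens`, and `placeholder_model` are hypothetical. Only the “কর্ণ” to “Karna” pair comes from the abstract.

```python
# Hypothetical table of exact sense mappings (source term -> fixed rendering).
EXACT_SENSE = {"কর্ণ": "Karna"}


def translate_tokens(tokens, fallback, exact=EXACT_SENSE):
    """Translate tokens one by one, preferring the exact sense mapping.

    `fallback` stands in for any general translation function; it is
    called only for tokens that have no exact mapping.
    """
    return [exact[t] if t in exact else fallback(t) for t in tokens]


def placeholder_model(token):
    """Stand-in for a real translation model (assumption for the demo)."""
    return f"[{token}]"


print(translate_tokens(["কর্ণ", "যোদ্ধা"], placeholder_model))
```

Keeping the mapping outside the model makes the protected renderings easy to audit and extend, at the cost of ignoring context for the mapped terms.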
