In multilingual machine translation systems, word sense disambiguation (WSD) plays a critical role in ensuring contextual accuracy, especially when translating from a morphologically rich language such as Bengali into English. This paper focuses on measuring machine translation errors that arise from incorrect sense selection for polysemous Bengali words, and proposes a hybrid approach that reduces such errors by handling WSD implicitly. We explore and compare multiple word representation techniques, including Bag of Words, Term Frequency-Inverse Document Frequency (TF-IDF), Word2Vec, and FastText, and identify FastText as the most effective because of its subword-level handling of Bengali morphology. Furthermore, we incorporate a custom label set of exact sense mappings to preserve the correct translation of named entities and culturally significant terms. For example, the mythological name “কর্ণ” (Karṇa) is translated as “Karna” rather than “a character of Mahabharata”. The implemented hybrid model combines FastText embeddings, a bidirectional long short-term memory (BiLSTM) network, and an attention mechanism. To evaluate its effectiveness, we measure several machine translation evaluation metrics and compare the proposed methodology with existing WSD algorithms. By leveraging contextual embeddings together with exact sense matching, the approach improves disambiguation, which leads to better translation outcomes. This work highlights the need for sense-aware translation models and presents a robust hybrid strategy for mitigating WSD-related translation errors in Bengali-to-English machine translation systems.
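To make the subword argument concrete, the following minimal sketch shows the kind of character n-gram decomposition that FastText-style models rely on: a word is wrapped in the boundary markers `<` and `>` and split into overlapping n-grams, so related surface forms share many units. The function names `char_ngrams` and `shared_ngrams`, the 3 to 6 n-gram range, the code-point-level splitting, and the example words are assumptions made for illustration; this is not the authors' implementation.

```python
def char_ngrams(word, min_n=3, max_n=6):
    """Return the character n-grams of `word`, FastText-style.

    The word is wrapped in '<' and '>' so that prefixes and suffixes
    are distinguishable from word-internal sequences. Splitting is done
    on Unicode code points (an assumption of this sketch), so Bengali
    vowel signs count as separate characters.
    """
    wrapped = f"<{word}>"
    grams = []
    for n in range(min_n, max_n + 1):
        # Slide a window of width n over the wrapped word.
        for i in range(len(wrapped) - n + 1):
            grams.append(wrapped[i:i + n])
    return grams


def shared_ngrams(a, b, min_n=3, max_n=6):
    """Return the set of n-grams that two words have in common."""
    return set(char_ngrams(a, min_n, max_n)) & set(char_ngrams(b, min_n, max_n))


if __name__ == "__main__":
    # Trigrams of an English word, for orientation.
    print(char_ngrams("where", 3, 3))
    # Two related Bengali forms (hypothetical example pair) share subword units.
    print(sorted(shared_ngrams("বাংলা", "বাংলাদেশ")))
```

In a subword model, a word vector is built from the vectors of such n-grams, which is why a form unseen in training can still receive a meaningful representation if it overlaps with known forms.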
Seal et al. studied this question.