August 14, 2025 | Open Access

A Context-Aware Embedding Approach to Meaning Conflation Deficiency in Sesotho sa Leboa: Addressing Semantic Ambiguity

Key Points

Context-aware models improve precision and recall in resolving meaning conflation deficiency in Sesotho sa Leboa.
ELMo achieved the highest F1-score of 93%, outperforming static embeddings in semantic tasks.
Deep contextual models like ELMo and GPT-2 excelled in accuracy and interpretability based on clustering metrics.
The paper sets new standards for evaluating semantic disambiguation in under-represented languages.

Abstract

Meaning Conflation Deficiency (MCD) is a major problem in Natural Language Processing (NLP), especially for low-resource, morphologically rich languages such as Sesotho sa Leboa. Traditional word embeddings often perform poorly in downstream tasks such as Word Sense Disambiguation (WSD) because they cannot distinguish between a word's multiple senses. This study evaluates how well a range of context-aware and multi-prototype word embedding models, including ELMo, GPT-2, BERT, Universal Sentence Encoder, and hybrid versions of Doc2Vec and SBERT, resolve MCD. The models were trained and tested on a sense-annotated Sesotho sa Leboa corpus and assessed with standard classification measures (precision, recall, F1-score, and accuracy), clustering-based metrics, and visualisation approaches. The results show that deep contextual models, in particular ELMo and GPT-2, perform noticeably better than static and unsupervised models in accuracy and sense separation. ELMo achieved the highest F1-score of any model (93%) and showed excellent interpretability, with well-separated confusion matrices. These findings suggest that context-aware architectures provide reliable solutions to MCD and a scalable framework for improving WSD in low-resource language applications. The work offers new standards and perspectives for future studies on semantic disambiguation in under-represented languages.
