Multilingual online forums let people from different backgrounds around the world express themselves freely, and large-scale opinion mining over such forums supports digital policymaking, market research, social media monitoring, and related applications. Because online discussions are multilingual and dynamic, representation, categorization, and cross-lingual alignment are challenging. Existing supervised deep learning approaches to opinion mining require large labeled datasets, which are difficult to obtain for emerging domains and low-resource languages. Traditional models also generalize poorly across languages and data formats because they do not account for the links among users, posts, and topics. To address these issues, we propose SOGM (Semi-supervised Opinion Graph Model), a novel large-scale opinion mining method that combines graph neural networks (GNNs) with multilingual embeddings. SOGM builds a heterogeneous graph with users, posts, and linguistic attributes as nodes and semantic, temporal, and interaction-based relations as edges. It uses both labeled and unlabeled data for opinion classification, which lowers annotation costs and facilitates cross-lingual knowledge transfer. On large multilingual forum datasets, SOGM outperforms baseline models by 15% in cross-lingual generalization accuracy and by up to 12% in macro-F1. The model also scales, performing well even with millions of posts. Overall, SOGM demonstrates that semi-supervised GNNs can extract opinions across languages, enabling scalable, cross-lingual sentiment analysis in a variety of web environments.
Mishra et al. studied this question.
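To make the graph described in the abstract concrete, the following is a minimal sketch of a typed graph with user, post, and language nodes and with authorship, semantic, and reply edges. It is not the SOGM implementation: the class `OpinionGraph`, the helper names, the relation names, and the toy data are all hypothetical, and a simple neighbor-vote label propagation stands in for the semi-supervised GNN so that the example runs with the Python standard library alone.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass, field


@dataclass
class OpinionGraph:
    """Hypothetical typed graph: nodes carry a kind, edges carry a relation."""
    node_type: dict = field(default_factory=dict)
    edges: dict = field(default_factory=lambda: defaultdict(list))

    def add_node(self, node_id, kind):
        self.node_type[node_id] = kind

    def add_edge(self, src, dst, relation):
        # Edges are stored in both directions so neighbors can be read from either end.
        self.edges[src].append((dst, relation))
        self.edges[dst].append((src, relation))

    def neighbors(self, node_id, relation=None):
        return [dst for dst, rel in self.edges[node_id]
                if relation is None or rel == relation]


def build_toy_graph():
    """Four posts in three languages; all identifiers are illustrative."""
    g = OpinionGraph()
    for user in ("u1", "u2", "u3"):
        g.add_node(user, "user")
    for lang in ("lang:en", "lang:es", "lang:de"):
        g.add_node(lang, "language")
    for post in ("p1", "p2", "p3", "p4"):
        g.add_node(post, "post")

    g.add_edge("u1", "p1", "authored")
    g.add_edge("p1", "lang:en", "written_in")
    g.add_edge("u2", "p2", "authored")
    g.add_edge("p2", "lang:es", "written_in")
    g.add_edge("p1", "p2", "semantic")   # similar content across languages
    g.add_edge("p2", "p3", "reply")      # interaction-based relation
    g.add_edge("u1", "p3", "authored")
    g.add_edge("p3", "lang:de", "written_in")
    g.add_edge("u3", "p4", "authored")
    g.add_edge("p4", "lang:en", "written_in")
    return g


def propagate_labels(g, seed_labels, rounds=2):
    """Stand-in for the GNN: unlabeled posts take the majority label
    of their already-labeled neighbors, repeated for a few rounds."""
    labels = dict(seed_labels)
    for _ in range(rounds):
        updates = {}
        for node, kind in g.node_type.items():
            if kind != "post" or node in labels:
                continue
            votes = Counter(labels[n] for n in g.neighbors(node) if n in labels)
            if votes:
                updates[node] = votes.most_common(1)[0][0]
        labels.update(updates)
    return labels


if __name__ == "__main__":
    graph = build_toy_graph()
    seeds = {"p1": "positive", "p4": "negative"}  # the only labeled posts
    print(propagate_labels(graph, seeds))
```

In this toy setting, the label of the English post `p1` reaches the Spanish post `p2` through the semantic edge and then the German post `p3` through the reply edge, which illustrates, in a much simplified form, how labeled and unlabeled posts in different languages can share information over the graph.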