Technology plays a significant role in communication and is now an essential part of human life. Most people speak two or more languages to communicate more effectively at the regional or global level. Code-mixing is the practice of mixing words from different languages in multilingual settings, and there is a growing demand for sentiment analysis of code-mixed comments posted by users on social media. Systems trained on data in a single language fail on multilingual data because of the complexity of mixing at different linguistic levels. However, very little code-mixed data is available for building models. No resources are available for Sinhala-English code-mixed language, so sentiment analysis of Sinhala-English mixed text deserves researchers' attention. We present a sentiment-labeled corpus for sentiment analysis of code-mixed Sinhala-English text, built from comments on YouTube® videos. An annotation setup is used to label the comments and create the Sinhala-English dataset, and the comments are pre-processed to clean them. The dataset is divided into three classes: neutral, negative, and positive. To demonstrate the usefulness of the dataset, this study applies five machine learning algorithms to the newly created Sinhala-English dataset and achieves significant accuracy.
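As an illustration of the cleaning and code-mixing ideas described above, the following is a minimal Python sketch and not the authors' pipeline. The function names (`clean_comment`, `token_script`, `is_code_mixed`) and the specific cleaning steps (URL removal, whitespace normalization) are assumptions made for this example. It also assumes that Sinhala words are written in Sinhala script (Unicode block U+0D80 to U+0DFF); romanized Sinhala would be tagged as Latin and would need a lexicon or a language-identification model instead.

```python
import re

# Hypothetical label set, matching the three classes named in the abstract.
LABELS = ("negative", "neutral", "positive")

URL_RE = re.compile(r"https?://\S+|www\.\S+")
SINHALA_RE = re.compile(r"[\u0D80-\u0DFF]")  # Sinhala Unicode block
LATIN_RE = re.compile(r"[A-Za-z]")


def clean_comment(text: str) -> str:
    """Remove URLs and collapse runs of whitespace (assumed cleaning steps)."""
    text = URL_RE.sub(" ", text)
    text = re.sub(r"\s+", " ", text)
    return text.strip()


def token_script(token: str) -> str:
    """Tag a token by the script(s) of the letters it contains."""
    has_sinhala = bool(SINHALA_RE.search(token))
    has_latin = bool(LATIN_RE.search(token))
    if has_sinhala and has_latin:
        return "mixed"
    if has_sinhala:
        return "sinhala"
    if has_latin:
        return "latin"
    return "other"


def is_code_mixed(text: str) -> bool:
    """True if the comment contains both Sinhala-script and Latin-script letters."""
    scripts = {token_script(tok) for tok in clean_comment(text).split()}
    return "mixed" in scripts or {"sinhala", "latin"} <= scripts
```

A comment such as `"මේ video එක හොඳයි"` would be flagged as code-mixed by this check, while `"great video"` would not. The cleaned text could then be passed to any feature extractor and classifier over the three labels.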
Uthpala et al. studied this question.