July 6, 2023

Mel-Frequency Cepstral Coefficients-Based Emotion Identification Using Artificial Neural Network From Speech And Songs

Ishan Dhakal (Jain University), Navin Kumar Sah (Nepal Medical College Teaching Hospital), Kwok Wei Ling (Jain University)


Abstract

This paper compares the performance of three popular machine learning algorithms for speech emotion recognition: Multi-Layer Perceptron (MLP), Decision Tree, and Convolutional Neural Network (CNN). The MLP model achieved a competitive accuracy of 0.83 while being computationally efficient and easy to train. The Decision Tree, a popular technique for classification tasks, achieved an accuracy of 0.65. The CNN model outperformed both the MLP and the Decision Tree, achieving an accuracy of 0.86 and the best performance across all emotion classes. The study also found that certain emotions are more difficult to recognize than others. While the research highlights the potential of CNN models in speech and audio recognition, there is still scope for improvement, particularly in recognizing challenging emotions. Overall, the paper provides insight into the effectiveness of several algorithms for speech emotion recognition and recommends directions for future research.
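
The abstract reports two kinds of result: an overall accuracy per model, and the observation that some emotions are harder to recognize than others. The following is a minimal sketch of how both quantities can be computed from a list of true and predicted labels. The emotion names and label lists are made up for illustration and are not the paper's data or results; the helper names `accuracy` and `per_class_recall` are likewise hypothetical.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("label lists must have the same length")
    if not y_true:
        raise ValueError("label lists must not be empty")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def per_class_recall(y_true, y_pred):
    """For each true class, the fraction of its samples predicted correctly."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


# Illustrative labels only (assumed, not taken from the paper).
y_true = ["happy", "happy", "sad", "sad", "angry", "angry", "fearful", "fearful"]
y_pred = ["happy", "happy", "sad", "sad", "angry", "angry", "fearful", "sad"]

overall = accuracy(y_true, y_pred)
by_class = per_class_recall(y_true, y_pred)
```

A single accuracy figure can hide a weak class: in this toy example the overall score is high while one emotion is recognized only half the time, which is why a per-class breakdown is useful alongside the headline number.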
