The rapid growth of social media and internet usage generates a vast amount of textual information, making sentiment analysis important for understanding people's opinions. Most sequence-based models capture long-range dependencies but have high latency, whereas transformer-based models are more accurate at the cost of higher computational complexity. To address these challenges, this work proposes a hybrid RoBERTa-BiGRU model with an attention mechanism. The hybrid model combines the contextual learning capability of the transformer with the sequential modeling efficiency of the Bi-GRU, and uses an attention mechanism to identify the parts of the input most relevant to the prediction. Evaluated on the IMDb dataset, the model achieves an accuracy of 96.02%, a precision of 96.20%, a recall of 95.80%, and an F1-score of 96.00%, outperforming conventional deep learning models. For efficient deployment, the trained model was converted to ONNX and optimized with OpenVINO in FP32 and FP16, and then further quantized to INT8 using the Neural Network Compression Framework (NNCF). The optimized model showed a 30-35% reduction in latency, significantly improved throughput, and inference that can be 10-15x faster than the baseline PyTorch version, with no significant decrease in accuracy. These results indicate that combining a hybrid architecture with model optimization is an effective and practical approach for real-time sentiment analysis applications.
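To illustrate the role the attention mechanism plays on top of the Bi-GRU outputs, the following is a minimal sketch of attention pooling in plain Python. It is not the implementation used in this work: the dot-product scoring, the function names `softmax` and `attention_pool`, the `query` vector, and the toy values are all assumptions made for illustration. The idea shown is only that each token state receives a weight and the weighted sum becomes the sentence representation passed to the classifier.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(
    hidden_states: Sequence[Sequence[float]],
    query: Sequence[float],
) -> Tuple[List[float], List[float]]:
    """Collapse per-token states into one vector with dot-product attention.

    hidden_states: one vector per token (a stand-in for Bi-GRU outputs).
    query: hypothetical scoring vector of the same width (learned in practice).
    Returns (context_vector, attention_weights).
    """
    # Score each token state against the query vector.
    scores = [sum(h_d * q_d for h_d, q_d in zip(h, query)) for h in hidden_states]
    # Normalize the scores so the weights sum to 1.
    weights = softmax(scores)
    # The context vector is the weighted sum of the token states.
    dim = len(hidden_states[0])
    context = [
        sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)
    ]
    return context, weights


if __name__ == "__main__":
    # Toy example: three tokens with two-dimensional states (made-up values).
    states = [[0.1, 0.0], [2.0, 1.0], [0.0, 0.2]]
    context, weights = attention_pool(states, query=[1.0, 1.0])
    print("weights:", weights)
    print("context:", context)
```

In this toy input the second token has the highest score, so it receives the largest weight and dominates the pooled vector. In a trained model the scoring parameters would be learned jointly with the rest of the network.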
Praveen et al. (Thu,) studied this question.
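The reported precision, recall, and F1-score can be cross-checked against each other, since the F1-score is the harmonic mean of precision and recall. The short sketch below uses the standard definition; the helper name `f1_score` is introduced here for illustration, and the only inputs are the precision and recall figures quoted in the abstract.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Precision and recall as reported in the abstract, in percent.
    reported_precision = 96.20
    reported_recall = 95.80
    f1 = f1_score(reported_precision, reported_recall)
    print(f"F1 = {f1:.2f}%")  # prints "F1 = 96.00%"
```

The computed value rounds to 96.00%, which agrees with the reported F1-score.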