September 10, 2025

Optimizing BERT Models with Fine-Tuning for Indonesian Twitter Sentiment Analysis

Key Points

The fine-tuned BERT model achieves 91% accuracy, outperforming the baseline model.
Experimental results show that fine-tuning improves precision, recall, and F1-score.
Data preprocessing includes case folding, cleaning, tokenisation, normalisation, and data augmentation.
The study emphasizes the potential of fine-tuning BERT for sentiment analysis in low-resource languages.
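
For readers unfamiliar with the metrics named above, the following is a minimal sketch of how accuracy, precision, recall, and F1-score can be computed from true and predicted labels. The function name `macro_scores`, the toy labels, and the choice of macro averaging are assumptions made for illustration; the source does not state which averaging scheme the study used, and none of the values here come from its data.

```python
def macro_scores(y_true, y_pred):
    """Return accuracy plus macro-averaged precision, recall, and F1.

    Macro averaging is an assumption; the study's averaging is not stated.
    """
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    precisions, recalls, f1s = [], [], []

    for label in labels:
        # Per-class counts, treating `label` as the positive class.
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)

        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)

        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(t == p for t, p in pairs) / len(pairs),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Toy labels for illustration only (not the study's data).
scores = macro_scores(["pos", "pos", "neg", "neg"],
                      ["pos", "neg", "neg", "neg"])
```

Accuracy alone can hide poor performance on a minority class, which is one reason sentiment studies usually report precision, recall, and F1-score alongside it.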

Abstract

Twitter has emerged as a critical platform for capturing public sentiment, offering a valuable source for sentiment analysis. This study presents a comparative evaluation of two BERT (Bidirectional Encoder Representations from Transformers) models, one baseline and one fine-tuned, for analyzing Indonesian-language tweets. Following the CRISP-DM framework, the methodology encompasses automated data crawling, comprehensive text pre-processing (case folding, cleaning, tokenisation, normalisation, and data augmentation), and model development using the IndoBERT-base-p1 model. The experimental results show that the fine-tuned BERT model outperforms the non-optimized model, reaching 91% accuracy, with precision, recall, and F1-score of 0.91, 0.90, and 0.91, respectively. These findings indicate the fine-tuned model's superior ability to capture linguistic subtleties and contextual sentiment features in informal social media text. Furthermore, the model is deployed in a web-based application for real-time sentiment classification, demonstrating its practical applicability. This study underscores the effectiveness of fine-tuning in enhancing BERT-based sentiment analysis for low-resource languages and highlights the potential of this approach for informing decision-making in digital communication, marketing, and policy research.
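
The pre-processing steps listed in the abstract can be sketched roughly as follows. This is an illustrative example under stated assumptions, not the study's pipeline: the `preprocess` function, the regular expressions, and the small `SLANG` lexicon are hypothetical, whitespace splitting stands in for the tokenisation step (a BERT model would normally apply its own subword tokenizer afterwards), and data augmentation is omitted.

```python
import re

# Hypothetical slang-to-standard lexicon for the normalisation step;
# the study's actual lexicon is not given in the source.
SLANG = {"gk": "tidak", "bgt": "banget", "yg": "yang"}


def preprocess(tweet: str) -> list[str]:
    """Apply case folding, cleaning, tokenisation, and normalisation."""
    text = tweet.lower()                        # case folding
    text = re.sub(r"https?://\S+", " ", text)   # cleaning: drop URLs
    text = re.sub(r"[@#]\w+", " ", text)        # cleaning: drop mentions and hashtags
    text = re.sub(r"[^a-z\s]", " ", text)       # cleaning: drop digits, punctuation, emoji
    tokens = text.split()                       # tokenisation (whitespace stand-in)
    return [SLANG.get(tok, tok) for tok in tokens]  # normalisation of slang


tokens = preprocess("Gk suka BGT sama layanan @telkom!! https://t.co/abc #kecewa")
# tokens == ["tidak", "suka", "banget", "sama", "layanan"]
```

Dropping hashtags entirely is a design choice made here for brevity; a real pipeline might keep the hashtag text, since it can carry sentiment.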
