Software development involves collaborative interactions in which stakeholders express opinions across various platforms. Recognizing the sentiments conveyed in these interactions is crucial for the effective development and ongoing maintenance of software systems. For software products, analyzing the sentiment of user feedback (e.g., reviews, comments, and forum posts) can provide valuable insights into user satisfaction and areas for improvement, which can guide the development of future updates and features. However, accurately identifying sentiments in software engineering datasets remains challenging. This study investigates whether bigger large language models (bLLMs) can address the shortage of labeled data that hampers fine-tuned smaller large language models (sLLMs) in software engineering tasks. We conduct a comprehensive empirical study using five established datasets to assess three open-source bLLMs in zero-shot and few-shot scenarios. Additionally, we compare them with fine-tuned sLLMs, which learn contextual embeddings of text from software platforms. Our experimental findings demonstrate that bLLMs exhibit state-of-the-art performance on datasets marked by limited training data and imbalanced distributions. bLLMs can also achieve excellent performance in a zero-shot setting. However, when ample training data is available or the dataset exhibits a more balanced distribution, fine-tuned sLLMs can still achieve superior results.
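To make the zero-shot and few-shot settings mentioned in the abstract concrete, here is a minimal sketch of how such prompts might be assembled for sentiment classification. It is not the authors' prompt template: the label set, the wording of the instruction, and the helper names `build_prompt` and `parse_label` are all assumptions made for illustration, and no particular model or API is called.

```python
from typing import Sequence, Tuple

# Assumed label set; the datasets in the study may use different labels.
LABELS = ("positive", "neutral", "negative")


def build_prompt(text: str, examples: Sequence[Tuple[str, str]] = ()) -> str:
    """Build a zero-shot prompt if `examples` is empty, otherwise a few-shot prompt.

    Hypothetical helper: the instruction wording is illustrative only.
    """
    parts = [
        "Classify the sentiment of the following software engineering text "
        f"as one of: {', '.join(LABELS)}."
    ]
    # Few-shot setting: prepend labeled demonstrations before the query.
    for example_text, label in examples:
        if label not in LABELS:
            raise ValueError(f"unknown label: {label}")
        parts.append(f"Text: {example_text}\nSentiment: {label}")
    # The query itself, with the label left for the model to complete.
    parts.append(f"Text: {text}\nSentiment:")
    return "\n\n".join(parts)


def parse_label(completion: str) -> str:
    """Map a raw model completion to a label.

    Falling back to 'neutral' for unrecognized output is an assumption.
    """
    lowered = completion.strip().lower()
    for label in LABELS:
        if lowered.startswith(label):
            return label
    return "neutral"
```

Calling `build_prompt(text)` with no examples corresponds to the zero-shot setting; passing a handful of labeled `(text, label)` pairs corresponds to the few-shot setting. The resulting string would then be sent to whichever model is under evaluation, and the completion passed through `parse_label`.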
Ting Zhang
Ivana Clairine Irsan
Ferdian Thung
ACM Transactions on Software Engineering and Methodology
Singapore Management University
Zhang et al. studied this question.
www.synapsesocial.com/papers/68e57799b6db64358751798d — DOI: https://doi.org/10.1145/3697009