Large language models (LLMs) generally perform well on common tasks, yet they remain susceptible to errors in more sophisticated natural language processing (NLP) classification applications. Prompt engineering has emerged as a strategy to enhance their performance, but manual prompt optimization is labor-intensive, and recent advances highlight the need for automation to reduce human involvement. We introduce the PO2G (prompt optimization with two gradients) framework to improve the efficiency of optimizing prompts for classification tasks. PO2G reaches almost 89% accuracy after just three iterations, whereas ProTeGi requires six iterations to achieve a comparable level. We evaluated PO2G and ProTeGi on a benchmark of nine NLP tasks: three from the original ProTeGi study and six non-domain-specific tasks. We also evaluated both frameworks on seven legal-domain classification tasks. These results provide broader insights into the efficiency and effectiveness of prompt optimization frameworks for classification across diverse NLP scenarios.
Anthony Jethro Lieander
Hui Wang
Karen Rafferty
AI
Queen's University Belfast
Lieander et al. (Fri,) studied this question.
www.synapsesocial.com/papers/68a3656a0a429f797332ba48 — DOI: https://doi.org/10.3390/ai6080182