August 15, 2025 · Open Access

Prompt Optimization with Two Gradients for Classification in Large Language Models

Key Points

PO2G reached almost 89% accuracy after just three iterations, whereas ProTeGi needed six iterations to reach a comparable level.
The evaluation covered nine NLP tasks: three from the original ProTeGi study and six non-domain-specific tasks.
Automation of prompt optimization reduces human effort and increases efficiency for NLP classification tasks.
Efficient frameworks like PO2G offer valuable insights for improving classification across various NLP applications.

Abstract

Large language models (LLMs) generally perform well on common tasks, yet they remain prone to errors in more sophisticated natural language processing (NLP) classification applications. Prompt engineering has emerged as a strategy to enhance their performance. Because manual prompt optimization requires considerable effort, recent work highlights the need for automation to reduce human involvement. We introduce the PO2G (prompt optimization with two gradients) framework to improve the efficiency of optimizing prompts for classification tasks. PO2G improves efficiency, reaching almost 89% accuracy after just three iterations, whereas ProTeGi requires six iterations to achieve a comparable level. We evaluated PO2G and ProTeGi on a benchmark of nine NLP tasks: three tasks from the original ProTeGi study and six non-domain-specific tasks. We also evaluated both frameworks on seven legal-domain classification tasks. These results provide broader insights into the efficiency and effectiveness of prompt optimization frameworks for classification across diverse NLP scenarios.
