What question did this study set out to answer?

The study asks whether prompt elements designed with legal experts can improve the performance of large language models on legal reasoning tasks. It measures the effect of each prompt element on model performance in a legal entailment task.

March 6, 2026 | Open Access

Investigating Expert-Based Prompt Engineering for Legal Entailment Tasks

Key Points

Designed five prompt elements based on domain knowledge from legal experts.
Used additional legal guidelines, 1-shot prompting, and dictionary definitions to guide models.
Added knowledge representations of legal articles and IRAC-style prompting.
Evaluated the effect of each prompt element on model performance in a legal entailment task.
Certain prompt elements improved performance, depending on the context and the model.
For the smaller models, increasing the number of prompt elements improved performance on average.
For any particular combination of model and sub-task, a subset of the prompt elements worked best.
For the most advanced reasoning model evaluated, a selection of prompt elements increased average performance across all evaluated sub-tasks.

Abstract

Legal reasoning is complex and multi-faceted, requiring a broad set of skills. By employing domain knowledge from legal experts, we design five elements that can be included in prompts for large language models that could aid in legal reasoning tasks. We use additional legal guidelines, 1-shot prompting, dictionary definitions, knowledge representations of legal articles, and IRAC-style prompting. We investigate the effect of each prompt element on the model’s performance on a legal entailment task. Certain prompt elements can improve performance, depending on the context and the model. For the smaller models, increasing the number of prompt elements improves performance on average. For any particular combination of model and sub-task, only using a subset of the prompt elements seems to work best. For the most advanced reasoning model we evaluate, using a selection of prompt elements increases average performance across all evaluated sub-tasks. Results indicate that the problem space of the legal entailment task may be too large for a single model and prompt. In future research, we therefore aim to investigate the capabilities of an ensemble of specialized models.
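To make the idea of combining prompt elements concrete, here is a minimal Python sketch of how a prompt might be assembled from any subset of the five elements and how all subsets could be enumerated for comparison. Only the five element names come from the abstract. The placeholder texts, the prompt layout, the yes/no answer format, and the helper names `build_prompt` and `all_subsets` are hypothetical and are not taken from the study.

```python
from itertools import combinations

# The five element names follow the abstract. The placeholder texts and the
# prompt layout are illustrative assumptions, not the study's actual prompts.
ELEMENTS = {
    "legal_guidelines": "Guidelines: <additional legal guidelines>",
    "one_shot": "Example: <one solved article/hypothesis pair>",
    "definitions": "Definitions: <dictionary definitions of key terms>",
    "knowledge_representation": "Structure: <knowledge representation of the article>",
    "irac": "Reason step by step: Issue, Rule, Application, Conclusion.",
}


def build_prompt(article, hypothesis, selected):
    """Assemble a prompt from the selected elements plus the task text."""
    # Keep the elements in a fixed order so prompts are comparable.
    parts = [text for name, text in ELEMENTS.items() if name in selected]
    parts.append(f"Article: {article}")
    parts.append(f"Hypothesis: {hypothesis}")
    # Hypothetical answer format, chosen only for this sketch.
    parts.append("Does the article entail the hypothesis? Answer yes or no.")
    return "\n\n".join(parts)


def all_subsets(names):
    """Yield every subset of the element names, from empty to full."""
    names = list(names)
    for size in range(len(names) + 1):
        for combo in combinations(names, size):
            yield frozenset(combo)


subsets = list(all_subsets(ELEMENTS))
example = build_prompt("<article text>", "<hypothesis text>", {"irac", "definitions"})
```

With five elements there are 32 possible subsets, including the empty one. Comparing results per subset, model, and sub-task is one way to look for the kind of best-performing combination the abstract describes.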
