The integration of artificial intelligence (AI) and large language models (LLMs) has significantly influenced numerous industries. However, the legal sector remains cautious, given its stringent demands for confidentiality, accuracy, and adaptability. Over the summer, I conducted research examining the intersection of law and LLMs, focusing on their current limitations and potential advancements. My analysis revealed that even cutting-edge models (e.g., ChatGPT, Claude) exhibit notable deficiencies in legal reasoning. Specifically, they struggle to accurately identify legal issues, reliably retrieve relevant case law, apply legal principles to the facts with precision, and provide accurate pinpoint legal citations. To address these gaps, I investigated how lawyers analyze legal issues and craft responses in the form of legal opinions. This research informed the development of a structured system prompt designed to enhance OpenJustice, Queen's open-source legal AI. Within OpenJustice, I assisted with the research on the Code for Dialogue (CoDial) framework and its dialogue flow system. My work involved analyzing how lawyers approach complex legal issues and deconstruct legal principles into smaller components, and then translating these principles into a structured graph for AI interpretation. In collaboration with the engineering team, I supported the implementation of functionalities that enable dialogue flows to resemble a legal reasoning process. Additionally, I contributed to research exploring an LLM-as-a-Judge framework, which aims to evaluate AI-generated legal responses and identify areas for improvement. Preliminary findings indicate that evaluating legal responses presents unique challenges, as legal questions often lack a single correct answer; two valid, reasonable, and well-supported arguments may lead to opposing conclusions.
This work highlights the complexities of integrating AI into the legal domain and underscores the importance of continued research. Advancements in this field hold significant potential to enhance access to justice and improve efficiency within the legal profession.
Li Qu (Tue,) studied this question.