Event Relation Extraction (ERE) plays a crucial role in understanding document structure by identifying relations between events. However, most existing methods either use a single model instance, which often suffers from overconfidence, or adopt multi-agent frameworks whose agents are defined by manually designed prompts and heuristics, making effective optimization difficult. In this paper, we propose Debate to Extract (D2E), a novel multi-agent optimization framework for ERE that leverages structured multi-turn debates and specialized agent training to enhance performance. Specifically, to organize the debate, participants are divided into multiple groups, each assigned its own debate topic. This process integrates both cooperation and confrontation. We also incorporate an audience as a crucial participant, whose conclusions, drawn from an observer’s perspective, tend to be more objective. Building on this debate framework, D2E further optimizes the debate participants by combining structured multi-turn debates with agent training. During the debate, agents refine their initial opinions through collaborative interactions. This iterative process generates valuable supervision signals for training, allowing agents to improve their responses progressively. To address the issue of diminishing returns from data diversity, each agent is trained on a distinct subset of the generated data, promoting specialization across different task dimensions. Experimental results on the MAVEN-ERE and EventStoryLine datasets show that D2E achieves significant improvements in causal relation extraction, outperforming baseline methods by 12.1% and 4.69%, respectively. Further analysis of the participants’ performance before and after the debate shows that participating in the debate generally improves event relation extraction performance.
This work demonstrates that combining collaborative debate with agent specialization leads to substantial performance gains in event relation extraction tasks.
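To make the debate structure concrete, the following is a minimal sketch of one way the grouped, multi-turn debate with an audience and per-agent data subsets could be organized. It is an illustration under stated assumptions, not the D2E implementation: the agents are toy rule-based stubs standing in for language models, the labels `CAUSE` and `NONE`, the majority-vote audience, and the round-robin data split are all hypothetical choices, and every function name here is invented for the example.

```python
from collections import Counter
from typing import Callable, Dict, List, Tuple

# Hypothetical record type: (agent_name, debate_topic, predicted_label).
Utterance = Tuple[str, str, str]
Agent = Callable[[str, str, List[Utterance]], str]


def make_stubborn(label: str) -> Agent:
    """Toy agent that always answers with the same label."""
    def agent(event_pair: str, topic: str, transcript: List[Utterance]) -> str:
        return label
    return agent


def make_follower(name: str, initial: str) -> Agent:
    """Toy agent that states an initial opinion, then adopts the majority view of the others."""
    def agent(event_pair: str, topic: str, transcript: List[Utterance]) -> str:
        if not any(speaker == name for speaker, _, _ in transcript):
            return initial
        others = [label for speaker, _, label in transcript if speaker != name]
        return Counter(others).most_common(1)[0][0]
    return agent


def run_debate(groups: Dict[str, Dict[str, Agent]], event_pair: str, rounds: int = 2) -> List[Utterance]:
    """Run a multi-turn debate: each group argues its own topic over a shared transcript."""
    transcript: List[Utterance] = []
    for _ in range(rounds):
        for topic, agents in groups.items():
            for name, agent in agents.items():
                label = agent(event_pair, topic, list(transcript))
                transcript.append((name, topic, label))
    return transcript


def audience_verdict(transcript: List[Utterance], n_speakers: int) -> str:
    """Stand-in audience (assumption): majority vote over the final round's labels."""
    final_round = transcript[-n_speakers:]
    return Counter(label for _, _, label in final_round).most_common(1)[0][0]


def split_supervision(records: List[Utterance], agent_names: List[str]) -> Dict[str, List[Utterance]]:
    """Give each agent a disjoint subset of debate records (round-robin is an assumption)."""
    k = len(agent_names)
    return {name: records[i::k] for i, name in enumerate(agent_names)}


# Two groups, each with its own (hypothetical) debate topic.
groups = {
    "causal": {"a1": make_stubborn("CAUSE"), "a2": make_follower("a2", "NONE")},
    "temporal": {"t1": make_stubborn("CAUSE"), "t2": make_follower("t2", "NONE")},
}
transcript = run_debate(groups, event_pair="(earthquake, collapse)", rounds=2)
verdict = audience_verdict(transcript, n_speakers=4)
subsets = split_supervision(transcript, ["a1", "a2", "t1", "t2"])
```

In this toy run the follower agents change their initial `NONE` answers after seeing the other participants' opinions, which mirrors the idea that opinions refined during the debate can serve as supervision signals. A real system would replace the stubs with model calls and the round-robin split with whatever partitioning the training procedure requires.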
Guan et al. studied this question.