This deliverable, D4.2, presents the main results of GRAPHIA’s work on understanding, evaluating, and preparing the technical foundations for the SSH Citation Index modules. Its central contribution is to show, through systematic analysis and empirical evaluation, that bibliographic reference processing in the Social Sciences and Humanities (SSH) raises challenges that are not adequately addressed by approaches developed primarily for STEM disciplines, and to identify concrete ways in which GRAPHIA can respond to these challenges.

For reference extraction and parsing, the deliverable combines a detailed review of existing datasets and tools with an extensive benchmarking of large language models under SSH-typical conditions. By testing multilingual documents, footnote-heavy articles, and heterogeneous layouts, the work goes beyond standard journal-centric evaluations and provides clear evidence of where traditional supervised pipelines perform well and where they break down. It also shows that LLM-based methods can be a flexible and competitive alternative for SSH material when they are carefully guided through segmentation strategies and structured outputs. These results directly inform future technical choices for the SSH Citation Index, particularly with respect to robustness, scalability, and multilingual coverage.

In the area of citation intent classification, the deliverable shows that existing models and datasets, largely developed for narrowly defined STEM domains, rest on assumptions that do not align well with SSH writing practices. SSH citations are often argumentative, interpretative, and distributed across longer stretches of text, which challenges prevailing annotation schemes and modelling strategies. By analysing these limitations and outlining prospects for SSH-specific taxonomies and datasets, the deliverable reframes citation intent classification as an opportunity to enrich SSH citations with semantic information that better reflects disciplinary practices, rather than as a simple task of transferring existing methods.

For citation linking, the deliverable highlights why this task is particularly demanding in SSH: references frequently point to books and chapters without DOIs, appear in multiple languages, and rely on inconsistent or incomplete metadata. The analysis of existing resources indicates that no current benchmark adequately captures this complexity. As a result, the deliverable motivates and specifies the creation of a new SSH-oriented benchmark dataset, providing a concrete roadmap for evaluating linking methods against modern open bibliographic infrastructures. This work is essential for improving the reliability and coverage of citation links within the SSH Knowledge Graph.

Overall, the deliverable establishes a coherent methodological baseline for the SSH Citation Index modules within GRAPHIA. By combining state-of-the-art review, targeted benchmarking, and forward-looking dataset design, it supports informed technical decisions in subsequent work packages and lays the groundwork for sustainable, SSH-sensitive citation services that can be integrated into the broader GRAPHIA infrastructure.

Funded by the European Union. Views and opinions expressed are however those of the author(s) only and do not necessarily reflect those of the European Union or the Agency. Neither the European Union nor the granting authority can be held responsible for them.