What question did this study set out to answer?

This research aims to develop an automatic pipeline, Analogy2KG, for converting long-text analogies into knowledge graphs (KGs) while maintaining their analogical structure.

March 19, 2026 · Open Access

Analogy2KG: An automatic pipeline for deriving knowledge graphs from long-text analogies

Key Points

Proposed an automatic pipeline for converting textual analogies to knowledge graphs.
Modified information extraction methods to preserve analogical structure.
Validated the pipeline using paired samples tests and evaluated graph density.
Compared performance against three LLM-enabled information extraction algorithms.
Introduced the RattermannKG and WhartonKG datasets, marking the first time long-text analogies are converted into KGs.
Demonstrated that Analogy2KG maintains analogical structure effectively during the conversion process.
Showed superior performance of Analogy2KG over LLM-enabled algorithms in maintaining analogical integrity.
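
As a rough illustration of the paired samples testing mentioned above, the sketch below computes a paired t statistic with the Python standard library only. The summary does not say which quantities the authors paired, so the scores and the names `paired_t_statistic`, `text_scores`, and `kg_scores` are hypothetical, not the paper's method or data.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(first, second):
    """Return (t, degrees_of_freedom) for a paired samples t-test.

    Each position in `first` and `second` must describe the same item
    (for example, one analogy measured before and after conversion).
    """
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equally sized samples with at least 2 pairs")
    # The test works on the per-item differences, not on the raw samples.
    diffs = [b - a for a, b in zip(first, second)]
    spread = stdev(diffs)  # sample standard deviation (n - 1 denominator)
    if spread == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = mean(diffs) / (spread / math.sqrt(len(diffs)))
    return t, len(diffs) - 1


# Hypothetical structure scores for four analogies (assumed toy values).
text_scores = [1.0, 2.0, 3.0, 4.0]
kg_scores = [2.0, 2.0, 5.0, 5.0]
t_stat, df = paired_t_statistic(text_scores, kg_scores)
print(f"t = {t_stat:.3f}, df = {df}")
```

The resulting t value would then be compared against a t distribution with `df` degrees of freedom to obtain a p-value; that lookup is omitted here to keep the sketch dependency-free.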

Abstract

• Proposal of Analogy2KG, a pipeline for turning textual analogies into knowledge graphs
• Knowledge-graph versions of two long-text analogy datasets, RattermannKG and WhartonKG
• Modification of information extraction methods to maintain analogical structure
• Introduction of an LLM-free discovery methodology for higher-order relationships
• Comparison to three LLM-enabled information extraction algorithms

Analogical reasoning is an increasingly popular, lightweight way to enable large language model (LLM)-level reasoning without the computational complexity. Still, it has yet to be widely adopted because it relies on strictly hand-formatted data. We therefore propose Analogy2KG (“Analogy to Knowledge Graph”), an automatic pipeline that transforms text into a KG format via fine-tuned information extraction (IE) algorithms for long-text analogies. During the creation and validation of the pipeline, paired samples tests were used to verify that the complex underlying analogical structure of the data is maintained. Graph density was used to evaluate the structural quality of the resulting KGs. Lastly, causal relationships were optionally detected using a novel question-and-answer-based method. Validation on the Rattermann and Wharton long-text datasets suggested that the proposed methodology maintains analogical structure when transforming text into KGs. The resulting RattermannKG and WhartonKG datasets are introduced to the literature as the first conversion of long-text analogy datasets into a KG format. Finally, Analogy2KG outperformed three LLM-enabled information extraction algorithms (ChatIE, Code4UIE, and InstructUIE) at maintaining analogical structure, despite requiring neither an LLM backend nor a pre-defined relation extractor list, making it an ideal lightweight solution.
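
To make the graph density measure concrete, here is a minimal sketch that treats a KG as a directed simple graph built from (head, relation, tail) triples and uses the standard definition, edges divided by n(n - 1) for n nodes. The abstract does not state which density formula or graph representation the authors used, so this choice, the function name `graph_density`, and the toy triples are all assumptions.

```python
def graph_density(triples):
    """Density of the directed simple graph induced by (head, relation, tail) triples.

    Parallel relations between the same ordered pair count as one edge,
    and self-loops are ignored (both are simplifying assumptions).
    """
    nodes = set()
    edges = set()
    for head, _relation, tail in triples:
        nodes.update((head, tail))
        if head != tail:
            edges.add((head, tail))
    n = len(nodes)
    if n < 2:
        return 0.0  # density is undefined for fewer than two nodes; report 0
    return len(edges) / (n * (n - 1))


# Toy triples in the spirit of a solar-system/atom analogy (illustrative only).
toy_triples = [
    ("sun", "attracts", "planet"),
    ("planet", "revolves_around", "sun"),
    ("sun", "more_massive_than", "planet"),
    ("nucleus", "attracts", "electron"),
]
density = graph_density(toy_triples)
print(density)
```

Under this definition a density near 0 indicates a sparse graph with few relations per entity, while 1 means every ordered pair of entities is connected.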

Cite This Study

Combs et al. (Sun,) studied this question.

synapsesocial.com/papers/69bb91c7496e729e6297f264 https://doi.org/10.1016/j.knosys.2026.115772
