The rapid growth of open and structured RDF data on the Web has made dataset search an important research topic. The core function of existing systems is ad hoc dataset retrieval (AHDR) based on dataset metadata, which contains limited information and often suffers from quality issues. To overcome these limitations, in this article we systematically investigate content-based AHDR, which exploits the actual RDF data in datasets. We address three main tasks of content-based AHDR with novel methods that handle the large size and complex structure of RDF data to facilitate dataset retrieval, deduplication, and snippet extraction. These methods are integrated into an online, open-source prototype called Caddie. The effectiveness and practicability of its components are evaluated on a public test collection and in a user study.
Xiaxia Wang
Qiaosheng Chen
Nanjing University
Qing Shi
Nanjing University
Journal of Web Semantics
University of Oxford
University of Edinburgh
University of Oslo
Wang et al. studied this question.
synapsesocial.com/papers/699fe2eb95ddcd3a253e65a1 — DOI: https://doi.org/10.1016/j.websem.2026.100878