As the initial 67-article release contains more than 560,000 tokens (and the full set more than 790,000 tokens), our corpus is among the largest gold-standard annotated biomedical corpora. Unlike most others, the journal articles that comprise the corpus are drawn from diverse biomedical disciplines and are marked up in their entirety. Additionally, with a concept-annotation count of nearly 100,000 in the 67-article subset (and more than 140,000 in the full collection), the scale of conceptual markup is also among the largest of comparable corpora. The concept annotations of the CRAFT Corpus have the potential to significantly advance biomedical text mining by providing a high-quality gold standard for NLP systems. The corpus, annotation guidelines, and other associated resources are freely available at http://bionlp-corpora.sourceforge.net/CRAFT/index.shtml.
Michael Bada
University of Chicago
Miriam Eckert
University of Colorado System
Donald L. Evans
University of Colorado Boulder
BMC Bioinformatics
SHILAP Revista de lepidopterología
University of Colorado Boulder
University of Colorado Anschutz Medical Campus
Jackson Laboratory
Bada et al. studied this question.
synapsesocial.com/papers/69d776c4b1cb92dd1bb8b44e — DOI: https://doi.org/10.1186/1471-2105-13-161