With the exponential growth of the life sciences literature, biomedical text mining (BTM) has become an essential technology for accelerating the extraction of insights from publications. The identification of entities in texts, such as diseases or genes, and their normalization, i.e. grounding them in a knowledge base, are crucial steps in any BTM pipeline to enable information aggregation from multiple documents. However, tools for these two steps are rarely applied in the same context in which they were developed. Instead, they are applied "in the wild", i.e. on application-dependent text collections that differ moderately to extremely from those used for training, varying e.g. in focus, genre or text type. This raises the question of whether the reported performance, usually obtained by training and evaluating on different partitions of the same corpus, can be trusted for downstream applications.
Mario Sänger
Humboldt-Universität zu Berlin
Samuele Garda
Humboldt-Universität zu Berlin
Xing David Wang
Humboldt-Universität zu Berlin
Bioinformatics
Ludwig-Maximilians-Universität München
Humboldt-Universität zu Berlin
Sänger et al. studied this question.
synapsesocial.com/papers/68e5832cb6db6435875203dc — DOI: https://doi.org/10.1093/bioinformatics/btae564