This article encourages infrastructure-focused work units and teams with limited prior exposure to machine learning to explore current technologies actively and to gain practical insight into their possibilities and limitations. The approach presented here can support the development of viable concepts and give a clearer picture of what is technologically feasible, with a focus on practical solutions. To this end, the article presents a test scenario that applies Named Entity Recognition (NER) to abstracts in iDAI.bibliography, the catalogue of the DAI libraries. The approach uses a lightweight pipeline built on freely available models, a simple code base, standard hardware, and copyright-compliant methods, and it demonstrates how automated processing can meaningfully reduce human effort and improve the quality of the entries.
Peter Baumeister studied this question.