Language models enable cutting-edge solutions for many problems. However, they are not always the best choice, at least not on their own, for certain tasks in specific contexts. In this paper, we propose a hybrid approach to entity linking (EL) that employs domain knowledge and efficient indexes for named entity recognition (NER) and delegates only the named entity disambiguation (NED) step to language models. We evaluated this hybrid approach on textual descriptions of invoice items from public medication purchases. The experiments showed that domain knowledge and indexes enabled efficient recognition of medications (NER), with accuracy superior to that of most of the state-of-the-art language models investigated and comparable to that of the GPT-4o reasoning language model. In addition, the candidate medications recognized by our computationally efficient approach were disambiguated (NED) by GPT-4o with 90.55% precision.
Albuquerque et al. (Mon,) studied this question.
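The division of labor described in the abstract (index-based recognition of candidates, with only the disambiguation step delegated to a language model) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy catalogue, the token-overlap threshold, and the function names are hypothetical, and `overlap_disambiguator` is only a stand-in for the point where a language model such as GPT-4o would be called.

```python
import re
from collections import defaultdict
from typing import Callable, Dict, List, Optional, Set

# Hypothetical toy catalogue; it stands in for the domain knowledge base.
CATALOGUE: Dict[str, str] = {
    "MED001": "paracetamol 500 mg tablet",
    "MED002": "paracetamol 750 mg tablet",
    "MED003": "ibuprofen 600 mg tablet",
}


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(catalogue: Dict[str, str]) -> Dict[str, Set[str]]:
    """Build an inverted index mapping each token to the entries that contain it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for med_id, name in catalogue.items():
        for token in tokenize(name):
            index[token].add(med_id)
    return index


def recognize_candidates(description: str, index: Dict[str, Set[str]],
                         min_overlap: int = 3) -> List[str]:
    """Recognition step: return entries sharing at least min_overlap tokens."""
    counts: Dict[str, int] = defaultdict(int)
    for token in set(tokenize(description)):
        for med_id in index.get(token, ()):
            counts[med_id] += 1
    return sorted(m for m, c in counts.items() if c >= min_overlap)


def overlap_disambiguator(description: str, candidates: List[str]) -> str:
    """Placeholder for the language model call: pick the best token overlap."""
    tokens = set(tokenize(description))
    return max(candidates, key=lambda m: len(tokens & set(tokenize(CATALOGUE[m]))))


def link(description: str, index: Dict[str, Set[str]],
         disambiguate: Callable[[str, List[str]], str]) -> Optional[str]:
    """Entity linking: cheap index lookup first, disambiguation only if needed."""
    candidates = recognize_candidates(description, index)
    if not candidates:
        return None
    if len(candidates) == 1:
        return candidates[0]
    return disambiguate(description, candidates)


index = build_index(CATALOGUE)
print(link("Paracetamol 750 mg tablet, box of 20", index, overlap_disambiguator))
```

In this sketch the disambiguator is passed in as a callable, so the costly model call is only reached when the index returns more than one candidate; an actual system would replace the placeholder with a prompt to the chosen language model.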