BACKGROUND: New food databases increasingly provide biochemical information not yet captured in standard food composition databases (FCDs). To enable precision nutrition, new methods are needed to map foods to these FCDs.
OBJECTIVE: We sought to provide real-world ground truth (benchmark) datasets and to evaluate the use of large language models (LLMs) for matching foods reported in dietary data with foods in FCDs.
METHODS: Two ground truth (benchmark) datasets were developed. ASA24-to-FooDB included a large FCD (9,910 entries) with many similar or perfect matches. NHANES-to-DFG2 included a small FCD (256 entries) with imperfect matches or "No Match" (46.9%). The matching methods tested were fuzzy matching, TF-IDF, semantic embedding, and LLMs.
RESULTS: Mapping food text descriptions using similarity scores from semantic embedding performed better than fuzzy matching or TF-IDF on both ground truth datasets (87.8% accuracy, ASA24-to-FooDB; 48.0% accuracy, NHANES-to-DFG2). LLMs performed worse on ASA24-to-FooDB when given the entire FCD, but better on NHANES-to-DFG2 (62.6% accuracy). For foods where a correct match exists, semantic similarity yielded top-k accuracies of 85% at k=5 and 95% at k=10 for ASA24-to-FooDB, and 96% at k=5 and 98% at k=10 for NHANES-to-DFG2. A hybrid approach, in which semantic embeddings select the top-k matches that are then included in the LLM prompt, yielded overall accuracies of 90.7% on ASA24-to-FooDB and 65.4% on NHANES-to-DFG2. An investigation of different prompt strategies and model sizes demonstrated that simpler prompts worked better for larger LLMs, while smaller LLMs needed detailed instructions. To assist nutrition scientists, the best strategy (semantic mapping + LLM reranking) was implemented in an application: FoodMapper (https://foodmapper.app/).
CONCLUSIONS: To match food text descriptions to FCDs, the highest accuracy was obtained by identifying the top matches using semantic similarity and then using an LLM to choose among those matches or "no match". FoodMapper provides this approach in a user-friendly interface that facilitates manual review.
Lemay et al. studied this question.
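The two-stage strategy described in the abstract (retrieve the top-k candidates by similarity, then let a second step choose one of them or "No Match") can be illustrated with a minimal sketch. This is not the FoodMapper implementation. Every name here (`embed`, `top_k`, `threshold_reranker`, `map_food`, `top_k_accuracy`) is hypothetical, the character-trigram vectors are only a stand-in for a real semantic embedding model, and the score threshold is only a placeholder for the LLM reranking step.

```python
import math
from collections import Counter

NO_MATCH = "No Match"


def embed(text):
    """Stand-in embedding (assumption): character-trigram counts.
    A real pipeline would use a semantic embedding model here."""
    s = f"  {text.lower()}  "
    return Counter(s[i:i + 3] for i in range(len(s) - 2))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(v * b[key] for key, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query, fcd, k=5):
    """Return the k FCD entries most similar to the query, best first."""
    q = embed(query)
    scored = [(entry, cosine(q, embed(entry))) for entry in fcd]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


def threshold_reranker(query, candidates, min_score=0.3):
    """Placeholder for the LLM step (assumption): accept the best
    candidate if its score clears a threshold, otherwise "No Match"."""
    if not candidates or candidates[0][1] < min_score:
        return NO_MATCH
    return candidates[0][0]


def map_food(query, fcd, k=5, rerank=threshold_reranker):
    """Hybrid mapping: similarity retrieval, then a rerank step."""
    return rerank(query, top_k(query, fcd, k))


def top_k_accuracy(pairs, fcd, k):
    """Fraction of (query, truth) pairs whose truth is among the top k."""
    hits = sum(
        truth in [entry for entry, _ in top_k(query, fcd, k)]
        for query, truth in pairs
    )
    return hits / len(pairs)


# Toy FCD and queries, invented purely for illustration.
fcd = ["Milk, whole", "Apple, raw", "Bread, white", "Chicken breast, roasted"]
print(map_food("whole milk", fcd))
print(map_food("xyzzy", fcd))
```

In an actual pipeline, the `rerank` argument would be replaced by a function that formats the query and its candidates into an LLM prompt and parses the chosen entry (or "No Match") from the response. Keeping it as a pluggable callable means the retrieval step and the top-k accuracy measurement can be evaluated independently of any particular model.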