What question did this study set out to answer?

The aim is to develop an automated system using large language models for managing branded food data efficiently.

June 18, 2026

Open Access

Leveraging large language models to maintain a branded food product database

Key Points

Developed a pipeline using fine-tuned large language models for data collection and standardization.
Compared automated LLM performance against human experts for data parsing and mapping.
Assessed the impact of fine-tuning on model performance with varying amounts of fine-tuning data.
Fine-tuned large language models outperformed human experts in data processing accuracy.
Modest fine-tuning substantially improved LLM performance, leading to enhanced data quality.
Overall, LLMs provided a scalable solution for processing diverse branded food data.

Abstract

Branded food data are essential for assessing contemporary dietary behavior and the global food environment. However, processing such data is challenging because they are vast, rapidly changing, variable in quality, and drawn from numerous sources. To address these challenges, we developed a fully automated, large language model (LLM)-powered pipeline for collecting, standardizing, and enriching branded food data, enabling ingredient-level analyses and facilitating estimation of ingredient quantities and undeclared nutrient content. Evaluation of LLM performance demonstrated that a fine-tuned model outperformed human experts in parsing and mapping product data. Non-fine-tuned LLMs showed insufficient performance, whereas even modest amounts of fine-tuning data substantially improved results. Overall, LLMs provide a scalable approach for processing branded food data and supporting more consistent and standardized data curation, with performance exceeding that of an individual human expert. These results highlight the potential of LLMs to transform the management and analysis of complex, large-scale food databases.
