We present FLIC, a real-world annotated dataset designed for the visual estimation of food leftovers in canteens and other collective catering environments using standard 2D RGB imagery. Collected over 22 days in an operational university canteen, the dataset includes 401 paired image acquisitions of full and leftover trays, each associated with pixel-precise semantic segmentation masks and physically measured food mass. The goal is to support research on the estimation of leftover food mass from tray images, a task that has received limited attention compared to pre-consumption food recognition, despite its relevance for sustainability and operational decision making in food services. Unlike existing food datasets, FLIC jointly provides paired before–after visual observations and reliable mass ground truth, enabling quantitative analysis of food leftovers under realistic conditions without relying on depth or multi-view information. To demonstrate the dataset’s applicability, we rely on the concept of digital density, relating pixel area to food mass, and implement a lightweight, interpretable baseline mass estimation pipeline. This includes an automatic food/no-food segmentation stage, evaluated across multiple deep learning models (U-Net, DABNet, DINOv2+FeatUp, and SAM), followed by an assisted food recognition stage that leverages the fixed daily menu to map broad user input (e.g., “first course” vs. “second course”) to a specific food class. Experimental results highlight both the potential and the intrinsic challenges of visual food leftover estimation.
Piccoli et al. studied this question.
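The digital density idea mentioned in the abstract, which relates pixel area to food mass, can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so the code below assumes one plausible reading: density is the measured mass of the full serving divided by its segmented pixel area, and leftover mass is that density multiplied by the leftover pixel area. The function names (`food_pixel_area`, `digital_density`, `estimate_leftover_mass`) and the toy masks and mass are hypothetical and are not taken from FLIC.

```python
from typing import Sequence

# Binary segmentation mask: 1 = food pixel, 0 = background (assumed encoding).
Mask = Sequence[Sequence[int]]


def food_pixel_area(mask: Mask) -> int:
    """Count the pixels labelled as food in a binary mask."""
    return sum(1 for row in mask for value in row if value)


def digital_density(full_mask: Mask, full_mass_g: float) -> float:
    """Assumed definition: grams of food per food pixel on the full tray."""
    area = food_pixel_area(full_mask)
    if area == 0:
        raise ValueError("full-tray mask contains no food pixels")
    return full_mass_g / area


def estimate_leftover_mass(leftover_mask: Mask, density_g_per_px: float) -> float:
    """Scale the leftover pixel area by the digital density."""
    return density_g_per_px * food_pixel_area(leftover_mask)


# Toy example with made-up values (illustration only, not FLIC data).
full_mask = [
    [0, 1, 1, 0],
    [1, 1, 1, 1],
    [1, 1, 1, 1],
    [0, 1, 1, 0],
]
leftover_mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]

density = digital_density(full_mask, full_mass_g=240.0)  # 240 g over 12 pixels
leftover_g = estimate_leftover_mass(leftover_mask, density)  # 3 leftover pixels
print(f"density = {density:.1f} g/px, leftover = {leftover_g:.1f} g")
```

This sketch treats mass as proportional to visible area, which ignores food height and piling. That limitation is inherent to any purely 2D estimate and is consistent with the abstract's remark that the task carries intrinsic challenges.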