May 25, 2026 · Open Access

Open Datasets and AI/GenAI-Driven Computer Vision for Precision Livestock Farming: A Bidirectional Review

Key Points

This review aims to systematically evaluate open datasets available for Precision Livestock Farming and their applications in the field.
Conducted bidirectional searches of Scopus/Web of Science and digital repositories, covering 2010 to 2025.
Identified 315 open datasets spanning multiple livestock species, including cattle, swine, and poultry.
Analyzed the distribution of datasets and literature, noting trends in AI-driven computer vision applications.
63% of datasets (n=199) were published in peer-reviewed literature, whereas 37% (n=116) were found in standalone repositories.
Most datasets focused on cattle (n=131), followed by swine (n=59) and poultry (n=43).
AI approaches have evolved rapidly: object detection with YOLO architectures dominates livestock monitoring, while generative models such as GANs help mitigate the scarcity of labelled data.
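
The reported shares can be checked against the raw counts with a few lines of arithmetic. The sketch below uses only the figures quoted in the key points. The variable and function names are illustrative, and the "other species" remainder assumes each dataset is counted under exactly one species, which the summary does not state.

```python
# Counts quoted in the key points above.
TOTAL_DATASETS = 315
by_source = {"peer-reviewed literature": 199, "standalone repositories": 116}
by_species = {"cattle": 131, "swine": 59, "poultry": 43}


def share(count, total=TOTAL_DATASETS):
    """Return count as a percentage of total, rounded to the nearest integer."""
    return round(100 * count / total)


# Percentage of datasets per dissemination route.
source_shares = {name: share(n) for name, n in by_source.items()}

# Assumption (not stated in the summary): each dataset is counted under
# exactly one species, so the remainder covers all other species.
other_species = TOTAL_DATASETS - sum(by_species.values())

print(source_shares)
print(other_species)
```

Under that single-species assumption, the three named groups account for 233 of the 315 datasets, which leaves the rest for the other species the abstract mentions.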

Abstract

The advancement of Precision Livestock Farming (PLF) depends on high-quality data, yet understanding of the open data landscape remains fragmented. This review adopts a bidirectional perspective, evaluating both open datasets and the researchers who cited them, with a focus on research objectives and practical applications. Through bidirectional searches of Scopus/Web of Science and digital repositories, 315 open datasets from 2010 to 2025 were identified, spanning dairy cows, beef cattle, pigs, poultry, and other species. The majority were released within the last five years, signalling a rapid expansion of available data. Peer-reviewed literature remains the primary dissemination channel (63%, n=199), while standalone repositories contribute 37% (n=116), reflecting a shift toward data-first scientific contributions. Species distribution is skewed toward cattle (n=131), followed by swine (n=59) and poultry (n=43). Non-AI computer vision relies on deterministic algorithms exploiting the physical and spectral properties of animal images. Among AI approaches, object detection dominates livestock monitoring, with YOLO architectures and Region-based Convolutional Neural Networks leading the field. Generative AI, particularly Foundation Models (FMs) and Generative Adversarial Networks (GANs), has mitigated the scarcity of labelled data, superseding manual annotation through automated frameworks such as the Accelerated Data Engine. These resources are evolving into the backbone of Large Language Models (LLMs) and visual-language frameworks, enabling herd reasoning and predictive diagnostics and marking the transition from reactive to proactive, generative monitoring. Integrating datasets across biometric identification, health, and behavioural clusters advances food security and animal welfare. Nevertheless, gaps in dataset diversity and standardisation hinder reproducibility, demanding an ethical shift toward data sustainability and computational efficiency.
