What question did this study set out to answer?

The research aims to create a centralized system for the persistent management and tracking of scientific samples throughout their lifecycle.

May 1, 2026 | Open Access

SEPIA: A Metadata-Driven Infrastructure for Persistent and Traceable Sample Management

Key Points

The research aims to create a centralized system for the persistent management and tracking of scientific samples throughout their lifecycle.
SEPIA is a metadata-driven system developed at Helmholtz-Zentrum Berlin (HZB).
Its architecture combines a PostgreSQL database, a Flask-based REST API, and a Next.js frontend.
It ingests metadata from structured JSON, existing identifiers, and DataCite XML, and aligns it with the DataCite Metadata Schema.
SEPIA tracks samples before, during, and after experiments, improving reproducibility and transparency in research.
Persistent identifiers such as IGSN give samples global, unambiguous identification.
SEPIA supports the FAIR data principles, enhancing interoperability across research institutions.
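
The DataCite alignment named in the key points can be illustrated with a minimal sketch. It is not SEPIA's actual code or API: the input field names (`igsn`, `name`, `creators`, `publisher`, `year`), the function name, and the placeholder identifier are all assumptions made for illustration. The output keys are loosely modeled on the mandatory DataCite properties (identifier, creator, title, publisher, publication year, resource type); the exact serialization SEPIA uses is not stated in the source.

```python
# Hypothetical sketch: map an internal sample record to a DataCite-style dict.
# Field names and structure are illustrative assumptions, not SEPIA's schema.

REQUIRED_FIELDS = ("igsn", "name", "creators", "publisher", "year")


def to_datacite_like(sample: dict) -> dict:
    """Return a minimal DataCite-style metadata dict for one sample."""
    # Reject records that lack any of the fields the mapping relies on.
    missing = [key for key in REQUIRED_FIELDS if not sample.get(key)]
    if missing:
        raise ValueError("missing required fields: " + ", ".join(missing))

    return {
        "identifier": {"identifier": sample["igsn"], "identifierType": "IGSN"},
        "creators": [{"name": name} for name in sample["creators"]],
        "titles": [{"title": sample["name"]}],
        "publisher": sample["publisher"],
        "publicationYear": str(sample["year"]),
        # DataCite's controlled vocabulary includes "PhysicalObject",
        # which is a natural fit for physical samples.
        "resourceTypeGeneral": "PhysicalObject",
    }


# Placeholder record; the identifier below is not a real IGSN.
example_sample = {
    "igsn": "10.0000/EXAMPLE-0001",
    "name": "Example thin-film sample",
    "creators": ["Doe, Jane"],
    "publisher": "Helmholtz-Zentrum Berlin",
    "year": 2026,
}

example_record = to_datacite_like(example_sample)
```

Validating required fields before mapping keeps incomplete records out of the registry, which is one way to address the fragmented or incomplete sample metadata the study describes.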

Abstract

SEPIA (Sample Essentials, Persistent Identifiers & Attributes) is a metadata-driven infrastructure designed to enable comprehensive, persistent, and traceable management of scientific samples across their entire lifecycle. Developed at Helmholtz-Zentrum Berlin (HZB), SEPIA addresses common challenges in research data management, where sample metadata is often fragmented, incomplete, or poorly linked to experiments, datasets, and contributors. The system provides a centralized platform for capturing rich, structured metadata describing samples, including their provenance, attributes, relationships, and lifecycle events. By integrating persistent identifiers such as IGSN and aligning metadata with the DataCite Metadata Schema, SEPIA ensures global, unambiguous identification of samples and facilitates interoperability across systems and institutions. SEPIA is built on a scalable technical architecture consisting of a PostgreSQL database, a Flask-based REST API following OpenAPI specifications, and a modern web frontend implemented with Next.js and TypeScript. It integrates closely with ICAT and supports flexible ingestion of metadata through structured JSON, existing identifiers, and DataCite XML, allowing seamless integration into diverse scientific workflows. The platform enables full tracking of samples before, during, and after experiments, including modifications, location history, and links to investigations and datasets. This comprehensive tracking improves reproducibility, supports collaborative research, and enhances transparency in scientific processes. Aligned with FAIR data principles (Findable, Accessible, Interoperable, Reusable), SEPIA transforms sample metadata into an active component of research infrastructure. It supports researchers, beamline scientists, and data managers by providing intuitive tools for registering, exploring, and maintaining sample metadata across beamlines, laboratories, and institutions.
This poster was presented at the HMC Conference 2026 (Heidelberg) as part of an interactive demonstration, showcasing how persistent identifiers and well-defined metadata models can significantly improve data traceability, interoperability, and reuse in large-scale research environments.
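
The lifecycle tracking described in the abstract (modifications, location history, links to investigations) can be sketched as an append-only event log per sample. This is a hypothetical illustration, not SEPIA's data model: the class names, the event kinds, and the identifiers and locations in the usage lines are all assumptions.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


@dataclass(frozen=True)
class SampleEvent:
    """One immutable entry in a sample's history (hypothetical model)."""
    timestamp: datetime
    kind: str                       # e.g. "registered", "moved", "modified"
    detail: str
    location: Optional[str] = None  # set only when the event changes location


@dataclass
class SampleHistory:
    """Append-only event log for a single sample."""
    sample_id: str
    events: List[SampleEvent] = field(default_factory=list)

    def record(self, kind: str, detail: str,
               location: Optional[str] = None,
               timestamp: Optional[datetime] = None) -> SampleEvent:
        # Default to the current UTC time when no timestamp is supplied.
        when = timestamp or datetime.now(timezone.utc)
        event = SampleEvent(when, kind, detail, location)
        self.events.append(event)
        return event

    def location_history(self) -> List[str]:
        # Order by time so late-arriving entries still sort correctly.
        ordered = sorted(self.events, key=lambda e: e.timestamp)
        return [e.location for e in ordered if e.location is not None]

    def current_location(self) -> Optional[str]:
        history = self.location_history()
        return history[-1] if history else None


# Usage with placeholder identifiers and locations.
history = SampleHistory("EXAMPLE-0001")
history.record("registered", "sample created", location="Lab A",
               timestamp=datetime(2026, 1, 10, tzinfo=timezone.utc))
history.record("modified", "annealed at 400 K",
               timestamp=datetime(2026, 1, 12, tzinfo=timezone.utc))
history.record("moved", "transferred for measurement", location="Beamline X",
               timestamp=datetime(2026, 1, 15, tzinfo=timezone.utc))
```

Keeping events immutable and deriving the current location from the log, rather than overwriting a single field, preserves the full trail that traceability depends on.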
