What question did this study set out to answer?

The research aims to create a standardized framework for optimizing feature representation and model selection in microbiome sequencing data.

February 6, 2026 | Open Access

Target-driven optimization of feature representation and model selection for microbiome sequencing data with ritme

Key Points

Developed the ritme software package for combined algorithm selection and hyperparameter optimization.
Systematically explored feature engineering methods: taxonomic aggregation, sparsity-aware selection, compositional transforms, and metadata enrichment.
Applied ritme to three real-world microbiome datasets to assess performance against existing pipelines.
ritme outperforms the original study pipelines and generic AutoML baselines on the predictive tasks.
Provided insights into the impact of feature and model choices on predictive performance.
Demonstrated the effectiveness of combined approaches in enhancing model accuracy.
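Three of the feature engineering families named in the key points (taxonomic aggregation, sparsity-aware selection, and compositional transforms) can be illustrated with a minimal sketch. This is not ritme's API: the function names, the toy taxonomy, the 0.5 pseudocount, and the prevalence threshold are all assumptions chosen for illustration, and the centered log-ratio (CLR) stands in for whichever compositional transforms the package actually offers.

```python
import math
from collections import defaultdict


def aggregate_by_rank(counts, taxonomy, rank):
    """Taxonomic aggregation: sum counts of features sharing a label at `rank`."""
    totals = defaultdict(float)
    for feature, count in counts.items():
        totals[taxonomy[feature][rank]] += count
    return dict(totals)


def prevalent_features(samples, min_prevalence):
    """Sparsity-aware selection: keep features that are non-zero in at
    least `min_prevalence` (a fraction) of the samples."""
    features = sorted({f for s in samples for f in s})
    keep = []
    for f in features:
        present = sum(1 for s in samples if s.get(f, 0) > 0)
        if present / len(samples) >= min_prevalence:
            keep.append(f)
    return keep


def clr(counts, pseudocount=0.5):
    """Compositional transform: centered log-ratio with a pseudocount
    (hypothetical default) so that zero counts have a finite log."""
    logs = {f: math.log(c + pseudocount) for f, c in counts.items()}
    mean_log = sum(logs.values()) / len(logs)
    return {f: v - mean_log for f, v in logs.items()}


# Toy data (hypothetical): three sequence variants across two samples.
taxonomy = {
    "asv1": {"genus": "Bacteroides"},
    "asv2": {"genus": "Bacteroides"},
    "asv3": {"genus": "Prevotella"},
}
samples = [
    {"asv1": 10, "asv2": 0, "asv3": 5},
    {"asv1": 4, "asv2": 0, "asv3": 0},
]

genus_counts = aggregate_by_rank(samples[0], taxonomy, "genus")
kept = prevalent_features(samples, min_prevalence=0.5)
clr_values = clr(genus_counts)
```

Each function is a separate, swappable step, which is what makes it possible to treat the choice of representation as something to search over rather than fix in advance.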

Abstract

Microbiome sequencing datasets are sparse, high-dimensional, compositional, and hierarchically structured. Predictive modelling from these data typically relies on ad hoc choices of feature representation, obscuring their impact on performance and biological interpretation. A standardized, compute-efficient framework is needed to jointly optimize microbial feature representation and model algorithms with transparent model evaluation. Here, we present ritme, an open-source software package implementing Combined Algorithm Selection and Hyperparameter Optimization tailored to microbial sequencing data. ritme systematically explores feature engineering methods (taxonomic aggregation, sparsity-aware selection, compositional transforms, and metadata enrichment) alongside diverse model classes using state-of-the-art optimizers and model trackers. Applied to three real-world use cases, ritme outperforms original study pipelines and generic AutoML baselines. It further provides users with insights into how feature and model choices drive predictive performance. Together, these results establish ritme as a standardized framework for identifying optimal feature-model combinations from high-throughput sequencing data. ritme is an open-source Python package available at https://github.com/adamovanja/ritme.
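The idea of Combined Algorithm Selection and Hyperparameter Optimization can be sketched as a single search over a joint space of feature representation, model class, and that model's hyperparameters. The sketch below is an illustration under stated assumptions, not ritme's implementation: plain random search replaces the state-of-the-art optimizers mentioned in the abstract, the two toy models and three representations are hypothetical, and the data are invented.

```python
import math
import random


def transform(rows, name):
    """Apply one feature representation to every sample (a row of counts)."""
    if name == "relative":  # total-sum scaling to proportions
        return [[v / (sum(r) or 1.0) for v in r] for r in rows]
    if name == "log":  # log1p of the raw counts
        return [[math.log1p(v) for v in r] for r in rows]
    return [[float(v) for v in r] for r in rows]  # "raw"


def knn_predict(train_x, train_y, x, k):
    """Average the targets of the k nearest training samples."""
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], x))
    nearest = order[:k]
    return sum(train_y[i] for i in nearest) / len(nearest)


def mean_predict(train_x, train_y, x, shrink):
    """Baseline: training mean shrunk toward zero by `shrink` in [0, 1]."""
    return (1.0 - shrink) * sum(train_y) / len(train_y)


# Joint search space: each model class has its own hyperparameter sampler.
MODELS = {"knn": knn_predict, "mean": mean_predict}
SAMPLERS = {
    "knn": lambda rng: {"k": rng.randint(1, 3)},
    "mean": lambda rng: {"shrink": rng.uniform(0.0, 0.5)},
}
REPRESENTATIONS = ["raw", "relative", "log"]


def rmse(pred, true):
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, true)) / len(true))


def cash_random_search(train, valid, n_trials=30, seed=0):
    """Sample (representation, model, hyperparameters) jointly and keep
    the configuration with the lowest validation RMSE."""
    rng = random.Random(seed)
    (train_rows, train_y), (valid_rows, valid_y) = train, valid
    trials = []
    for _ in range(n_trials):
        rep = rng.choice(REPRESENTATIONS)
        model = rng.choice(sorted(MODELS))
        params = SAMPLERS[model](rng)
        tx, vx = transform(train_rows, rep), transform(valid_rows, rep)
        preds = [MODELS[model](tx, train_y, x, **params) for x in vx]
        trials.append({"representation": rep, "model": model,
                       "params": params, "rmse": rmse(preds, valid_y)})
    best = min(trials, key=lambda t: t["rmse"])
    return best, trials


# Invented toy data: rows are samples, columns are feature counts.
train = ([[10, 0, 5], [0, 8, 1], [3, 3, 3], [12, 1, 0]], [1.0, 0.2, 0.5, 0.9])
valid = ([[9, 1, 4], [1, 7, 2]], [0.95, 0.25])
best, trials = cash_random_search(train, valid)
```

Keeping every trial, not just the winner, is what allows the kind of post hoc inspection the abstract describes: the `trials` list can be grouped by representation or model to see which choices drive the validation score.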

Cite This Study

Adamov et al. studied this question.

synapsesocial.com/papers/698586498f7c464f2300a44f https://doi.org/10.3929/ethz-c-000792998
