This technical report documents the implementation of key infrastructure improvements to the Pleiadi@IRA HPC cluster, focusing on energy efficiency and resource monitoring. The primary improvements include: (1) an automatic power management system that dynamically controls compute node power states based on workload demand, achieving a 61% reduction in idle power consumption; and (2) a comprehensive monitoring infrastructure using Slurm accounting data exported to Prometheus and visualized through Grafana dashboards.
Building similarity graph...
Analyzing shared references across papers
Matteo GANDOLFI
Francesco BEDOSTI
Elia Gasperini
Building similarity graph...
Analyzing shared references across papers
GANDOLFI et al. (Thu,) studied this question.
www.synapsesocial.com/papers/69bb9336496e729e62981330 — DOI: https://doi.org/10.20371/inaf/techrep/374