In this report, we present the update of the software development plan (SDP) and discuss the main novelties, concepts, and needs that have arisen in our community and groups. The SDP is still current, on schedule, and effective; the Lighthouse developers propose no significant updates.

Regarding new needs, concepts, and actions planned for the future, one crucial point concerns the emerging technologies and architectures to which developers are already planning to port the codes. To the extent possible, these plans should be integrated into the current MaX work on the codes.

Another point of concern is the arrival on the market of GPUs with massive throughput at lower precision. This has attracted the attention of WP1 developers for two main reasons. On the one hand, they want to avoid a loss of usable computational power on future GPU-accelerated machines; on the other hand, finding a way to exploit this throughput will bring advantages in terms of energy efficiency and may enable the use of entry-level GPUs, such as cards sold for gaming or AI inference. All groups are thus experimenting with these aspects, collecting use cases, and staying on the lookout for upcoming vendor libraries, which will very likely be the most efficient way to exploit low-precision GPU cores.

One common update to the plans is the development of, collaboration with the community on, and other types of support for the EasyBuild package manager. This builds upon the support for Spack that was already present in the first SDP. The main reason for this addition is the clear focus of WP1 developers on targeting, and deploying on, EuroHPC machines. Many of these machines have adopted EasyBuild as their default package manager, and it will very likely be the pivot for the distributed CI/CD on these systems, which is a long-standing target of WP1 together with WP3.
• QUANTUM ESPRESSO updates focus on optimizing GPU acceleration for small-size calculations; improving FFTXlib communication efficiency through NCCL integration and adaptive batching; refactoring band parallelism with non-blocking MPI communications; and enhancing build systems and CI/CD pipelines.
• SIESTA updates focus on enhancing ELPA library integration and mixed-precision arithmetic for reduced-precision hardware, and on improving parallel efficiency by optimising the MPI-GPU balance using MPS/NCCL frameworks and developing configurable MPI teams for k-point/spin parallelisation. Additional planned work includes streamlined deployment through Spack/EasyBuild and enhanced interoperability features, such as improved I/O handling and API enhancements.
• FLEUR updates aim at enhancing maintainability and scalability and at adding scientific capabilities, primarily based on linear response. The plan is to add mixed-precision arithmetic to the code, to refactor some program internals such as the charge density generation, and to GPU-accelerate the newly introduced DFPT features.
• BIGDFT focuses its updates on improving MPI communications by adding non-blocking reductions. The developers have also updated their plan to give increasingly more room to SYCL-based offloading.
• YAMBO main updates concern the refactoring of compute-intensive kernels (such as the calculation of the irreducible polarizability) to use level 3 BLAS operations, which in turn enables the future exploitation of reduced-precision hardware; new APIs for distributed linear algebra; the YamboPy interoperability tool; and new optimized property calculators for the scientific workflow, such as self-consistent GW, electron-ion dynamics, and convergence accelerators.
Delugas et al. (Mon,) studied this question.