What question did this study set out to answer?

This work aims to enhance feature selection methods using CUDA implementations based on conditional mutual information.

June 4, 2026 · Open Access

CUDA acceleration of feature selection methods based on conditional mutual information

Key Points

Developed CUDA implementations of feature selection methods based on conditional mutual information, specifically the Interaction Capping (ICAP) method and the family of algorithms in the BetaGamma space.
Evaluated performance across two systems with different CPU and GPU types using four publicly available datasets.
Exploited the GPU memory hierarchy, asynchronous computation, and a reduced-precision approach to improve GPU performance.
GPU implementations were significantly faster than the fastest available multi-CPU parallel implementations in all tested scenarios using 32 or 64 CPU cores.
Achieved high performance with relatively low energy costs, highlighting efficiency in large dataset processing.
Analyzed the performance impact of each optimization technique and the combination of configuration parameters that best exploits the GPU architecture.

Abstract

Feature Selection (FS) is a fundamental data mining technique that improves machine learning models by eliminating irrelevant or redundant data, especially as dataset dimensions continue to grow. However, greedy FS methods can be computationally intensive, particularly for large datasets. This work focuses on the development of CUDA implementations for FS methods that use Conditional Mutual Information (CMI). Specifically, we address the Interaction Capping (ICAP) method and the family of algorithms that can be included in the BetaGamma space. The proposed implementations leverage memory hierarchy, asynchronous computation, and a reduced-precision approach to efficiently exploit the hardware of NVIDIA GPUs to achieve high performance with relatively low energy costs. The experimental evaluation has been carried out in two systems with different types and numbers of CPUs and GPUs using four publicly available datasets from different fields. We first provide insight into the performance impact of each optimization technique, and also into the combination of configuration parameters that best exploits the GPU architecture. Finally, the novel GPU implementations are compared to their fastest available counterparts (parallel implementations for multi-CPU systems), proving that the GPU versions are significantly faster than these multithreaded approaches in all scenarios using 32 or 64 CPU cores.
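To make the greedy, CMI-based selection described in the abstract more concrete, the following is a minimal CPU-side sketch in Python with NumPy. It is not the authors' CUDA implementation. It assumes discrete (already binned) features and uses the ICAP score as it is commonly stated in the feature selection literature, J(f) = I(f;Y) - sum over selected s of max(0, I(f;s) - I(f;s|Y)). The function names (entropy, mi, cmi, icap_select) are hypothetical and chosen only for illustration.

```python
import numpy as np

def entropy(*cols):
    """Shannon entropy (bits) of the joint distribution of discrete columns."""
    joint = np.stack(cols, axis=1)
    _, counts = np.unique(joint, axis=0, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def mi(x, y):
    """Mutual information: I(X;Y) = H(X) + H(Y) - H(X,Y)."""
    return entropy(x) + entropy(y) - entropy(x, y)

def cmi(x, y, z):
    """Conditional MI: I(X;Y|Z) = H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z)."""
    return entropy(x, z) + entropy(y, z) - entropy(x, y, z) - entropy(z)

def icap_select(X, y, k):
    """Greedy forward selection of k features using the ICAP score (assumed form)."""
    n_features = X.shape[1]
    relevance = [mi(X[:, f], y) for f in range(n_features)]
    penalty = np.zeros(n_features)  # accumulated capped redundancy per candidate
    selected = []
    remaining = set(range(n_features))
    for _ in range(min(k, n_features)):
        # Pick the candidate with the highest current score (lowest index on ties).
        best = max(sorted(remaining), key=lambda f: relevance[f] - penalty[f])
        selected.append(best)
        remaining.discard(best)
        # Update each remaining candidate's penalty against the new selection.
        for f in remaining:
            redundancy = mi(X[:, f], X[:, best]) - cmi(X[:, f], X[:, best], y)
            penalty[f] += max(0.0, redundancy)
    return selected

# Toy data: feature 1 duplicates feature 0, feature 2 carries the other bit of y.
y_demo = np.tile(np.arange(4), 25)
X_demo = np.column_stack([y_demo // 2, y_demo // 2, y_demo % 2])
print(icap_select(X_demo, y_demo, 2))
```

In this sketch the cost is dominated by the pairwise MI and CMI evaluations inside the inner loop, which are independent across candidate features. That independence is the kind of workload that lends itself to parallel execution, although how the paper maps it onto the GPU is not shown here.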


Cite This Study

Beceiro et al. studied this question.

synapsesocial.com/papers/6a21164cd499ed480b16f4be https://doi.org/10.1007/s10586-026-06076-y
