Data preprocessing is an essential step in real-world data analysis, particularly for AI-driven applications in modern communication systems, where data quality directly affects learning efficiency and decision-making accuracy. Missing values (MVs) are common in datasets collected from communication networks and distributed intelligent environments, and they must be handled with appropriate imputation methods during preprocessing, both to improve the accuracy of data mining and artificial intelligence models and to ensure reliable, trustworthy AI-based communication services. To this end, this paper proposes a novel technique for obtaining high-quality data by effectively handling missing values in the dataset under consideration. The proposed algorithm relies primarily on linear regression and uses clustering to group closely related instances, which improves the precision of the imputation. The method is evaluated on four datasets of varying sizes and missing-value ratios, with missingness generated under three mechanisms: Missing Not at Random (MNAR), Missing at Random (MAR), and Missing Completely at Random (MCAR). It is compared with existing imputation techniques in terms of mean absolute error (MAE), root-mean-square error (RMSE), and the coefficient of determination (R² score). Experimental results show that the proposed method requires less computational time while achieving higher accuracy, making it suitable for data preprocessing in AI-enabled communication and intelligent network environments.
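The abstract names the ingredients (clustering, linear regression, and evaluation by MAE, RMSE, and R²) but not the details of the algorithm. The following is a minimal sketch of the general idea only, not the authors' method. It assumes a single numeric target column with NaN marking missing entries, fully observed remaining columns, plain k-means as the clustering step, and ordinary least squares as the regressor. All function names are hypothetical.

```python
import numpy as np

def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means (an assumed choice of clustering); returns centroids."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        labels = assign(X, centroids)
        for j in range(k):
            if np.any(labels == j):  # leave an empty cluster's centroid unchanged
                centroids[j] = X[labels == j].mean(axis=0)
    return centroids

def assign(X, centroids):
    """Index of the nearest centroid (Euclidean distance) for each row."""
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)

def fit_linear(X, y):
    """Ordinary least squares with an intercept term."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_linear(coef, X):
    return np.column_stack([np.ones(len(X)), X]) @ coef

def cluster_regression_impute(data, target_col, k=2, seed=0):
    """Fill NaNs in target_col using one regression model per cluster."""
    data = np.asarray(data, dtype=float)
    out = data.copy()
    features = np.delete(data, target_col, axis=1)  # assumed fully observed
    y = data[:, target_col]
    missing = np.isnan(y)
    labels = assign(features, kmeans(features, k, seed=seed))
    # Fallback model for clusters with too few complete rows.
    global_coef = fit_linear(features[~missing], y[~missing])
    min_rows = features.shape[1] + 2
    for j in range(k):
        train = (labels == j) & ~missing
        fill = (labels == j) & missing
        if not fill.any():
            continue
        if train.sum() >= min_rows:
            coef = fit_linear(features[train], y[train])
        else:
            coef = global_coef
        out[fill, target_col] = predict_linear(coef, features[fill])
    return out

def mae(y_true, y_pred):
    return float(np.mean(np.abs(y_true - y_pred)))

def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def r2_score(y_true, y_pred):
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return float(1.0 - ss_res / ss_tot)

if __name__ == "__main__":
    # Synthetic data: two groups with different linear relationships.
    rng = np.random.default_rng(1)
    x = np.concatenate([rng.normal(0, 1, 200), rng.normal(10, 1, 200)])
    y_full = np.where(x < 5, 2 * x + 1, -3 * x + 5) + rng.normal(0, 0.1, 400)
    mask = rng.random(400) < 0.2  # MCAR: missingness independent of the data
    y_obs = y_full.copy()
    y_obs[mask] = np.nan
    imputed = cluster_regression_impute(np.column_stack([x, y_obs]), target_col=1, k=2)
    print("MAE :", mae(y_full[mask], imputed[mask, 1]))
    print("RMSE:", rmse(y_full[mask], imputed[mask, 1]))
    print("R2  :", r2_score(y_full[mask], imputed[mask, 1]))
```

Fitting one model per cluster lets each group of similar instances have its own regression coefficients, which is the benefit the abstract attributes to the clustering step. The demonstration masks values completely at random (MCAR); simulating MAR or MNAR would require the mask to depend on observed or unobserved values, respectively.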
Mostafa et al. studied this question.