March 3, 2026 | Open Access

A Privacy-Preserving Classification Framework for Multi-Class Imbalanced Data Using Geometric Oversampling and Homomorphic Encryption

Key Points

Prediction accuracy in ciphertext reached 93.44%, demonstrating effective data classification.
The G-MSMOTE method generates diverse data, addressing multiple minority class imbalances.
Neural networks perform classification tasks on encrypted data, ensuring user privacy and security.
An improved FV scheme combined with CRT-based encoding increases encoding efficiency.
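
The role of the CRT in the last point can be illustrated with plain integer arithmetic. The sketch below is not the paper's encoder and involves no encryption. It only shows, under assumed toy moduli, how the Chinese Remainder Theorem packs several small values into one integer so that a single addition or multiplication acts on every slot at once. The names `crt_pack`, `crt_unpack`, and `MODULI` are hypothetical.

```python
from math import prod

def crt_pack(residues, moduli):
    """Combine residues r_i (mod m_i) into one integer x with x % m_i == r_i."""
    M = prod(moduli)
    x = 0
    for r, m in zip(residues, moduli):
        Mi = M // m
        # pow(Mi, -1, m) is the modular inverse of Mi mod m (Python 3.8+)
        x += r * Mi * pow(Mi, -1, m)
    return x % M

def crt_unpack(x, moduli):
    """Recover the per-slot values from the packed integer."""
    return [x % m for m in moduli]

MODULI = [5, 7, 11]  # hypothetical pairwise-coprime moduli, chosen for readability
M = prod(MODULI)

a = crt_pack([2, 3, 4], MODULI)
b = crt_pack([1, 5, 6], MODULI)

# One operation on the packed integers updates all three slots.
slot_sums = crt_unpack((a + b) % M, MODULI)      # [3, 1, 10]
slot_products = crt_unpack((a * b) % M, MODULI)  # [2, 1, 2]
```

In a homomorphic encryption setting the same idea is typically applied to the plaintext space, so one ciphertext operation processes many values; how the paper adapts this to FV is not specified on this page.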

Abstract

Data classification based on deep neural networks and machine learning is increasingly used in fields such as medicine, finance, and data circulation. These applications must guarantee not only prediction accuracy but also the privacy and security of the prediction data and the models. In an untrusted cloud environment, users are reluctant to use cloud-hosted classification services. To address these problems, this paper studies data oversampling and proposes the G-MSMOTE method, which oversamples multiple minority classes in a dataset, generates more diverse data, and mitigates data imbalance. The scheme improves the traditional FV scheme and uses CRT technology to improve encoding efficiency. The cloud receives the user’s encrypted ciphertext, and the neural network performs the prediction directly on the ciphertext, thereby providing confidentiality for user data and model parameters under the semi-honest adversarial model, assuming the security of the underlying fully homomorphic encryption scheme and accepting the leakage of model architecture and ciphertext sizes. The feasibility of the method was demonstrated through comparative experiments. We created imbalanced cases based on the MNIST dataset and compared performance on plaintext and ciphertext. On the balanced dataset, the model’s prediction accuracy on ciphertext reached 93.44%. In the imbalanced case, after preprocessing with our improved G-MSMOTE algorithm, the model’s prediction accuracy on ciphertext increased by at least 10%. These results show that our scheme can efficiently, accurately, and securely (under the semi-honest model) complete the data classification prediction task.
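
To make the oversampling step concrete, here is a minimal sketch of SMOTE-style interpolation applied to every minority class. It is not G-MSMOTE: the abstract does not describe the geometric step, so this uses the classic technique of placing a synthetic point on the segment between a sample and one of its nearest same-class neighbours. The function name `oversample_minorities`, the parameter `k`, and the toy data are assumptions made for illustration.

```python
import random
from collections import defaultdict

def oversample_minorities(X, y, k=3, seed=0):
    """Raise every class to the majority count using SMOTE-style interpolation.

    Classic SMOTE applied per class; an illustrative stand-in, NOT G-MSMOTE.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for features, label in zip(X, y):
        by_class[label].append(features)
    target = max(len(rows) for rows in by_class.values())

    X_out, y_out = list(X), list(y)
    for label, rows in by_class.items():
        for _ in range(target - len(rows)):
            base = rng.choice(rows)
            others = [r for r in rows if r is not base]
            if not others:
                # A class with a single sample can only be duplicated.
                X_out.append(list(base))
                y_out.append(label)
                continue
            # Keep the k nearest same-class neighbours of the base point.
            others.sort(key=lambda r: sum((a - b) ** 2 for a, b in zip(base, r)))
            neighbour = rng.choice(others[:k])
            t = rng.random()  # position along the segment base -> neighbour
            X_out.append([a + t * (b - a) for a, b in zip(base, neighbour)])
            y_out.append(label)
    return X_out, y_out

# Toy data: class 0 is the majority, classes 1 and 2 are minorities.
X = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.3, 0.3],
     [5.0, 5.0], [5.2, 5.1],
     [9.0, 1.0], [9.1, 1.2], [9.3, 0.9]]
y = [0, 0, 0, 0, 1, 1, 2, 2, 2]
X_bal, y_bal = oversample_minorities(X, y)
```

After the call, each of the three classes has four samples, and every synthetic point lies between two original points of its own class. Oversampling of this kind would run on plaintext training data before the model is trained.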


Cite This Study

Lu et al. studied this question.

synapsesocial.com/papers/69a75b72c6e9836116a22c15 https://doi.org/10.3390/app16031283
