Random Forest achieved an accuracy of 97.98% in classifying heartbeats compared to 96.75% accuracy using Gradient Boosted Trees.
Can machine learning algorithms (Gradient-Boosted Trees and Random Forest) accurately classify ECG heartbeats into normal and abnormal categories?
Machine learning algorithms, particularly Random Forest, implemented on a big data framework can classify ECG heartbeats with high accuracy, offering a potential tool for automated arrhythmia diagnosis.
Effect estimate: not reported (95% CI not reported)
Accuracy (Random Forest vs Gradient-Boosted Trees): 97.98% vs 96.75%
p-value: not reported
Abstract: This study proposes an electrocardiogram (ECG) classification approach using machine learning based on several ECG features. An ECG is a signal that measures the electrical activity of the heart. The approach is implemented using MLlib, Apache Spark's scalable machine learning library, and the Scala language on the Apache Spark framework. The key challenge in ECG classification is handling the irregularities in ECG signals, which is essential for detecting the patient's status. We therefore propose an efficient approach that classifies ECG signals with high accuracy. Each heartbeat is a combination of action impulse waveforms produced by different specialized cardiac tissues. Heartbeat classification is difficult because these waveforms differ from one person to another; they are described by a set of features, which serve as the inputs to the machine learning algorithm. In general, Spark–Scala tools simplify the use of many algorithms, including machine learning (ML) algorithms, and Spark–Scala is preferred over other tools when the volume of data to process is very large. In our case, we used a dataset with 205,146 records to evaluate the performance of our approach. The machine learning libraries in Spark–Scala provide straightforward implementations of many classification algorithms (Decision Tree, Random Forest, Gradient-Boosted Trees (GDB), etc.). The proposed method is evaluated and validated on the baseline MIT-BIH Arrhythmia and MIT-BIH Supraventricular Arrhythmia databases. The results show that our approach achieved an overall accuracy of 96.75% using the GDB Tree algorithm and 97.98% using Random Forest for binary classification. For multi-class classification, it achieved 98.03% accuracy using Random Forest; the Gradient-Boosted Trees implementation supports only binary classification.
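The abstract reports accuracy for both a binary (normal vs abnormal) and a multi-class setting. As a rough illustration of how those two figures relate, the sketch below computes accuracy on a handful of made-up labels, first as given and then after collapsing them to two classes. It is plain Python rather than the Spark–Scala pipeline the study used, and it makes several assumptions: the label codes, the choice of "N" as the normal class, and the helper names `accuracy` and `to_binary` are all hypothetical and do not come from the paper.

```python
from typing import List, Sequence


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of heartbeats whose predicted label equals the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("label sequences must have the same length")
    if len(y_true) == 0:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def to_binary(labels: Sequence[str], normal: str = "N") -> List[str]:
    """Collapse multi-class labels into 'normal' vs 'abnormal'.

    Assumption: a single code (here "N") marks normal beats and every
    other code counts as abnormal.
    """
    return ["normal" if lab == normal else "abnormal" for lab in labels]


# Hypothetical toy labels for illustration only; not data from the study.
y_true = ["N", "N", "V", "S", "N", "F", "V", "N"]
y_pred = ["N", "N", "V", "V", "N", "F", "N", "N"]

multi_acc = accuracy(y_true, y_pred)
binary_acc = accuracy(to_binary(y_true), to_binary(y_pred))

print(f"multi-class accuracy: {multi_acc:.4f}")
print(f"binary accuracy:      {binary_acc:.4f}")
```

In this toy case a beat predicted as the wrong abnormal class counts as an error in the multi-class score but as correct in the binary score, so the two figures need not move together. The study's own results come from different trained models, so the sketch says nothing about why its reported numbers differ.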
Alarsan et al. conducted a study on arrhythmia classification (n=205,146 records). Random Forest vs. Gradient-Boosted Trees was evaluated on classification accuracy (effect estimate, 95% CI, and p-value not reported). Random Forest achieved an accuracy of 97.98% in classifying heartbeats compared to 96.75% accuracy using Gradient-Boosted Trees.