Abstract
Objectives: To propose a novel deep learning based Natural Language Processing (NLP) model for automated epilepsy diagnosis from clinical Electronic Health Record (EHR) data. The core objective is to extract domain-specific features from an EHR dataset so that the model can classify patients accurately from contextual cues.
Methods: A deep learning based Enhanced Bidirectional Long Short-Term Memory (BiLSTM) network is employed to learn contextual patterns in both the forward (past) and backward (future) directions of the text. NLP pre-processing techniques, such as Tokenization (TK), Lemmatization (LM), and Named Entity Recognition (NER), extract domain-specific features from unstructured EHR notes. To represent clinical terms in a deep contextual format, ClinicalBERT embeddings are generated and used to train the model. EHR clinical data were sourced from the MIMIC-III/IV datasets, which cover 60,000 ICU patients and millions of clinical note records. In addition, EEG signals from the CHB-MIT Scalp EEG database, obtained from PhysioNet, were used to support thorough validation. The model was trained with cross-entropy loss and optimized using the Adam optimizer. The performance of NLP-BiLSTM is evaluated using MATLAB and TensorFlow, and the results are compared with prevailing machine learning models such as XGBoost (XGB), Random Forest (RF), and Support Vector Machine (SVM).
Findings: The proposed NLP-BiLSTM model outperformed the baseline models on all performance metrics, with a prediction accuracy of 92.4%, precision of 91.4%, F1-score of 92.4%, sensitivity of 91.4%, specificity of 91.6%, and an AUC-ROC of 0.89, a significant improvement over baseline machine learning based epileptic discrimination models.
Novelty: This computational research integrates deep NLP-derived contextual embeddings with a BiLSTM architecture to interpret free-text EHR data, enabling real-time, automated, and clinically interpretable epilepsy diagnosis without relying on manual review or solely on structured inputs.
Keywords: Epilepsy Detection, Bidirectional LSTM, Deep Learning, Natural Language Processing, Electronic Health Record, Automated Diagnosis
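The threshold-based metrics named under Findings (accuracy, precision, sensitivity, specificity, F1-score) all follow from the four cells of a binary confusion matrix. The sketch below shows the standard definitions only; it is not the authors' evaluation code, and the `ConfusionMatrix` class, the `classification_metrics` function, and the example counts are hypothetical names and values introduced for illustration. AUC-ROC is omitted because it requires the ranked prediction scores rather than thresholded labels.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    """Counts for a binary classifier (positive class = epilepsy)."""
    tp: int  # true positives
    fp: int  # false positives
    tn: int  # true negatives
    fn: int  # false negatives


def _ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(cm: ConfusionMatrix) -> dict:
    """Compute the standard threshold-based metrics from a confusion matrix."""
    total = cm.tp + cm.fp + cm.tn + cm.fn
    precision = _ratio(cm.tp, cm.tp + cm.fp)
    sensitivity = _ratio(cm.tp, cm.tp + cm.fn)  # also called recall
    specificity = _ratio(cm.tn, cm.tn + cm.fp)
    # F1 is the harmonic mean of precision and sensitivity.
    f1 = _ratio(2 * precision * sensitivity, precision + sensitivity)
    return {
        "accuracy": _ratio(cm.tp + cm.tn, total),
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


# Hypothetical counts, chosen only to exercise the formulas.
example = ConfusionMatrix(tp=80, fp=20, tn=90, fn=10)
metrics = classification_metrics(example)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Working from the confusion matrix keeps every metric traceable to the same four counts, which makes it easy to check that a set of reported values is mutually consistent.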
Leeshma et al. (Tue,) studied this question.