ABSTRACT A large share of people in the modern world struggle with mental illness, which also contributes to a wide variety of physical ailments. Depression has become a prevalent disorder that affects the mental state of individuals, and many people continue to suffer from it because of a lack of diagnosis and awareness. Depression can lead to sleep disorders, self‐harm, and reckless behavior. To identify people with depression, numerous researchers have developed machine learning (ML) based frameworks that draw on patient interviews. However, those methods suffer from low detection accuracy and poor efficiency, so identifying individuals affected by depression remains very challenging. This research therefore proposes a novel deep learning (DL) assisted depression detection framework using the DAIC‐WOZ dataset. The pre‐processed audio signals are augmented using a hybrid Synthetic Minority Oversampling Technique (SMOTE) assisted Conditional Generative Adversarial Network (cGAN) to increase the number of samples. A novel dilated convolutional integrated absolute encoding based vision transformer with Frequency Self‐attention (DCA‐AViT) is introduced to extract deep features. Handcrafted features are then fused with the deep features through a soft attention layer to provide better feature representations. To classify depression, a locality attention based fully connected Shuffle‐Net (LA‐FCS) is proposed, which detects psychological distress conditions. In experiments, the proposed framework attained an accuracy of 98.40%, precision of 98.45%, recall of 98.41%, specificity of 99.58%, and F1‐Score of 98.43%, demonstrating the efficacy of the proposed model. Moreover, in performance validation, the proposed technique outperformed all existing baseline models in depression detection.
Kumar et al. (Wed,) studied this question.
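As a rough illustration of the oversampling idea named in the abstract, the interpolation step of plain SMOTE can be sketched as follows. This is a minimal sketch only: it does not reproduce the hybrid SMOTE assisted cGAN described above, and the function name `smote`, its parameters, and the toy 2-D feature vectors are assumptions made for illustration rather than details taken from the paper.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic samples by interpolating between minority
    samples and their k nearest minority neighbours (plain SMOTE)."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        x = minority[i]
        # Rank the other minority samples by Euclidean distance to x.
        others = [m for j, m in enumerate(minority) if j != i]
        others.sort(key=lambda m: math.dist(x, m))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        # The new point lies on the segment between x and its neighbour.
        synthetic.append([a + gap * (b - a) for a, b in zip(x, neighbour)])
    return synthetic


# Hypothetical 2-D feature vectors standing in for per-clip audio features.
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_samples = smote(minority, n_new=6, k=2)
print(len(new_samples))
```

Because every synthetic point lies on a segment between two real minority samples, the new data stays inside the region the minority class already occupies. A generative model such as a cGAN is one way to add variety beyond those segments, which may be the motivation for the hybrid design.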