What question did this study set out to answer?

The aim is to improve speaker gender recognition accuracy using a novel deep learning technique.

April 1, 2026

Speaker Gender Recognition Using Optimized Multi-Component Attention Graph Convolutional Neural Network with EfficientNetB7

Key Points

The aim is to improve speaker gender recognition accuracy using a novel deep learning technique.
Utilized the Mozilla Common Voice dataset for data collection.
Employed a Multi-Component Attention Graph Convolutional Neural Network (MAGCNN) for feature extraction.
Integrated EfficientNetB7 with MAGCNN for gender classification.
Applied the Shrike Optimization Algorithm to optimize model parameters.
The proposed method achieved improved accuracy and precision compared to traditional approaches.
Demonstrated robustness against noisy environments and ambiguous voices.
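
The accuracy and precision named in the key points can be made concrete with a small sketch. This is only an illustration of the standard definitions, not the study's evaluation code. The label strings, the choice of "female" as the positive class, and the toy predictions are all assumptions made for the example.

```python
from typing import Sequence, Tuple


def confusion_counts(
    y_true: Sequence[str], y_pred: Sequence[str], positive: str = "female"
) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) for a binary task, treating `positive` as the positive class."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of all predictions that match the true label."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    total = tp + fp + tn + fn
    return (tp + tn) / total if total else 0.0


def precision(y_true: Sequence[str], y_pred: Sequence[str], positive: str = "female") -> float:
    """Fraction of positive predictions that are actually positive."""
    tp, fp, _, _ = confusion_counts(y_true, y_pred, positive)
    return tp / (tp + fp) if (tp + fp) else 0.0


# Hypothetical toy labels, not data from the study.
y_true = ["female", "female", "male", "male", "male"]
y_pred = ["female", "male", "male", "male", "female"]
print(accuracy(y_true, y_pred), precision(y_true, y_pred))
```

Precision depends on which class is treated as positive, so a full report would typically give it per class alongside overall accuracy.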

Abstract

Speaker gender recognition (SGR) identifies a speaker’s gender from voice characteristics and is used in speech synthesis, voice assistants and human–computer interaction. Traditional methods rely only on features such as pitch, whereas recent approaches use deep learning for better accuracy. However, challenges remain, including robustness in noisy environments, handling of ambiguous voices and high accuracy across languages. Model bias and ethical concerns also pose obstacles to real-world deployment. To address these drawbacks, this paper proposes Speaker Gender Recognition using an Optimized Multi-Component Attention Graph Convolutional Neural Network with EfficientNetB7 (SGR-MAGCNN-EffNetB7). The input data are taken from the Mozilla Common Voice dataset. A Multi-Component Attention Graph Convolutional Neural Network (MAGCNN) extracts features from the collected data, and the extracted features are passed to EfficientNetB7, which classifies the speaker’s gender as male or female. EfficientNetB7 is integrated by replacing the convolutional layer of MAGCNN while retaining its dense layers for classification. Finally, the shrike optimization algorithm (SHOA) is used to optimize the weight parameters of MAGCNN-EffNetB7. The simulation results show that the proposed SGR-MAGCNN-EffNetB7 approach achieves higher accuracy and precision than existing methods.
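
To illustrate the kind of pitch-only baseline that the abstract contrasts with deep learning, the sketch below estimates fundamental frequency by picking the autocorrelation peak and applies a single threshold. It is not the method proposed in the study. The function names, the 60 to 400 Hz search range, the 165 Hz cutoff, and the synthetic sine tones are all assumptions chosen for the example.

```python
import math
from typing import List, Sequence


def sine_tone(freq_hz: float, sample_rate: int, n_samples: int) -> List[float]:
    """Generate a synthetic sine wave (a stand-in for a voiced speech frame)."""
    return [math.sin(2.0 * math.pi * freq_hz * i / sample_rate) for i in range(n_samples)]


def estimate_pitch_hz(
    samples: Sequence[float], sample_rate: int, fmin: float = 60.0, fmax: float = 400.0
) -> float:
    """Estimate fundamental frequency from the lag with the largest autocorrelation."""
    min_lag = int(sample_rate / fmax)
    max_lag = int(sample_rate / fmin)
    n = len(samples)
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Unnormalized autocorrelation at this lag.
        score = sum(samples[i] * samples[i + lag] for i in range(n - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag


def pitch_baseline_label(pitch_hz: float, threshold_hz: float = 165.0) -> str:
    """Toy rule: the threshold is an illustrative assumption, not a value from the study."""
    return "female" if pitch_hz >= threshold_hz else "male"


SAMPLE_RATE = 8000  # assumed sampling rate for the synthetic frames
low = estimate_pitch_hz(sine_tone(125.0, SAMPLE_RATE, 2048), SAMPLE_RATE)
high = estimate_pitch_hz(sine_tone(250.0, SAMPLE_RATE, 2048), SAMPLE_RATE)
print(low, pitch_baseline_label(low))
print(high, pitch_baseline_label(high))
```

A single threshold like this fails on exactly the cases the abstract lists as open challenges, such as noisy recordings and voices whose pitch falls near the cutoff, which is the motivation for learned features.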
