What does this research mean for the field?

The Adaptive Extremely Random Trees via Reinforcement Learning (AERT-RL) model achieves approximately 90% accuracy in predicting demographic age ranges from categorical voice data. Novelty: methodological. Consensus alignment: neutral.

What question did this study set out to answer?

The study aims to develop a machine learning method for predicting demographic age ranges from verbal data.

March 8, 2026

Demographic age range detection using quantized categorical verbal data

Key Points

The study aims to develop a machine learning method for predicting demographic age ranges from verbal data.
Categorical verbal data from voice assistants was converted to numerical form for analysis.
The Borderline-Synthetic Minority Oversampling Technique (Borderline-SMOTE) was applied to manage class imbalance.
Mutual information was used to identify effective features for the model.
The Adaptive Extremely Random Trees via Reinforcement Learning (AERT-RL) model was developed and compared with seven ensemble models.
The AERT-RL model achieved approximately 90% accuracy in demographic age range prediction.
A Kappa statistic of 0.86 indicates a high level of agreement between predicted and actual age ranges.
An F-score of 0.89 reflects the model's balance between precision and recall in predicting age ranges.
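
To make the two reported metrics concrete, the following is a minimal sketch of how a Kappa statistic and an F-score can be computed from predicted and actual labels. It is not the study's evaluation code: the age-range labels and predictions are hypothetical toy data, the function names are illustrative, and macro averaging of the F-score is an assumption, since the summary does not say which averaging scheme was used.

```python
from collections import Counter


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(y_true)
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Chance agreement: probability that both pick the same class independently.
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if expected == 1.0:
        return float("nan")  # undefined when only one class ever occurs
    return (observed - expected) / (1.0 - expected)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (macro averaging is an assumption)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical age-range labels, for illustration only.
y_true = ["18-24", "18-24", "25-34", "25-34", "35-44", "35-44"]
y_pred = ["18-24", "18-24", "25-34", "35-44", "35-44", "35-44"]

kappa = cohen_kappa(y_true, y_pred)
f1 = macro_f1(y_true, y_pred)
print(f"kappa={kappa:.3f}  macro-F1={f1:.3f}")
```

Kappa discounts the agreement that would be expected by chance, which is why it is typically lower than raw accuracy on the same predictions.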

Abstract

Demographic age range prediction from voice assistant data is crucial for developing personalized user applications. This study introduces the Adaptive Extremely Random Trees via Reinforcement Learning (AERT-RL) model, a new method that uses adaptive hyperparameter optimization to improve prediction performance. Categorical verbal data from voice assistants was first converted to numerical representations to facilitate processing. The Borderline-Synthetic Minority Oversampling Technique was then used to address class imbalance, and mutual information (MI) was used to identify effective features. In the classification phase, seven ensemble models were run and their results compared. The proposed AERT-RL model outperformed all of these classifiers, reaching approximately 90% accuracy with a Kappa of 0.86 and an F-score of 0.89. The results demonstrate that reinforcement learning overcomes the limitations of traditional optimization techniques, enabling adaptive and robust optimization of model parameters. They also demonstrate that integrating the three stages (data quantization, MI-based feature extraction, and the AERT-RL model) yields more effective performance. In brief, this research presents an efficient computational model for demographic age range detection from categorical voice data, and the proposed AERT-RL model can serve as a powerful alternative to traditional natural language processing methods.
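
The first and third preprocessing stages described above, converting categorical values to numbers and ranking features by mutual information, can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the feature names (`music_genre`, `time_of_day`) and rows are hypothetical, simple integer coding stands in for the paper's unspecified quantization scheme, and the oversampling and AERT-RL stages are omitted.

```python
import math
from collections import Counter


def quantize(column):
    """Map each distinct category to an integer code (assumed ordinal coding)."""
    codes = {value: i for i, value in enumerate(sorted(set(column)))}
    return [codes[v] for v in column], codes


def mutual_information(xs, ys):
    """Mutual information, in nats, between two discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log(c * n / (px[x] * py[y]))
    return mi


# Hypothetical voice-assistant features and age-range labels.
features = {
    "music_genre": ["pop", "pop", "rock", "rock", "jazz", "jazz"],
    "time_of_day": ["am", "pm", "am", "pm", "am", "pm"],
}
age_range = ["18-24", "18-24", "25-34", "25-34", "35-44", "35-44"]

# Stage 1: quantize every categorical column into integer codes.
encoded = {name: quantize(col)[0] for name, col in features.items()}

# Stage 3: score each encoded feature against the label and rank by score.
scores = {name: mutual_information(col, age_range) for name, col in encoded.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking, scores)
```

In this toy data `music_genre` determines the age range exactly, so its score equals the label entropy, while `time_of_day` is independent of the label and scores zero. A feature selector would keep the former and drop the latter.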
