What question did this study set out to answer?

The research aims to create an adaptive method for assessing Chinese teaching effectiveness using multimodal machine learning.

March 31, 2026 | Open Access

Comprehensive evaluation of Chinese teaching effectiveness supported by multimodal machine learning

Key Points

The research aims to create an adaptive method for assessing Chinese teaching effectiveness using multimodal machine learning.
Developed a hybrid model named WHBadger-CatBoost for assessment.
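Constructed a multimodal dataset of 1000 classroom samples from video recordings, audio interactions, images, and student-written submissions.
Applied min–max-scaled spectrogram enhancement to audio, and tokenization with stop-word removal to text, before analysis.
Utilized PCA-based embedding to capture speech dynamics, facial expressions, and gesture-based engagement indicators.
Tuned CatBoost hyperparameters with the WHBadger algorithm to improve classification accuracy while limiting overfitting.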
Achieved 0.95 accuracy in predicting teaching effectiveness.
Demonstrated enhanced reliability in educational assessments.
Supported continuous improvement in instructional methodologies through data-driven insights.

Abstract

Chinese language education is increasingly vital for global exchange and collaboration. Traditional evaluation methods that rely on subjective observation and basic metrics overlook multimodal classroom dynamics, while existing automated models lack adaptability and accuracy in complex, data-driven learning environments. This study aims to develop an adaptive method capable of accurately assessing Chinese teaching effectiveness through machine learning-based multimodal analysis. A hybrid optimization and classification model named Weighted Honey Badger Optimized Categorical Boosting (WHBadger-CatBoost) is introduced, integrating metaheuristic optimization with gradient boosting to enhance predictive performance and interpretability. The WHBadger-CatBoost-driven multimodal framework combines video, audio, and textual cues to measure engagement, comprehension, and instructional quality. A multimodal dataset comprising 1000 classroom samples was constructed from video recordings, audio interactions, images, and student-written submissions, representing diverse learning contexts and participant demographics. Spectrogram enhancement using min–max scaling was applied to audio data; tokenization and stop-word removal were employed for text; and frame sampling with resizing refined visual inputs. Principal Component Analysis (PCA)-based embedding captured speech dynamics, facial expressions, and gesture-based engagement indicators. The WHBadger algorithm optimized CatBoost’s hyperparameters, balancing exploration and exploitation to improve classification accuracy while minimizing overfitting. The framework was implemented in Python using Scikit-learn, CatBoost, and NumPy. The model achieved an accuracy of 0.95, demonstrating its reliability in educational assessment. The proposed multimodal ML framework enables comprehensive, adaptive, and data-driven evaluation of Chinese teaching effectiveness, supporting continuous improvement in instructional methodologies.
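The preprocessing steps named in the abstract (min–max scaling of spectrograms, tokenization with stop-word removal, and frame sampling) can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the stop-word list, and the sampling step are all assumptions made for the example. English text is used for simplicity; Chinese text would first need a word segmenter, which the abstract does not specify.

```python
import re

# Hypothetical stop-word list for illustration; the paper does not publish one.
STOP_WORDS = {"the", "a", "an", "is", "of", "and", "to", "in"}


def min_max_scale(spectrogram):
    """Rescale every value of a 2-D spectrogram (list of rows) into [0, 1]."""
    flat = [v for row in spectrogram for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        # Constant input: avoid division by zero and map everything to 0.0.
        return [[0.0 for _ in row] for row in spectrogram]
    return [[(v - lo) / (hi - lo) for v in row] for row in spectrogram]


def tokenize(text):
    """Lowercase the text, split it into word tokens, and drop stop words."""
    tokens = re.findall(r"\w+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def sample_frames(frames, step):
    """Keep every `step`-th frame of a video, starting with the first."""
    return frames[::step]
```

For example, `tokenize("The teacher is explaining the lesson")` keeps only the content words, and `sample_frames(frames, 4)` retains one frame in four. Resizing of the sampled frames, PCA embedding, and the WHBadger hyperparameter search are outside the scope of this sketch.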

Cite This Study

Yaling Tang (Sun,) studied this question.

synapsesocial.com/papers/69cb645fe6a8c024954b8a30 https://doi.org/10.1007/s44163-026-01139-w