This paper proposes an online emotional feedback system for distance education based on the Vision Transformer (ViT). The objective is to give teachers real-time information on students' emotional states so that teaching strategies can be adapted to improve the quality of education. In the proposed system, OpenCV is used to access the camera and capture video during class, and Dlib then detects faces in the captured frames. The acquired facial images are processed by a ViT model augmented with two attentive pooling ViT (APViT) modules to estimate students' emotions during class. The emotional states of the students are then analyzed and repeatedly fed back to the teachers. Simulations show that the APViT model achieves a training accuracy of 94.68% on the OL-SFED dataset, demonstrating the advantage of applying the ViT technique to distance education.
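The final feedback step described above could look roughly like the following. This is a minimal sketch, not the authors' implementation: it assumes the model emits one emotion label per detected face per frame, and that feedback is computed over a short window of such predictions. The function name `summarize_window`, the student identifiers, and the emotion labels are all hypothetical placeholders; the abstract specifies none of them.

```python
from collections import Counter, defaultdict


def summarize_window(predictions):
    """Aggregate per-frame emotion predictions over one feedback window.

    `predictions` is an iterable of (student_id, emotion_label) pairs,
    one pair per analyzed face image. Both the pair format and the label
    names are assumptions made for illustration.
    """
    # Count how often each emotion was predicted for each student.
    per_student = defaultdict(Counter)
    for student_id, emotion in predictions:
        per_student[student_id][emotion] += 1

    # Take the most frequent emotion as each student's state for the window.
    dominant = {
        student_id: counts.most_common(1)[0][0]
        for student_id, counts in per_student.items()
    }

    # Share of students in each state, which is what a teacher would see.
    total = len(dominant)
    state_counts = Counter(dominant.values())
    shares = {emotion: n / total for emotion, n in state_counts.items()}
    return dominant, shares


# Hypothetical window of predictions for two students.
window = [
    ("s1", "confused"), ("s1", "confused"), ("s1", "neutral"),
    ("s2", "neutral"), ("s2", "neutral"),
]
dominant, shares = summarize_window(window)
```

Calling such a function once per window would give the repeated feedback the abstract mentions. Majority voting within a window is only one possible smoothing choice; averaging class probabilities would be another.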
Wang et al. (Fri,) studied this question.