Knowledge tracing, the computational modeling of student learning progression through sequential educational interactions, is a critical component of adaptive learning systems and personalized education platforms. However, existing approaches face a fundamental trade-off between predictive accuracy and interpretability: deep sequence models excel at capturing complex temporal dependencies in student interaction data but lack transparency in their decision-making, while probabilistic graphical models provide interpretable causal relationships but struggle with the complexity of real-world educational sequences. We propose a hybrid architecture that integrates transformer-based sequence modeling with structured Bayesian causal networks to overcome this limitation. Our dual-pathway design employs a transformer encoder to capture temporal patterns in student interaction sequences, while a differentiable Bayesian network explicitly models prerequisite relationships between knowledge components. These pathways are unified through a cross-attention mechanism that enables bidirectional information flow between temporal representations and causal structures. We introduce a joint training objective that simultaneously optimizes sequence prediction accuracy and causal graph consistency, so that learned temporal patterns align with interpretable domain knowledge. The model is pre-trained on 3.2 million student–problem interactions from diverse MOOCs to establish foundational representations, then fine-tuned for each domain. Experiments across mathematics, computer science, and language learning show an 8.7% relative increase in AUC over state-of-the-art knowledge tracing models (0.847 vs. 0.779), a 12.3% reduction in RMSE for performance prediction, and 89.2% accuracy in discovering expert-validated prerequisite relationships. The model achieves an F1-score of 0.763 for early identification of at-risk students, outperforming baselines by 15.4%. This work demonstrates that sophisticated temporal modeling and interpretable causal reasoning can be effectively unified for educational applications.
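As a rough illustration of the joint training objective described above, the sketch below combines a prediction loss with a graph-consistency penalty. It is not the method of this work: the hinge-style penalty, the weight `lam`, the skill names, and all function names are hypothetical choices made only to show how two such terms could be summed into one objective.

```python
import math

def bce(p, y, eps=1e-12):
    """Binary cross-entropy for one predicted probability p and label y."""
    p = min(max(p, eps), 1 - eps)  # clamp to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def prerequisite_penalty(mastery, prereq_edges):
    """Hypothetical consistency term: penalize each (prerequisite, dependent)
    edge where the dependent skill's estimated mastery exceeds the
    prerequisite's estimated mastery."""
    return sum(max(0.0, mastery[child] - mastery[parent])
               for parent, child in prereq_edges)

def joint_loss(preds, labels, mastery, prereq_edges, lam=0.5):
    """Mean prediction loss plus lam times the graph-consistency penalty.
    lam is an assumed trade-off weight, not a value from the abstract."""
    pred_term = sum(bce(p, y) for p, y in zip(preds, labels)) / len(preds)
    return pred_term + lam * prerequisite_penalty(mastery, prereq_edges)

# Illustrative inputs only (not data from the experiments).
preds, labels = [0.9, 0.2], [1, 0]
edges = [("fractions", "ratios")]            # fractions is assumed prerequisite to ratios
consistent = {"fractions": 0.8, "ratios": 0.6}
violating = {"fractions": 0.3, "ratios": 0.6}

loss_consistent = joint_loss(preds, labels, consistent, edges)
loss_violating = joint_loss(preds, labels, violating, edges)
```

With identical predictions, the mastery estimate that violates the assumed prerequisite edge receives the larger loss, which is the behavior a consistency term of this kind is meant to encourage.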