As a growing number of institutions enter the global education market, verifying the authenticity of academic documents and matching them between universities has become increasingly challenging. This paper introduces a deep blockchain-based system to verify, transfer, and match certificates, transcripts, and study plans. ARABBERTV2, a pre-trained large language model (LLM), is used to extract high-quality semantic representations from academic transcripts, which are then passed through dimensionality reduction and classification stages to detect equivalences and mismatches. To enhance privacy and enable collaboration without sharing raw transcripts, Federated Learning (FL) is used to fine-tune a shared model locally under blockchain-coordinated aggregation. The evaluation was conducted on 661 study plans collected from Saudi universities, of which 629 processed PDF documents were used for training and testing. After dimensionality reduction to 41 principal components, the proposed model achieved 98.13% classification accuracy and a Kappa statistic of 0.9784. Integrating Federated Learning further improved performance, increasing accuracy from 93.5% (baseline) to 95.6% and AUC-ROC from 0.947 to 0.972, while reducing inter-university performance variance. The findings demonstrate the efficiency of the proposed model and its importance for building such a public framework in academic environments. They also show that the primary study plans of Saudi universities differ from one another. Finally, the results indicate that the deep LLM features contribute to well-informed classification decisions.
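To make two of the quantities above concrete, the following is a minimal sketch, not the authors' implementation. It assumes the aggregation step is a sample-size-weighted average of locally fine-tuned parameters (FedAvg-style); the abstract does not state the actual aggregation rule, and the blockchain coordination layer is omitted entirely. It also shows how a Kappa statistic of the kind reported can be computed from true and predicted labels. The function names `federated_average` and `cohen_kappa` are hypothetical and chosen only for illustration.

```python
from typing import Hashable, List, Sequence


def federated_average(client_weights: List[List[float]],
                      client_sizes: List[int]) -> List[float]:
    """Sample-size-weighted mean of client parameter vectors.

    ASSUMPTION: FedAvg-style aggregation; the source does not specify
    the rule used under blockchain coordination.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    dim = len(client_weights[0])
    if any(len(w) != dim for w in client_weights):
        raise ValueError("all clients must share the same parameter shape")

    averaged = [0.0] * dim
    for weights, size in zip(client_weights, client_sizes):
        share = size / total  # this client's fraction of all samples
        for i, value in enumerate(weights):
            averaged[i] += share * value
    return averaged


def cohen_kappa(y_true: Sequence[Hashable],
                y_pred: Sequence[Hashable]) -> float:
    """Cohen's kappa: agreement between labels, corrected for chance."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("label sequences must be non-empty and equal length")
    n = len(y_true)
    labels = set(y_true) | set(y_pred)
    # Observed agreement: fraction of items where the labels match.
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # Expected agreement if both labelings were independent.
    expected = sum((list(y_true).count(l) / n) * (list(y_pred).count(l) / n)
                   for l in labels)
    if expected == 1.0:  # only one label used everywhere
        return 1.0
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    # Two hypothetical universities holding 1 and 3 local samples.
    print(federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3]))
    print(cohen_kappa([0, 0, 1, 1], [0, 0, 1, 0]))
```

In this sketch only parameter vectors and sample counts leave each client, which is the property that lets universities collaborate without exchanging raw transcripts.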
Alghamdi et al. studied this question.