What question did this study set out to answer?

The central aim is to evaluate and compare transformer-based NLP models for detecting corruption risk indicators in procurement texts.

April 1, 2026 | Open Access

AI-Driven Corruption Risk Indicator Detection: A Comparative Evaluation of Transformer-Based NLP Models in Unstructured Procurement Data

Key Points

Constructed a unified dataset linking unstructured documentation with structured procurement outcomes.
Evaluated three transformer architectures: BERT-base-uncased, RoBERTa-base, and DeBERTa-v3-base.
Used metrics such as precision, recall, F1-score, and ROC-AUC for performance evaluation.
Conducted explainability analysis utilizing Integrated Gradients.
Demonstrated a clear progression in performance among the transformer models evaluated.
Highlighted comparative strengths of BERT, RoBERTa, and DeBERTa for the task.
Results point to the potential of contextual transformer models to support scalable, transparent, and operational anti-corruption monitoring.
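
The evaluation metrics named in the key points (precision, recall, F1-score, and ROC-AUC) can be illustrated with a minimal, dependency-free sketch. It assumes a binary "risk / no risk" labeling and a 0.5 decision threshold; the labels, scores, threshold, and function names below are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple


def precision_recall_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float, float]:
    """Precision, recall and F1 for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """ROC-AUC as the probability that a random positive example
    receives a higher score than a random negative one (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = document flagged as risky by the outcome-driven label.
y_true: List[int] = [1, 0, 1, 1, 0, 0]
scores: List[float] = [0.9, 0.4, 0.6, 0.3, 0.2, 0.7]
y_pred: List[int] = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
auc = roc_auc(y_true, scores)
```

Precision, recall, and F1 depend on the chosen threshold, whereas ROC-AUC is computed from the raw scores. This is one reason a study might report both kinds of metric side by side.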

Abstract

Detecting corruption-related indicators in unstructured textual procurement data remains a complex task because of linguistic ambiguity, contextual variation and domain-specific terminology. This study presents a comparative evaluation of three transformer-based Natural Language Processing (NLP) architectures (BERT-base-uncased, RoBERTa-base and DeBERTa-v3-base) for automated detection of corruption risk indicators in procurement texts drawn from heterogeneous sources. A unified dataset is constructed by linking unstructured technical documentation with structured procurement outcomes, enabling an outcome-driven risk labeling strategy. Performance is evaluated using precision, recall, F1-score and ROC-AUC, complemented by an explainability analysis based on Integrated Gradients. The results demonstrate a clear performance progression and highlight the comparative strengths of the evaluated architectures. Overall, the study points to the potential of contextual transformer models to support scalable, transparent and operational anti-corruption monitoring systems.
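
To make the explainability step mentioned in the abstract more concrete, here is a minimal sketch of Integrated Gradients on a toy logistic model. It is not the paper's setup: the study applies the method to transformer classifiers, while this example assumes a hand-written three-feature model with an analytic gradient, an all-zeros baseline, and a midpoint Riemann sum. All weights, inputs, and function names are hypothetical.

```python
import math
from typing import List, Sequence


def model(x: Sequence[float], w: Sequence[float], b: float) -> float:
    """Toy risk scorer: sigmoid of a linear combination of features."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))


def model_grad(x: Sequence[float], w: Sequence[float], b: float) -> List[float]:
    """Analytic gradient of the toy model with respect to its inputs."""
    p = model(x, w, b)
    return [p * (1.0 - p) * wi for wi in w]


def integrated_gradients(
    x: Sequence[float],
    baseline: Sequence[float],
    w: Sequence[float],
    b: float,
    steps: int = 200,
) -> List[float]:
    """Approximate IG_i = (x_i - x'_i) * integral over alpha in [0, 1] of
    dF/dx_i evaluated at x' + alpha * (x - x'), using a midpoint Riemann sum."""
    grad_sums = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps
        point = [bi + alpha * (xi - bi) for xi, bi in zip(x, baseline)]
        for i, g in enumerate(model_grad(point, w, b)):
            grad_sums[i] += g
    return [(xi - bi) * s / steps for xi, bi, s in zip(x, baseline, grad_sums)]


# Hypothetical feature vector (e.g. counts of three risk-related terms).
w = [1.5, -0.8, 0.3]
b = -0.2
x = [2.0, 1.0, 0.5]
baseline = [0.0, 0.0, 0.0]  # assumed "no signal" reference input

attributions = integrated_gradients(x, baseline, w, b)
# Completeness check: attributions should sum to F(x) - F(baseline).
gap = abs(sum(attributions) - (model(x, w, b) - model(baseline, w, b)))
```

The completeness property checked at the end (attributions summing to the change in model output between the baseline and the input) is what makes the per-feature scores interpretable as shares of the prediction. In a transformer setting the same integral is typically taken over token embeddings rather than raw features.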
