Auditing public procurement processes is essential to ensure transparency in the use of government resources. However, identifying and validating information about these processes is challenging: the records may contain inconsistencies and require manual review, which is both time-consuming and error-prone. To address this issue, this paper examines the application of Named Entity Recognition (NER) techniques to automate data extraction from public procurement texts. We developed and evaluated several encoder- and decoder-based models to identify relevant entities in unstructured texts from the Official Gazette of the Court of Auditors of the Union. Our results show that RoBERTa outperformed the other models when combined with sampling and validation techniques, highlighting the effectiveness of NER in automating audits, reducing manual effort, and improving accuracy in government oversight.
Oliveira et al. studied this question.