Invoice processing is time-consuming and tedious, and it can be automated with optical character recognition (OCR). Tesseract is a popular open-source OCR engine that can extract text from scanned invoices. In this paper, we propose a method for invoice processing based on Tesseract OCR. The method has three stages: the invoice image is first pre-processed to remove noise and improve the quality of the text, the pre-processed image is then passed to Tesseract to extract the text, and the extracted text is finally parsed to recover the relevant invoice information. The results show that the method extracts the invoice information with high accuracy.
Deepa et al. studied this question.
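The three stages named in the abstract (pre-processing, OCR, parsing) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not say which pre-processing operations or parsing rules were used, so the median blur, the Otsu thresholding, the field names, and the regular expressions below are all assumptions chosen for illustration. The OCR step assumes the third-party `opencv-python` and `pytesseract` packages and a local Tesseract installation. The parsing step uses only the standard library.

```python
import re


def preprocess_and_ocr(image_path):
    """Denoise and binarize an invoice image, then run Tesseract on it.

    The specific operations (median blur, Otsu threshold) are assumptions;
    the paper only states that noise is removed and text quality improved.
    """
    # Imported lazily so the parsing step below works without these packages.
    import cv2
    import pytesseract

    image = cv2.imread(image_path)
    if image is None:
        raise FileNotFoundError(image_path)
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    denoised = cv2.medianBlur(gray, 3)  # suppress salt-and-pepper noise
    _, binary = cv2.threshold(
        denoised, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU
    )
    return pytesseract.image_to_string(binary)


# Hypothetical field patterns; real invoices would need layout-specific rules.
FIELD_PATTERNS = {
    "invoice_number": re.compile(
        r"\bInvoice\s*(?:Number|No\.?|#)\s*[:\-]?\s*([A-Z0-9\-]+)",
        re.IGNORECASE,
    ),
    "date": re.compile(
        r"\bDate\s*[:\-]?\s*(\d{1,2}[/\-.]\d{1,2}[/\-.]\d{2,4})",
        re.IGNORECASE,
    ),
    "total": re.compile(
        # \b keeps "Subtotal" from being mistaken for "Total".
        r"\bTotal\s*(?:Due|Amount)?\s*[:\-]?\s*[$€£]?\s*([\d,]+\.\d{2})",
        re.IGNORECASE,
    ),
}


def parse_invoice_text(text):
    """Return a dict of field name -> first matched value, or None if absent."""
    fields = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(text)
        fields[name] = match.group(1) if match else None
    return fields


# Made-up OCR output, used here in place of a real scanned invoice.
sample_text = (
    "ACME Corp\n"
    "Invoice No: INV-2041\n"
    "Date: 12/03/2023\n"
    "Subtotal: 100.00\n"
    "Total: $118.00\n"
)
print(parse_invoice_text(sample_text))
```

In practice the two functions would be chained as `parse_invoice_text(preprocess_and_ocr(path))`. Keeping the parser separate from the OCR call makes it possible to test the extraction rules on plain text, without images or a Tesseract installation.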