ExTTNet: A Deep Learning Algorithm for Extracting Table Texts from Invoice Images

DOI: 10.3390/math14132258 ISSN: 2227-7390

Adem Akdoğan, Murat Kurt

Product tables in invoices are obtained autonomously via a deep learning model named ExTTNet. Text information is first extracted from invoice images using Optical Character Recognition (OCR) techniques; both the Tesseract OCR engine and PaddleOCR were evaluated to determine the most effective method. Based on comparative analysis, PaddleOCR was selected due to its superior runtime performance, particularly with GPU acceleration, and its deep learning-based feature extraction capabilities. Each OCR token is labelled as a table element or a non-table element, and a multilayer artificial neural network is trained on the enriched feature set. Training was carried out on an Nvidia RTX 3090 graphics card in 62 min. The trained model achieves a macro-averaged F1 score of 0.91 and a class-1 F1 score of 0.90 on a private German invoice dataset, and a macro-averaged F1 score of 0.997 on the public FATURA benchmark.
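
The abstract reports both a macro-averaged F1 score and a class-1 F1 score. As a minimal sketch of how these two metrics differ for a binary token classifier, the following computes both from scratch. The toy labels are invented for illustration and are not from the paper. Treating class 1 as "table element" is an assumption, since the abstract does not define the class indices, and the function names are hypothetical.

```python
from typing import Sequence


def f1_for_class(y_true: Sequence[int], y_pred: Sequence[int], cls: int) -> float:
    """F1 score for a single class, treating `cls` as the positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == cls and p == cls)
    fp = sum(1 for t, p in pairs if t != cls and p == cls)
    fn = sum(1 for t, p in pairs if t == cls and p != cls)
    denom = 2 * tp + fp + fn
    # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the class never occurs.
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true: Sequence[int], y_pred: Sequence[int],
             classes: Sequence[int] = (0, 1)) -> float:
    """Unweighted mean of the per-class F1 scores."""
    return sum(f1_for_class(y_true, y_pred, c) for c in classes) / len(classes)


# Toy example (invented data): 1 = table token (assumed), 0 = non-table token.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]

class1_f1 = f1_for_class(y_true, y_pred, cls=1)
macro = macro_f1(y_true, y_pred)
print(f"class-1 F1: {class1_f1:.3f}, macro F1: {macro:.3f}")
```

Because the macro average weights every class equally, it can differ noticeably from the class-1 score when one class is much rarer than the other, which is presumably why the abstract reports both figures.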
