ExTTNet: A Deep Learning Algorithm for Extracting Table Texts from Invoice Images
Adem Akdoğan, Murat Kurt

Product tables in invoices are obtained autonomously via a deep learning model named ExTTNet. Text information is first extracted from invoice images using Optical Character Recognition (OCR) techniques; both the Tesseract OCR engine and PaddleOCR were evaluated to determine the most effective method. Based on comparative analysis, PaddleOCR was selected due to its superior runtime performance, particularly with GPU acceleration, and its deep learning-based feature extraction capabilities. Each OCR token is labelled as a table element or a non-table element, and a multilayer artificial neural network is trained on the enriched feature set. Training was carried out on an Nvidia RTX 3090 graphics card in 62 min. The trained model achieves a macro-averaged F1 score of 0.91 and a class-1 F1 score of 0.90 on a private German invoice dataset, and a macro-averaged F1 score of 0.997 on the public FATURA benchmark.
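
The two reported metrics differ in what they average over, and a small sketch may make the distinction concrete. The code below is an illustration only, not the authors' evaluation code: the function names `f1_for_class` and `macro_f1` are hypothetical, the toy labels are invented for the example, and it assumes that class 1 denotes a table element and class 0 a non-table element.

```python
from typing import Sequence


def f1_for_class(y_true: Sequence[int], y_pred: Sequence[int], cls: int) -> float:
    """F1 score for a single class, treating `cls` as the positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == cls and p == cls)  # correctly predicted cls
    fp = sum(1 for t, p in pairs if t != cls and p == cls)  # wrongly predicted cls
    fn = sum(1 for t, p in pairs if t == cls and p != cls)  # missed cls
    denom = 2 * tp + fp + fn
    # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the class never occurs.
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true: Sequence[int], y_pred: Sequence[int],
             classes: Sequence[int] = (0, 1)) -> float:
    """Unweighted mean of the per-class F1 scores."""
    return sum(f1_for_class(y_true, y_pred, c) for c in classes) / len(classes)


# Toy token labels (invented): 1 = table element, 0 = non-table element (assumed).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]

class1_f1 = f1_for_class(y_true, y_pred, cls=1)
macro = macro_f1(y_true, y_pred)
print(f"class-1 F1: {class1_f1:.3f}, macro F1: {macro:.3f}")
```

Because the macro average weights both classes equally regardless of how many tokens each contains, it can sit above or below the class-1 score; reporting both, as the abstract does, shows how well the table tokens specifically are recovered.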