Multi-Criteria Financial Screening Under Data Uncertainty: An LLM-Extraction and Min–Max TOPSIS Approach for SMEs
Vinicius Minatogawa, Mitsuyoshi Fukushi, Jose Garcia, Jorge Rojas, Jose Gornall, Alfredo Angulo, Jefferson Pinto

Small and medium enterprises routinely face a paradox in financial monitoring: their accounting documents exist, but the cost of converting heterogeneous PDFs into timely financial signals is prohibitive without dedicated analytical staff or specialized software. This paper presents a two-layer artifact, designed under Design Science Research, that bridges this gap using only public-web large language models (LLMs) and a parsimonious multi-criteria decision routine. Layer 1 implements a structured LLM-driven workflow that extracts account–value pairs from annual tax balance sheets without code, APIs, or fine-tuning. Layer 2 reconstructs auditable accounting aggregates and ranks yearly financial condition through TOPSIS with min–max normalization. This deliberately replaces classical vector normalization, which fails when profitability indicators are negative, as routinely occurs in distress years. To avoid size effects and algebraic redundancy, the decision matrix uses only three criteria spanning liquidity, profitability, and solvency. The artifact is demonstrated in a four-year case study of an anonymized construction SME (2021–2024), with accountant-verified document-level match rates of 0.810, 0.998, 0.950, and 0.909. Equal weighting is the only weighting configuration used; a supplementary entropy-based dispersion diagnostic yields the same ordinal ranking (2024 > 2023 > 2021 > 2022), and 10,000 Monte Carlo replications, with uncertainty injected at the reconstructed-aggregate level, confirm that the extreme ranks are invariant across all runs. The contribution is methodological and practical: a transparent, low-infrastructure pipeline that brings first-pass financial screening within reach of SMEs operating under severe data and budget constraints.
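The Layer 2 ranking step can be illustrated with a minimal sketch of TOPSIS under min–max normalization and equal weights. This is not the authors' implementation. The function name `minmax_topsis`, the choice of indicators (current ratio, return on assets, debt ratio), the treatment of the debt ratio as a cost criterion, and all numeric values are assumptions made for illustration; the alternatives are labeled A to D so they are not mistaken for the case-study years.

```python
import math


def minmax_topsis(matrix, benefit, weights=None):
    """Return TOPSIS closeness scores (one per row) using min-max normalization.

    matrix  : list of rows, one row per alternative (e.g. one fiscal year)
    benefit : list of bools, True if higher is better for that criterion
    weights : optional list of weights; defaults to equal weighting
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    if weights is None:
        weights = [1.0 / n_cols] * n_cols  # equal weighting

    # Min-max scale every column to [0, 1]. Cost criteria are flipped so that
    # 1 is always the best value. Negative inputs need no special handling.
    scaled = [[0.0] * n_cols for _ in range(n_rows)]
    for j in range(n_cols):
        col = [row[j] for row in matrix]
        lo, hi = min(col), max(col)
        span = hi - lo
        for i in range(n_rows):
            if span == 0:
                value = 0.0  # a constant column carries no ranking information
            elif benefit[j]:
                value = (matrix[i][j] - lo) / span
            else:
                value = (hi - matrix[i][j]) / span
            scaled[i][j] = weights[j] * value

    # After flipping, the ideal point is the column-wise maximum and the
    # anti-ideal point is the column-wise minimum.
    ideal = [max(row[j] for row in scaled) for j in range(n_cols)]
    anti = [min(row[j] for row in scaled) for j in range(n_cols)]

    scores = []
    for row in scaled:
        d_plus = math.dist(row, ideal)   # Euclidean distance to the ideal
        d_minus = math.dist(row, anti)   # Euclidean distance to the anti-ideal
        total = d_plus + d_minus
        scores.append(d_minus / total if total > 0 else 0.5)
    return scores


# Hypothetical data, NOT taken from the case study.
# Columns: current ratio, return on assets, debt ratio.
labels = ["A", "B", "C", "D"]
data = [
    [1.2, 0.03, 0.60],
    [0.9, -0.08, 0.75],  # a distress-like row with negative profitability
    [1.5, 0.05, 0.55],
    [1.8, 0.09, 0.45],
]
benefit = [True, True, False]  # assumption: debt ratio is a cost criterion

scores = minmax_topsis(data, benefit)
ranking = [label for _, label in sorted(zip(scores, labels), reverse=True)]
print(ranking)
```

In this toy matrix, D is best on every criterion and B is worst on every criterion, so they receive closeness scores of 1 and 0 and occupy the extreme ranks. C dominates A on all three criteria and therefore ranks above it.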