DOI: 10.1017/pds.2026.10590 ISSN: 2732-527X

Entity matching for recurring engineering components: a bottom-up enabler for reference architecture reconstruction

Fabian Hanke, Markus Feld, Oliver von Heißen, Kenneth Tagscherer, Fabian Wyrwich, Malte Trienens, Aschot Hovemann, Roman Dumitrescu

ABSTRACT:

Engineering organisations increasingly aim to reuse historical bill of materials (BOM), CAD, and requirements data to identify recurring components. A key prerequisite is Entity Matching (EM), whose performance on heterogeneous engineering data is unclear. This paper evaluates classical models, zero-shot LLMs, and hybrid EM approaches on the Amazon–Google dataset and a multimodal engineering dataset. Random Forest and XGBoost achieve near state-of-the-art results; LLMs perform well but are costly, and hybrid approaches add little. EM transfers to engineering data under controlled conditions and forms a foundation for reference architecture reconstruction.
