Leveraging Large Language Models and Object Detection for Automated Knowledge Graph Generation from Industrial Schematics
Federico Lopomo, Valentina Faraco, Davide Marche, Saverio Ieva, Giuseppe Loseto, Davide Loconte, Floriano Scioscia, Michele Ruta

Industrial digitalization increasingly requires automated tools capable of extracting structured knowledge from complex engineering documentation, such as Piping and Instrumentation Diagrams (P&IDs). This work proposes an integrated framework that combines object detection and Large Language Models (LLMs) for automated Knowledge Graph (KG) generation. The approach enables the transformation of unstructured P&ID schematics into machine-interpretable representations, supporting data-driven analysis and decision-making. A modular pipeline is developed, including image pre-processing, symbol detection via a YOLO-based model, and identification of semantic relations between schematic elements using LLMs. The proposal also includes the definition of a reference ontology, which is exploited for the construction of the KG, and a diagram dataset designed to test the performance of the object detection model. The KG generation procedure achieves strong results in terms of image reconstruction across a wide set of industrial schematics, while also preserving the semantic integrity and completeness of the original diagrams. The proposed method represents a significant step toward the digitalization of industrial knowledge, bridging traditional engineering documentation and semantic-based technologies.
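
As an illustration of the final stage of the pipeline described above, the following minimal sketch shows how detected symbols and LLM-proposed relations might be merged into KG triples. It is not the authors' implementation: the `Detection` and `Relation` types, the `build_triples` function, the `ex:` namespace prefix, and the `connectedTo` predicate are all hypothetical names introduced here, and the detector and LLM outputs are represented by hand-written stand-in values rather than real model calls.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Detection:
    """One detected symbol (hypothetical stand-in for an object detector output)."""
    symbol_id: str                            # identifier assigned to the symbol
    symbol_class: str                         # class label, assumed to match an ontology class
    bbox: Tuple[float, float, float, float]   # x_min, y_min, x_max, y_max in pixels


@dataclass(frozen=True)
class Relation:
    """One semantic relation (hypothetical stand-in for a parsed LLM answer)."""
    source_id: str
    predicate: str
    target_id: str


Triple = Tuple[str, str, str]


def build_triples(detections: List[Detection],
                  relations: List[Relation],
                  namespace: str = "ex:") -> List[Triple]:
    """Combine detections and relations into subject-predicate-object triples.

    Assumption: relations that mention a symbol the detector did not find
    are dropped, so every edge in the graph connects two typed nodes.
    """
    known_ids = {d.symbol_id for d in detections}
    triples: List[Triple] = []

    # One typing triple per detected symbol.
    for d in detections:
        triples.append((namespace + d.symbol_id, "rdf:type", namespace + d.symbol_class))

    # One triple per relation whose endpoints were both detected.
    for r in relations:
        if r.source_id in known_ids and r.target_id in known_ids:
            triples.append((namespace + r.source_id,
                            namespace + r.predicate,
                            namespace + r.target_id))
    return triples


if __name__ == "__main__":
    detections = [
        Detection("P101", "Pump", (10.0, 20.0, 60.0, 70.0)),
        Detection("V201", "Valve", (120.0, 25.0, 150.0, 55.0)),
    ]
    relations = [
        Relation("P101", "connectedTo", "V201"),
        Relation("P101", "connectedTo", "X999"),  # endpoint was never detected
    ]
    for triple in build_triples(detections, relations):
        print(triple)
```

In this sketch, filtering relations against the set of detected symbols is one possible way to keep the graph consistent with the detector output; a real system could instead flag such relations for review.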