Abstract The objective is to develop an AI-powered system that accurately extracts information from mechanical equipment specification documents that follow the API 610 standard. The system uses large language models (LLMs), optical character recognition (OCR), agentic workflows, and knowledge graphs to automate the time- and resource-intensive manual process of specification digitization. Equipment datasheets, although standardized, vary significantly by vendor; API 610 forms in particular contain complex structures with key-value pairs, checkboxes, and radio buttons. Traditional OCR methods fail to generalize across these variations, producing error rates that are unacceptable for SAP integration and field QA/QC processes. A more sophisticated approach is required to ensure accuracy while adapting to different vendor formats. The system integrates a custom layout detection model, fine-tuned on these documents, to identify and mask irrelevant information. Extracted data is chunked and processed by LLMs to identify subject-predicate-object relationships. These elements form the nodes and relationships of a knowledge graph, which is queried alongside contextual information to extract specific insights based on engineering requirements and industry standards. Custom model development enables adaptation to new specification document types from a small number of examples (20 labeled samples), significantly reducing the time needed to extend the application to business units that may use different vendors. Introducing an LLM-based question-answering system grounded in the marked-up specification document improves extraction accuracy to 94.2%, compared with 62.3% for traditional OCR techniques. This saved an estimated 10,000+ manual labor hours previously spent on datasheet extraction and validation.
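The step in which subject-predicate-object relationships become nodes and relationships of a queryable knowledge graph can be illustrated with a minimal sketch. This is not the authors' implementation: the `Triple` type, the `build_graph` and `query` helpers, and the sample pump values are hypothetical, and a hard-coded list of triples stands in for the LLM output.

```python
from collections import defaultdict
from typing import NamedTuple


class Triple(NamedTuple):
    """One subject-predicate-object relationship (hypothetical type)."""
    subject: str
    predicate: str
    obj: str


def build_graph(triples):
    """Index triples as subject -> predicate -> list of objects."""
    graph = defaultdict(lambda: defaultdict(list))
    for t in triples:
        graph[t.subject][t.predicate].append(t.obj)
    return graph


def query(graph, subject, predicate):
    """Return all objects linked to `subject` by `predicate` (empty list if none)."""
    # .get() avoids creating empty entries for unknown subjects.
    return list(graph.get(subject, {}).get(predicate, []))


# Made-up triples standing in for LLM output on one datasheet chunk.
triples = [
    Triple("Pump P-101", "has_rated_flow", "120 m3/h"),
    Triple("Pump P-101", "has_seal_type", "mechanical"),
    Triple("Pump P-101", "has_driver", "Motor M-101"),
    Triple("Motor M-101", "has_rated_power", "75 kW"),
]

graph = build_graph(triples)
print(query(graph, "Pump P-101", "has_rated_flow"))
```

A production system would more likely store the triples in a graph database and query them together with retrieved context; the nested dictionary here only shows the shape of the data.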
The system architecture features two complementary components: an extraction agent that processes forms and builds knowledge graphs, and an intelligent decision agent that evaluates the extracted data against manuals and engineering specifications. This multi-agent approach enables both accurate information extraction and automated compliance verification. Testing across five distinct form formats (20 forms each) confirmed robust performance despite significant structural variations. After minimal retraining, the system also adapted successfully to instrumentation and control equipment documentation with completely different formats. The knowledge graph generated as a by-product serves as a valuable input for multiple downstream processes, creating a unified data foundation for maintenance operations. This cross-format robustness is achieved by fine-tuning a LayoutLMv3-based, layout-aware OCR head on a small, curated set of vendor examples (∼20 pages per layout family), which enables robust detection of tables, key/value panels, and form elements (checkboxes and radio buttons) and maps tokens to schema fields before LLM parsing. This innovation addresses a critical industry challenge: extracting structured information from variable-format maintenance documents that contain complex elements such as radio buttons, checkboxes, text fields, and multi-column, multi-row tables. The system's 94.2% accuracy, adaptability, and integration capabilities transform previously manual, error-prone processes into automated, reliable workflows. Beyond the immediate extraction benefits, the knowledge graph architecture enables integration with broader maintenance systems, supporting predictive maintenance, regulatory compliance, and intelligent decision-making. This represents a significant advancement in maintenance digitalization, providing a scalable foundation for AI-assisted maintenance operations while preserving critical engineering knowledge.
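The decision agent's role of evaluating extracted data against engineering specifications can be sketched as a simple rule check. This is an illustration under stated assumptions, not the authors' method: the `Rule` type, the `evaluate` function, the field names, and the three rules are all hypothetical and are not taken from API 610.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    """A single hypothetical compliance rule on one extracted field."""
    field_name: str
    description: str
    check: Callable[[object], bool]


def evaluate(extracted: dict, rules: list) -> list:
    """Return (field, status, description) for every missing or failing field."""
    findings = []
    for rule in rules:
        if rule.field_name not in extracted:
            findings.append((rule.field_name, "missing", rule.description))
        elif not rule.check(extracted[rule.field_name]):
            findings.append((rule.field_name, "fail", rule.description))
    return findings


def is_positive_number(value) -> bool:
    return isinstance(value, (int, float)) and value > 0


# Illustrative rules only; real rules would come from manuals and specifications.
rules = [
    Rule("rated_flow_m3h", "rated flow must be a positive number", is_positive_number),
    Rule("seal_type", "seal type must be a recognized option",
         lambda v: v in {"mechanical", "packing"}),
    Rule("npsh_required_m", "required NPSH must be a positive number", is_positive_number),
]

# Made-up extraction result for one datasheet.
extracted = {"rated_flow_m3h": 120.0, "seal_type": "unknown"}

findings = evaluate(extracted, rules)
for field_name, status, description in findings:
    print(f"{field_name}: {status} ({description})")
```

Separating the rules from the evaluation loop keeps the check auditable: each finding names the field and the requirement it violated, which suits QA/QC review.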
Yogidakshan et al. (Mon,) studied this question.