February 14, 2026 | Open Access

Optimised Data Integration using Transformer Model and Resource Description Framework

Abstract

Organizations rely heavily on data sources that span structured, semi-structured, and unstructured data types. These repositories support large-scale storage for faster ingestion and analytics, but they pose substantial integration challenges owing to schema and contextual differences. Traditional data integration methods, such as those based on ontologies and the Resource Description Framework (RDF), are often inadequate for these challenges. They struggle in particular with the dynamic evolution of source schemas, context-aware interpretation, and interoperability across heterogeneous data sources. This paper presents an integrated system that addresses these weaknesses by augmenting RDF knowledge with token embeddings produced by the attention mechanism of a transformer model with relative positional encoding. Data from unstructured sources are used to create embeddings, whereas structured data are mapped into RDF. The embeddings are then integrated into the RDF using a hasEmbedding property. Virtual transformations handle schema alignment, and cosine similarity is used to merge similar entities into a unified data view. The model thus explicitly integrates contextual knowledge within RDF triples, improving the semantic representation. The proposed system uses the SPARQL Protocol and RDF Query Language (SPARQL) to query the RDF knowledge efficiently, enhancing interoperability across domains. The proposed model attains a schema mapping accuracy of 97.82%, enabling more accurate and meaningful linking of heterogeneous datasets.
Empirical trials on use cases in human activity analysis and flood risk management demonstrate the system's robustness, scalability, and effectiveness for knowledge discovery, while supporting cross-domain integration of heterogeneous data types in complex scenarios. The results show that incorporating embeddings into RDF reduces dependence on strict, predefined ontologies, simplifies on-demand schema alignment, and allows unified querying without first curating the integrated data into a traditional data warehouse.
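
The core idea of attaching embeddings to RDF entities and merging entities by cosine similarity can be illustrated with a minimal sketch. This is not the authors' implementation. The namespace `http://example.org/`, the toy three-dimensional vectors, the 0.9 threshold, and the use of `owl:sameAs` to record a merge are all assumptions made for illustration. A real system would take its embeddings from a transformer model and hold the triples in an RDF store queried with SPARQL.

```python
import math

# Hypothetical namespace and predicate, for illustration only.
EX = "http://example.org/"
HAS_EMBEDDING = EX + "hasEmbedding"
# Assumption: a merge is recorded as an owl:sameAs link between entities.
SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def attach_embeddings(triples, embeddings):
    """Return the triples plus one hasEmbedding triple per embedded entity."""
    extra = [(entity, HAS_EMBEDDING, tuple(vec)) for entity, vec in embeddings.items()]
    return list(triples) + extra


def link_similar_entities(triples, threshold=0.9):
    """Add a sameAs triple for each entity pair whose embeddings are close."""
    embedded = [(s, o) for s, p, o in triples if p == HAS_EMBEDDING]
    links = []
    for i in range(len(embedded)):
        for j in range(i + 1, len(embedded)):
            (s1, v1), (s2, v2) = embedded[i], embedded[j]
            if cosine_similarity(v1, v2) >= threshold:
                links.append((s1, SAME_AS, s2))
    return list(triples) + links


# Toy data: two sources that name the same gauge differently, plus an
# unrelated entity from another domain. The vectors are made up.
structured = [
    (EX + "sensorA", EX + "locatedIn", EX + "riverBasin1"),
    (EX + "gaugeA", EX + "measures", EX + "waterLevel"),
]
toy_embeddings = {
    EX + "sensorA": [0.9, 0.1, 0.3],
    EX + "gaugeA": [0.88, 0.12, 0.31],
    EX + "walkingActivity": [0.1, 0.9, 0.2],
}

graph = link_similar_entities(attach_embeddings(structured, toy_embeddings))
merged = [(s, o) for s, p, o in graph if p == SAME_AS]
print(merged)
```

With these toy vectors, only `sensorA` and `gaugeA` exceed the threshold, so they are the only pair linked. The pairwise comparison is quadratic in the number of embedded entities, so a larger graph would likely need an approximate nearest-neighbour index instead.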
