Frontiers of Data & Computing, 2024, Vol. 6, Issue (4): 34-45.

CSTR: 32002.14.jfdc.CN10-1649/TP.2024.04.003

doi: 10.11871/jfdc.issn.2096-742X.2024.04.003

• Special Issue: Fundamental Software Stack and Systems for National Scientific Data Centers •

Design and Implementation of Cross-Endpoint Association Path Retrieval Technique in RDF Data

LIU Feng, HAN Fang*, XIA Jinglong, CHEN Kun, WEI Tianke, GAO Shuai

  1. Computer Network Information Center, Chinese Academy of Sciences, Beijing 100083, China
  • Received: 2024-03-27 Online: 2024-08-20 Published: 2024-08-20
  • Corresponding author: *HAN Fang (E-mail: hanfang@cnic.cn)
  • About the authors:
    LIU Feng, PhD, is a project researcher at the Computer Network Information Center of the Chinese Academy of Sciences. He has long been engaged in research on scientific data management and sharing service technology and in platform construction. His main research interests are data fusion management and semantic association technology.
    In this paper, he is responsible for organizing the article framework and revising key content.
    E-mail: liufeng@cnic.cn
    HAN Fang, holding a master's degree, is an engineer at the Computer Network Information Center of the Chinese Academy of Sciences. She has long been engaged in the management, aggregation, sharing, and publication of scientific data resources, in semantic association technology research, and in application services for scientific disciplines. Her main research interests are data fusion management and semantic association technology.
    In this paper, she is responsible for the overall draft, writing the first and fourth chapters, revising the third chapter, and designing the prototype of the key technology framework.
    E-mail: hanfang@cnic.cn
  • Funding:
    National Key R&D Program of China, special program "Intergovernmental International Science and Technology Innovation Cooperation" (2021YFE0117000); National Key R&D Program of China, "Fundamental Software Stack and Systems for National Scientific Data Centers" (2021YFF0704200); Chinese Academy of Sciences 14th Five-Year Plan informatization special project "Scientific Big Data Engineering (Phase III)" (CAS-WX2022GC-02)

Abstract:

[Objective] Cross-endpoint association path retrieval is an important means of discovering associations among scientific data in large-scale distributed scenarios. Achieving both efficiency and accuracy for multi-endpoint, multi-hop queries remains a key technical challenge, and solutions to it have broad application prospects. [Methods] This paper proposes a cross-endpoint association path retrieval technique driven by RDF class relationships. The technique first builds the RDF class association relationships of the distributed endpoints, then maps cross-endpoint association retrieval over data entities to association retrieval over RDF classes. The class-level associations in turn guide the dynamic construction of SPARQL federated query statements, which retrieve the linked data across endpoints. [Results] Testing shows that the technique effectively improves the efficiency and quality of cross-endpoint RDF association path retrieval, and that it supports dynamic queries over multiple data source endpoints, in any association direction, and across multiple hops. [Conclusions] The cross-endpoint association path retrieval technique driven by RDF class relationships offers an efficient and accurate solution for federated data querying in distributed environments, and it is expected to play an important role in complex network environments and big data applications.

Key words: RDF, scientific linked data, semantic association discovery, association path retrieval, multi-hop query, cross-endpoint
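
As an illustration of the idea summarized in the Methods part of the abstract, the following is a minimal Python sketch of how class-level associations might guide the construction of a SPARQL federated query. It is not the authors' implementation. The class names, predicates, `ex:` prefix, endpoint URLs, and the functions `find_class_path` and `build_federated_query` are all hypothetical, and the sketch assumes that each association triple is stored at the endpoint that hosts its subject class.

```python
from collections import deque

# Hypothetical class-level association graph: (subject class, predicate, object class).
CLASS_LINKS = [
    ("ex:Dataset", "ex:producedBy", "ex:Project"),
    ("ex:Project", "ex:hasMember", "ex:Researcher"),
    ("ex:Paper", "ex:author", "ex:Researcher"),
]

# Hypothetical mapping from a subject class to the endpoint assumed to hold its triples.
SUBJECT_ENDPOINT = {
    "ex:Dataset": "http://node-a.example.org/sparql",
    "ex:Project": "http://node-b.example.org/sparql",
    "ex:Paper": "http://node-c.example.org/sparql",
}


def find_class_path(start, goal, max_hops=4):
    """Breadth-first search over the class graph, following links in either direction.

    Returns a list of (subject class, predicate, object class, forward) steps,
    or None if no path of at most max_hops links exists.
    """
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        cls, path = queue.popleft()
        if cls == goal:
            return path
        if len(path) >= max_hops:
            continue
        for subj, pred, obj in CLASS_LINKS:
            if subj == cls:
                nxt, forward = obj, True
            elif obj == cls:
                nxt, forward = subj, False
            else:
                continue
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [(subj, pred, obj, forward)]))
    return None


def build_federated_query(start, goal, max_hops=4):
    """Turn a class-level path into a SPARQL 1.1 query with one SERVICE clause per hop."""
    path = find_class_path(start, goal, max_hops)
    if path is None:
        return None
    lines = ["PREFIX ex: <http://example.org/schema#>", "SELECT * WHERE {"]
    for i, (subj, pred, _obj, forward) in enumerate(path):
        # Variable ?n{i} is the entity reached after i hops; a reversed link swaps the roles.
        s, o = (f"?n{i}", f"?n{i + 1}") if forward else (f"?n{i + 1}", f"?n{i}")
        lines.append(f"  SERVICE <{SUBJECT_ENDPOINT[subj]}> {{ {s} {pred} {o} . }}")
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_federated_query("ex:Dataset", "ex:Paper"))
```

Searching at the class level first keeps the search space small: the instance-level query is only generated for class paths that can actually connect the two ends, and each hop is sent only to the endpoint assumed to hold the relevant triples.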