Frontiers of Data and Computing ›› 2025, Vol. 7 ›› Issue (4): 67-78.

CSTR: 32002.14.jfdc.CN10-1649/TP.2025.04.006

doi: 10.11871/jfdc.issn.2096-742X.2025.04.006

• Special Issue: Artificially Intelligent Models and Tools for Space Science Big Data

Research on Construction of a Semantic Association-Driven Space Science Data Repository System and Dataset Association Recommendation

WU Zhaochen, LU Changfa*, LI Gang, LAN Chenyang, WANG Cifeng

  1. Computer Network Information Center, Chinese Academy of Sciences, Beijing 100083, China
  • Received: 2025-04-29 Online: 2025-08-20 Published: 2025-08-21
  • Contact: LU Changfa E-mail: zcwu@cnic.cn; luchangfa@cnic.cn

Abstract:

[Background] Multimodal data in space science is growing exponentially, and existing data management systems struggle to keep pace. Traditional architectures do not capture semantic associations between data, which severely limits the efficiency of interdisciplinary knowledge discovery. [Objective] This study aims to construct a semantically enhanced space science data repository system that mines metadata semantics and their associations across multi-source data, in order to break down disciplinary barriers and strengthen association analysis. [Methods] We construct a metadata semantics network for space science data through a progressive three-tier architecture (conceptual, logical, and physical). Using a non-intrusive data integration approach, we develop key components including archival repository interface services, unified external service APIs, graph database management systems, and graph query engines, thereby establishing the space science data repository system without modifying existing business architectures. We further design a metadata-driven semantic similarity algorithm to quantify the association strength between datasets, and validate it through related-dataset recommendation experiments. [Conclusions] Experiments show that the proposed method improves knowledge discovery efficiency in space science, offers a new approach to multimodal data fusion, and enhances the ability to analyze complex scientific data.

Key words: space science metadata semantics network, space science data repository system, attribute graph model, semantic similarity computation, associated datasets recommendation