Frontiers of Data & Computing ›› 2023, Vol. 5 ›› Issue (1): 28-40.

CSTR: 32002.14.jfdc.CN10-1649/TP.2023.01.003

doi: 10.11871/jfdc.issn.2096-742X.2023.01.003

• Special Issue: Joint Special Issue on Scientific Data Resources, Technology, and Policy •

Research and Applications of Semantic Association for Scientific Data

LIU Feng1, HAN Fang1,*, WEI Tianke1, CHEN Kun1, ZHAO Yuehong2, WU Hui3, FAN Guomei4

  1. Computer Network Information Center, Chinese Academy of Sciences, Beijing 100083, China
  2. Institute of Process Engineering, Chinese Academy of Sciences, Beijing 100190, China
  3. Institute of Botany, Chinese Academy of Sciences, Beijing 100093, China
  4. Institute of Microbiology, Chinese Academy of Sciences, Beijing 100101, China
  • Received: 2023-01-20 Online: 2023-02-20 Published: 2023-02-20
  • Contact: HAN Fang
  • About the authors:
    LIU Feng, Ph.D., is a project researcher at the Computer Network Information Center, Chinese Academy of Sciences. He has long been engaged in research on scientific data management and sharing service technologies and in platform construction. His main research interests are data fusion management and semantic association technology.
    In this paper, he is responsible for organizing the article framework and revising key content.
    E-mail: liufeng@cnic.cn
    HAN Fang, M.S., is an engineer at the Computer Network Information Center, Chinese Academy of Sciences. She has long been engaged in the management and aggregation of scientific data resources, data sharing and publishing, semantic association technology research, and domain application services. Her main research interests are data fusion management and semantic association technology.
    In this paper, she is responsible for overall editing of the manuscript, writing Sections 2 and 4, revising Section 3, and designing the prototype of the key technical framework.
    E-mail: hanfang@cnic.cn
  • Supported by:
    National Key R&D Program of China, "Basic Software Stack and Systems for National Science Data Centers" (2021YFF0704200); Chinese Academy of Sciences 14th Five-Year Plan Informatization Special Project, "Scientific Big Data Engineering (Phase III)" (CAS-WX2022GC-02)

Abstract:

[Objective] Under the new paradigm of data-intensive, integrated scientific research, traditional data-sharing services urgently need to evolve into knowledge-based data services. Using semantic association technology to organize, associate, and discover knowledge across massive scientific data is the core path to this goal. [Methods] This paper surveys the overall state of data association technology in China and abroad and the progress of its application in specific domains. It then studies in depth three key technologies: structured data association and publishing, long-text semantic mining, and data association and fusion services. On this basis, associated organization and publishing of domain scientific data and semantic fusion services have been preliminarily implemented. [Results] Application in chemistry, plant, and microbiology data centers verifies that semantic association and fusion of scientific data is a feasible and important means of realizing knowledge-based data services. [Conclusions] In the future, the scientific data association and fusion network built by data centers across domains will become an important data infrastructure serving the needs of the new research paradigm.

Key words: scientific data, data association, semantic association, fusion service