Frontiers of Data & Computing (数据与计算发展前沿), 2025, Vol. 7, Issue (2): 40-48.

CSTR: 32002.14.jfdc.CN10-1649/TP.2025.02.005

doi: 10.11871/jfdc.issn.2096-742X.2025.02.005

• Special Issue: 10th Anniversary of China Science and Technology Cloud

Design and Practice of a Novel Computing System for Big Earth Data

LU Shasha, NIU Tie*, WU Can, KANG Le, XIAO Haili

  1. Computer Network Information Center, Chinese Academy of Sciences, Beijing 100083, China
  • Received: 2025-03-12 Online: 2025-04-20 Published: 2025-04-23
  • Corresponding author: NIU Tie
  • About the authors:
    LU Shasha is a senior engineer in the Department of High Performance Computing Technology & Application Development, Computer Network Information Center, Chinese Academy of Sciences. Her research interests include high performance computing, continuous software delivery, and container technology.
    In this paper, she is responsible for drafting the paper and for the application of the HPC functions.
    E-mail: lusha721@sccas.cn
    NIU Tie is a senior engineer in the Department of Science and Technology Cloud Operation and Technology Development, Computer Network Information Center, Chinese Academy of Sciences. His research interests include high performance computing, distributed computing, and cluster monitoring and analysis.
    In this paper, he is responsible for the architecture design and construction of the platform (the Big Earth Data Cloud Service Infrastructure), as well as drafting the paper.
    E-mail: niut@sccas.cn
  • Funding:
    Research and Operation of the International Research Center of Big Data for Sustainable Development Goals

Abstract:

[Background] Big Earth data is large in scale, diverse, highly complex, and largely unstructured. Processing it faces several challenges: the data are heterogeneous and dispersed, the computation is complex and heavy, and collaborative processing is difficult. [Objective] This paper aims to improve the efficiency of analyzing, processing, and publishing massive heterogeneous big Earth data, and to accelerate data-driven scientific innovation. [Methods] This paper designs and implements a novel computing system with a hyper-converged architecture and develops services including resource aggregation and job scheduling and HPC functions. These services integrate diverse computing resources, such as supercomputing and cloud computing, within a single computing system and let them share data. [Results] The Big Earth Data Cloud Service Infrastructure has been established, providing a collaborative "cloud + supercomputing" service capability. It allows researchers to build personalized computing environments on demand and to process research data collaboratively using both big data and supercomputing methods. [Conclusions] The Big Earth Data Cloud Service Infrastructure integrates diverse computing resources, reduces data movement between them, and improves the efficiency of collaborative computing. It better meets the need for rapid computation in the complex application scenarios of the special research program and of SDGs (Sustainable Development Goals) assessment. The approach provides a useful reference for developing novel converged-architecture computing systems that are data-centric and support one-stop processing.

Key words: high performance computing, cloud computing, hyper-convergence, big Earth data