Frontiers of Data & Computing, 2023, Vol. 5, Issue (3): 66-91.

CSTR: 32002.14.jfdc.CN10-1649/TP.2023.03.006

doi: 10.11871/jfdc.issn.2096-742X.2023.03.006

• Special Issue: "Artificial Intelligence & Big Data" Scientific Research Paradigm Shift (Part II)

Status, Challenges, and Trends of Data-Intensive Supercomputing

WEI Jia1, CHEN Mo2, WANG Longxiang1,*, REN Pei2, LEI Yujia1, QU Yuqi1, JIANG Qiyu1, DONG Xiaoshe1, WU Weiguo1, ZHANG Kaili2, ZHANG Xingjun1

  1. School of Computer Science and Technology, Xi’an Jiaotong University, Xi’an, Shaanxi 710049, China
  2. Huawei Technologies Co., Ltd., Shenzhen, Guangdong 518129, China
  • Received: 2022-06-22 Online: 2023-06-20 Published: 2023-06-21
  • Corresponding author: *WANG Longxiang (E-mail: wlx419@xjtu.edu.cn)
  • About the authors:
    WEI Jia is a Ph.D. student at the School of Computer Science and Technology, Xi’an Jiaotong University. His main research interests are computer architecture, high-performance computing, deep learning, and computational storage devices.
    In this paper, he is responsible for writing the first draft and designing the HPC performance evaluation model.
    E-mail: weijia4473@stu.xjtu.edu.cn
    WANG Longxiang, Ph.D., is Deputy Director of the Experimental Center of the School of Computer Science, Xi’an Jiaotong University. His main research interests are mass storage systems and accelerated computing for deep learning.
    In this paper, he is responsible for writing the part on the support of mainstream supercomputing systems for data-intensive applications in Chapter 2.
    E-mail: wlx419@xjtu.edu.cn
  • Funding:
    National Key Research and Development Program of China (2016YFB0200902)

Abstract:

[Objective] This paper provides a comprehensive overview of the development history, mainstream system architectures, typical applications, and computing and storage subsystems of data-intensive supercomputing, identifies future development trends, and offers a reference for the subsequent optimization of data-intensive supercomputing systems. [Methods] The paper first sorts out the key concepts of data-intensive supercomputing and analyzes how well existing platforms support data-intensive applications. It then describes the actual demand for data-intensive applications in the scientific and industrial communities. Finally, it discusses the future trends and potential challenges of data-intensive supercomputing and establishes an evaluation model for supercomputing systems. [Results] Researchers and practitioners can use this paper to quickly grasp the key concepts and development status of supercomputing technology, and to identify current and future research hotspots in data-intensive supercomputing along with the key problems that urgently need to be solved. [Conclusions] The problems facing data-intensive supercomputing storage systems, such as optimization for complex data types, optimization for mixed workloads, and multi-protocol support and interoperability, will become hot topics of research and development in the coming years.

Key words: data-intensive supercomputing, I/O intensive supercomputing, high performance data analytics, parallel processing system, supercomputing storage system