数据与计算发展前沿 (Frontiers of Data and Computing), 2019, Vol. 1, Issue 2: 86-97. doi: 10.11871/jfdc.issn.2096-742X.2019.02.008

Special topic: "Artificial Intelligence" special issue

Research and Application of a Platform for Artificial Intelligence Computing and Data Services

Wang Yangang (王彦棡)1,2, Wang Jue (王珏)1,2, Cao Rongqiang (曹荣强)1,2,*

  1. Department of Artificial Intelligence Technology and Application Development, Computer Network Information Center, Chinese Academy of Sciences, Beijing 100190, China
  2. University of Chinese Academy of Sciences, Beijing 100049, China
  • Received: 2019-09-20 Online: 2019-12-20 Published: 2020-01-15
  • Corresponding author: Cao Rongqiang E-mail: caorq@sccas.cn
  • About the authors:
    Wang Yangang, born in 1978, Ph.D., is a research fellow at the Computer Network Information Center, Chinese Academy of Sciences. His main research interests are artificial intelligence applications and high performance computing. In this paper, he was responsible for the overall design of the platform and for research guidance. E-mail: wangyg@sccas.cn
    Wang Jue, born in 1981, Ph.D., is an associate research fellow at the Computer Network Information Center, Chinese Academy of Sciences. His main research interests are artificial intelligence algorithms and application software. In this paper, he was responsible for the design and implementation of the data and AI application services, as well as platform testing. E-mail: wangjue@sccas.cn
    Cao Rongqiang, born in 1982, Ph.D., is an associate research fellow at the Computer Network Information Center, Chinese Academy of Sciences. His main research interest is artificial intelligence platforms. In this paper, he was responsible for the design and implementation of the Web and resource services, as well as platform testing.
  • Funding:
    National Key Research and Development Program of China, "Tool Libraries and Domain-Specific Basic Software Packages for Large-Scale Parallel Computing" (2017YFB0202202); National Natural Science Foundation of China, Young Scientists Fund, "Research on Key Technologies of Open Cloud Services for High Performance Computing Jobs Based on Cloud Storage Services" (61702476); Beijing Natural Science Foundation-Haidian Original Innovation Joint Fund, "Research on Key Methods and Technologies of GPU Virtualization for Deep Learning" (L182053); State Grid Corporation of China Headquarters Science and Technology Project, "Technologies for an Electric Power Artificial Intelligence Experiment and Public Service Platform" (SGGR0000JSJS1800569)

Abstract:

[Background] The rapid development of artificial intelligence technology depends on large-scale computing and data resources. Massive computing capacity makes fast training of deep learning models possible, and standardized datasets are an important basis for large-scale training and for improving the accuracy of artificial intelligence algorithms. [Objective] An artificial intelligence computing and data service platform can integrate computing, data, and software resources into a unified virtual infrastructure, provide researchers with an integrated working environment, and support the full life cycle of model design, training, and inference. [Methods] Given the converging development of artificial intelligence and high performance computing, we discuss the typical usage scenarios, key features, and non-functional requirements of artificial intelligence platforms. Taking the Platform for Artificial Intelligence Computing and Data Application Services at the Chinese Academy of Sciences as an example, we then discuss its architecture and how its services and functions are designed and implemented. [Results] By combining Web services, command-line tools, and online debugging tools, the platform supports rapid development of artificial intelligence algorithms in an easy-to-use way, fast execution of massive large-scale training jobs, and fast access to and transfer of multiple datasets. [Conclusions] An artificial intelligence platform can provide users with an easy-to-use integrated working environment and thereby advance the development of artificial intelligence technology across multiple disciplines.

Key words: artificial intelligence platform, computing service, resource service, application service