Frontiers of Data and Computing (数据与计算发展前沿) ›› 2019, Vol. 1 ›› Issue (1): 105-115.

doi: 10.11871/jfdc.issn.2096.742X.2019.01.011

Special topic: Special Issue on "Data and Computing Platforms"


PaddlePaddle (飞桨): An Open-Source Deep Learning Platform from Industrial Practice

Yanjun Ma, Dianhai Yu, Tian Wu, Haifeng Wang

  1. Baidu Inc., Beijing 100085, China
  • Received: 2019-08-15 Online: 2019-01-20 Published: 2019-10-09
  • About the author: Dr. Yanjun Ma, born in 1981, is director of the Deep Learning Technology Platform Department at Baidu, where he is responsible for the product and the research and development of the deep learning platform PaddlePaddle (飞桨). He has published more than 20 papers at top natural language processing conferences and journals such as ACL, and has served as an area chair for a number of international conferences. In 2015, related work received a second prize of the National Science and Technology Progress Award.
    Author contributions: Yanjun Ma contributed to the overall organization of the paper and wrote the manuscript. Dianhai Yu contributed to the design and writing of Section 2. Tian Wu contributed to the design and writing of Sections 1 and 4. Haifeng Wang designed the overall framework of the paper and contributed to the writing of Section 3.
    E-mail: mayanjun02@baidu.com

Abstract:

[Objective] Deep learning is the core technology driving recent breakthroughs in artificial intelligence, and deep learning frameworks are often described as the operating system of the intelligent era. This paper gives a systematic introduction to PaddlePaddle (飞桨), the only fully functional open-source deep learning platform in China. [Methods] We first review the history of deep learning frameworks and give an overview of PaddlePaddle, which comprises the core framework, toolkits, and service platforms, together with the progress of its ecosystem. We then describe the key technologies of the PaddlePaddle core framework in five parts: the front-end language, the network-construction programming paradigm, the core architecture, the operator library, and the high-efficiency computing core. [Results] Through years of continuous iteration and innovation in industrial practice at Baidu, PaddlePaddle has developed distinctive strengths in ultra-large-scale distributed training and in fast inference on server, mobile, and edge devices. [Conclusions] The main innovations of PaddlePaddle are summarized systematically, and future development trends are discussed.

Key words: PaddlePaddle, artificial intelligence, deep learning, deep learning framework