Frontiers of Data and Computing, 2022, Vol. 4, Issue (2): 3-16.

doi: 10.11871/jfdc.issn.2096-742X.2022.02.001

• Special Issue: Advanced Intelligent Computing Platforms and Applications

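Implementation and Integration of OpenCL Operators in TensorFlow Framework

GUO Qiang, CHENG Daguo, SUN Yufei*, ZHOU Jianyu, ZHANG Yuzhi, PEI Jiaao, GAN Rundong, CHEN Rui

  1. College of Software, Nankai University, Tianjin 300450, China
  • Received: 2022-02-23 Online: 2022-04-20 Published: 2022-04-30
  • Corresponding author: SUN Yufei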
  • About the authors:
    GUO Qiang is a master's student in the College of Software at Nankai University. His research interests include deep learning and high-performance computing.
    In this paper, he is responsible for the integration of the OpenCL operators, the model-related experiments, and the writing of the abstract, introduction, and background sections.
    E-mail: guoqiang701@mail.nankai.edu.cn

    CHENG Daguo is a master's student in the College of Software at Nankai University. His research interests include deep learning and high-performance computing.
    In this paper, he is responsible for implementing and integrating some of the OpenCL operators and for writing the experimental analysis.
    E-mail: chengdaguo@mail.nankai.edu.cn

    SUN Yufei, Ph.D., is a professor in the College of Software at Nankai University. Her research interests include deep learning, heterogeneous computing, and artificial intelligence.
    In this paper, she is responsible for the overall design, revision, and guidance.
    E-mail: yufei_sun@sina.com

    ZHOU Jianyu, Ph.D., is a lecturer in the College of Software at Nankai University. His research interests include algorithm design and optimization and statistical machine learning.
    In this paper, he is responsible for revision and guidance.
    E-mail: jyzhou@nankai.edu.cn

    ZHANG Yuzhi is a chair professor and the Dean of the College of Software at Nankai University. His research interests include artificial intelligence, pattern recognition, and natural language processing.
    In this paper, he is responsible for the overall design.
    E-mail: zyz@nankai.edu.cn

    PEI Jiaao is a master's student in the College of Software at Nankai University. His research interests include software porting and applications of enterprise blockchain.
    In this paper, he is mainly responsible for writing the experimental part.
    E-mail: peijiaao@mail.nankai.edu.cn

    GAN Rundong is a master's student in the College of Software at Nankai University. His main research interest is scene recognition based on deep learning.
    In this paper, he is responsible for writing the sections on kernel functions and operator composition based on the OpenCL parallel computing framework.
    E-mail: raineast666@163.com

    CHEN Rui is a Ph.D. student in the College of Software at Nankai University. His research interests include deep learning and high-performance computing.
    In this paper, he is mainly responsible for the design of the experiments.
    E-mail: rzchen@mail.nankai.edu.cn
  • Funding:
    National Key Research and Development Program of China (2021YFB0300104)

Abstract:

[Objective] The combination of TensorFlow, a mainstream machine learning framework, and the CUDA heterogeneous programming environment is widely used in academia and industry, and TensorFlow operators implemented in CUDA are the key to accelerating computation. However, TensorFlow does not support OpenCL, an open, general-purpose heterogeneous programming standard. This severely limits the versatility of TensorFlow and prevents the computing power of OpenCL hardware devices from being fully utilized. [Methods] To address this issue, this paper examines the underlying implementation of TensorFlow, implements OpenCL operators based on an in-depth analysis of the TensorFlow code structure, and integrates these operators into version 2.2.0 of the TensorFlow framework. [Results] With this implementation, TensorFlow can run on hardware devices that support OpenCL 1.2 by means of the OpenCL operators. In addition, the optimization method proposed in this paper significantly improves the computational efficiency of the OpenCL operators. [Conclusions] Experiments show that the proposed method effectively solves the problem that TensorFlow cannot be used on OpenCL hardware devices.

Keywords: TensorFlow, OpenCL, operator