Frontiers of Data and Computing ›› 2022, Vol. 4 ›› Issue (1): 5-19.

doi: 10.11871/jfdc.issn.2096-742X.2022.01.001

• Special Issue: "National Science Data Centers Joint" Special Issue

Data Engineering Discipline Construction and Practice

ZHANG Yaonan1,2,3,*

  1. National Cryosphere Desert Scientific Data Center, Lanzhou, Gansu 730000, China
  2. Northwest Institute of Eco-Environment and Resources, Chinese Academy of Sciences, Lanzhou, Gansu 730000, China
  3. Gansu Data Engineering and Technology Research Center for Resource and Environment, Lanzhou, Gansu 730000, China
  • Received: 2021-09-20 Online: 2022-02-20 Published: 2022-03-04
  • Contact: ZHANG Yaonan
  • About the author: ZHANG Yaonan, Ph.D., is a professor and doctoral supervisor at the Northwest Institute of Eco-Environment and Resources, Chinese Academy of Sciences, and director of the National Cryosphere Desert Scientific Data Center. His main research interests include geoscience data engineering, disaster prevention and reduction with data engineering, integrated geoscience modeling in high-performance computing environments, remote sensing image processing, and multi-source heterogeneous data fusion. E-mail: yaonan@lzb.ac.cn
  • Supported by:
    Chinese Academy of Sciences Informatization Project "Construction and Application of a Science and Technology Domain Cloud for Cold and Arid Region Environmental Research" (XXH13506)

Abstract:

[Objective] Data science can already process large volumes of data and has solved many problems, and it is changing how scientific research, enterprise operation, and social governance work. Its results, however, are difficult to engineer. To convert data assets and their implicit value into services, decisions, and products that support everyday work and form a digital economy, a data engineering discipline is needed to support engineering activities on data and data-driven value conversion. [Methods] This paper introduces engineering thinking and extends narrow data engineering, which emerged alongside data science, to generalized data engineering. It discusses the necessity of establishing a data engineering discipline and, with reference to the construction of the civil engineering discipline and the characteristics an engineering discipline should have, analyzes the knowledge characteristics of data engineering, which rest on data as the material basis. It then presents the concept, theoretical basis, research content, research framework, and main technical system of the discipline, and uses two data engineering application cases to illustrate the necessity of establishing data engineering as a new methodology. [Conclusions] The data engineering discipline has a unique knowledge system built on data as its material basis, and special research methods that integrate mathematics, electronics and information science, computer science, data science, and various domain disciplines. The material, theoretical, technical, and demand foundations for building the discipline are in place, and it is urgent to establish the discipline to support the transformation of data assets into engineering applications and the formation of a digital economy.

Key words: narrow data engineering, generalized data engineering, data engineering discipline, data science, digital economy