Frontiers of Data and Computing ›› 2021, Vol. 3 ›› Issue (4): 3-17.

doi: 10.11871/jfdc.issn.2096-742X.2021.04.001

• Special Topic on Visualization and Visual Analysis •

A Visual Analysis System for SWF Log Event Stream Data

LI Yue1, YANG Bo1, LU Xuyi1,2, SHAN Guihua1,*

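  1. Computer Network Information Center, Chinese Academy of Sciences, Beijing 100190, China
  2. University of Chinese Academy of Sciences, Beijing 100049, China
  • Received: 2021-06-22 Online: 2021-08-20 Published: 2021-08-30
  • Corresponding author: SHAN Guihua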
  • About the authors: LI Yue is an assistant engineer at the Computer Network Information Center, Chinese Academy of Sciences. His main research interests are data visualization and visual analysis.
    In this paper, he undertook the following tasks: visualization development and implementation of the key algorithms.
    E-mail: li1023@cnic.cn
    YANG Bo is an assistant professor at the Computer Network Information Center, Chinese Academy of Sciences. His main research interests are visualization and visual analysis.
    In this paper, he undertook the following tasks: task and demand analysis, and system framework design.
    E-mail: yangbo@cnic.cn
    LU Xuyi is a postgraduate student at the Computer Network Information Center, Chinese Academy of Sciences. Her main research interests include scientific visualization and visual analysis.
    In this paper, she undertook the following tasks: technical investigation and writing of the related sections.
    E-mail: luxuyi@cnic.cn
    SHAN Guihua is a professor at the Computer Network Information Center, Chinese Academy of Sciences. Her main research interests are visualization and visual analysis, and intelligent interaction.
    In this paper, she undertook the following tasks: architecture design and research guidance.
    E-mail: sgh@cnic.cn
  • Supported by:
    Strategic Priority Research Program (Category A, XDA19080102)

A Visual Analysis System for SWF Log Event Stream Data

LI Yue1, YANG Bo1, LU Xuyi1,2, SHAN Guihua1,*

  1. Computer Network Information Center, Chinese Academy of Sciences, Beijing 100190, China
  2. University of Chinese Academy of Sciences, Beijing 100049, China
  • Received: 2021-06-22 Online: 2021-08-20 Published: 2021-08-30
  • Contact: SHAN Guihua

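Abstract:

[Objective] To address the problem of interactive visual analysis of log data from high-performance computing (HPC) clusters, and to provide a visual analysis framework that supports log data from a variety of cluster systems, we design and implement SWFVis, an event stream visual analysis system for SWF log data. [Methods] Building on visual analysis techniques for temporal event sequence data, and combining them with multidimensional and association visual analysis techniques, SWFVis proposes a method for the integrated visual analysis of the temporal, associative, and multidimensional attributes in SWF logs. [Results] On real, publicly available data generated by the iPSC/860 and ForHLR II clusters, SWFVis intuitively displays the concurrency and execution status of all jobs in the logs and analyzes the job processing patterns present in the clusters. [Limitations] In three-dimensional scenes, the event stream visualization suffers from view occlusion and cluttered layout; future work will consider methods such as hierarchical clustering for further optimization. [Conclusions] SWFVis provides an interactive visual analysis method for SWF log data that can be used with a variety of computer clusters. By displaying the temporal order of and associations among jobs, and by mining their multidimensional attributes, it intuitively presents the working state of the cluster and users' submission behavior, and supports interactive discovery of the processing patterns in cluster jobs, providing support for the optimization of cluster scheduling.

Key words: HPC cluster, temporal event sequence, event stream, visual analysis, SWF log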

Abstract:

[Objective] HPC cluster log data is typically presented with traditional statistical charts, which are neither intuitive nor interactive. To improve the presentation of such log data, we design and implement SWFVis, an event stream visual analysis system for SWF log data. [Methods] Based on visual analysis techniques for temporal event sequence data, combined with multidimensional and association visual analysis techniques, SWFVis proposes a comprehensive visual analysis method for the sequential and multidimensional attributes of the jobs in the log event stream. [Results] Using real data generated by the public iPSC/860 and ForHLR II clusters, SWFVis intuitively displays the concurrency and execution status of all jobs in the logs, and analyzes the job processing patterns in the clusters. [Limitations] The visual rendering of the event stream still has problems in three-dimensional scenes, such as view occlusion and cluttered distribution. In the future, we will consider hierarchical clustering and other methods for further optimization. [Conclusions] SWFVis provides an interactive visual analysis method for SWF log data. It intuitively presents the working state of the cluster and user submission behavior, and supports interactive exploration of the processing patterns in cluster jobs, which provides support for the optimization of cluster scheduling.

Key words: HPC cluster, temporal event sequence data, event stream, visual analysis, SWF log
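
The abstract refers to deriving job concurrency from SWF log records. As a rough illustration only, the sketch below parses SWF lines and computes how many jobs are running at each point in time. It is not the SWFVis implementation, which the abstract does not describe; it assumes the usual 18-field Standard Workload Format layout (job number, submit time, wait time, run time, allocated processors, ..., user ID as the 12th field), with `;` marking header comments and `-1` marking unknown values. The names `Job`, `parse_swf`, `concurrency`, and `SAMPLE` are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass
class Job:
    job_id: int
    submit: int  # submit time, seconds since the start of the log
    wait: int    # seconds spent waiting in the queue
    run: int     # seconds spent running
    procs: int   # number of allocated processors
    user: int    # anonymized user ID

    @property
    def start(self) -> int:
        return self.submit + self.wait

    @property
    def end(self) -> int:
        return self.start + self.run


def parse_swf(lines: Iterable[str]) -> List[Job]:
    """Parse SWF records, skipping comments and jobs with unknown timing."""
    jobs = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith(";"):  # header comments start with ';'
            continue
        f = line.split()
        if len(f) < 18:  # assumed field count of a complete SWF record
            continue
        job = Job(int(f[0]), int(f[1]), int(f[2]), int(f[3]), int(f[4]), int(f[11]))
        if job.submit < 0 or job.wait < 0 or job.run < 0:  # -1 means unknown
            continue
        jobs.append(job)
    return jobs


def concurrency(jobs: Iterable[Job]) -> List[Tuple[int, int]]:
    """Return (time, running job count) at every time the count can change."""
    events = []
    for j in jobs:
        events.append((j.start, 1))
        events.append((j.end, -1))
    events.sort()  # at equal times, ends (-1) sort before starts (+1)
    timeline: List[Tuple[int, int]] = []
    running = 0
    for t, delta in events:
        running += delta
        if timeline and timeline[-1][0] == t:
            timeline[-1] = (t, running)  # merge events with the same timestamp
        else:
            timeline.append((t, running))
    return timeline


# Made-up sample records for illustration; not data from the paper.
SAMPLE = """\
; Version: 2
1 0 5 100 4 -1 -1 4 -1 -1 1 1 1 -1 1 -1 -1 -1
2 10 0 50 8 -1 -1 8 -1 -1 1 2 1 -1 1 -1 -1 -1
3 20 -1 30 2 -1 -1 2 -1 -1 0 1 1 -1 1 -1 -1 -1
"""

if __name__ == "__main__":
    for t, n in concurrency(parse_swf(SAMPLE.splitlines())):
        print(t, n)
```

A step series like this could feed a concurrency timeline view; weighting each event by `procs` instead of 1 would give processor occupancy rather than job count.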