Frontiers of Data and Computing ›› 2021, Vol. 3 ›› Issue (4): 3-17.

doi: 10.11871/jfdc.issn.2096-742X.2021.04.001

• Special Issue: Visualization and Visual Analysis

A Visual Analysis System for SWF Log Event Stream Data

LI Yue1, YANG Bo1, LU Xuyi1,2, SHAN Guihua1,*

    1. Computer Network Information Center, Chinese Academy of Sciences, Beijing 100190, China
    2. University of Chinese Academy of Sciences, Beijing 100049, China
  • Received: 2021-06-22 Online: 2021-08-20 Published: 2021-08-30
  • Contact: SHAN Guihua E-mail: li1023@cnic.cn; yangbo@cnic.cn; luxuyi@cnic.cn; sgh@cnic.cn

Abstract:

[Objective] HPC cluster log data is typically presented with traditional statistical charts, which are neither intuitive nor interactive. To improve the presentation of such data, we design and implement SWFVis, an event stream visual analysis system for SWF log data. [Methods] Building on visual analysis techniques for temporal event sequence data, and combining them with multi-dimensional and associated visual analysis techniques, SWFVis provides a comprehensive visual analysis method for the sequential and multi-dimensional attributes of the jobs in the log event stream. [Results] Using real data from the public iPSC/860 and ForHLR II cluster logs, SWFVis intuitively displays the concurrency and execution status of the jobs in the logs, and supports analysis of the job processing patterns in the clusters. [Limitations] The rendering of the event stream in three-dimensional scenes still suffers from problems such as view occlusion and disordered layout. In future work, we plan to apply hierarchical clustering and other methods for further optimization. [Conclusions] SWFVis provides an interactive visual analysis method for SWF log data. It intuitively presents the working state of the cluster and user submission behavior, and supports interactive exploration of processing patterns in cluster operation, which can inform the optimization of cluster scheduling.
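
As a rough illustration of what "log event stream" can mean here, the following minimal sketch turns SWF records into a time-ordered stream of submit, start, and end events and derives the peak number of concurrently running jobs. It is not the SWFVis implementation. It assumes that SWF refers to the Standard Workload Format, with 18 whitespace-separated fields per job and comment lines starting with `;`. The names `Job`, `parse_swf`, `to_event_stream`, and `peak_concurrency` are hypothetical, and the sample records are made up for the example.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass
class Job:
    """A subset of the SWF fields (times in seconds, as in the log)."""
    job_id: int
    submit: int
    wait: int
    run: int
    procs: int
    user: int

    @property
    def start(self) -> int:
        return self.submit + self.wait

    @property
    def end(self) -> int:
        return self.start + self.run


def parse_swf(lines: Iterable[str]) -> List[Job]:
    """Parse SWF text lines, skipping comments and incomplete records."""
    jobs = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith(";"):
            continue  # header and comment lines start with ';'
        f = line.split()
        if len(f) < 18:
            continue  # assumed layout: 18 fields per job
        job_id, submit, wait, run, procs = (int(float(x)) for x in f[:5])
        user = int(float(f[11]))  # field 12 is the user ID
        if min(submit, wait, run) < 0:
            continue  # -1 marks a missing value
        jobs.append(Job(job_id, submit, wait, run, procs, user))
    return jobs


def to_event_stream(jobs: Iterable[Job]) -> List[Tuple[int, str, int]]:
    """Flatten jobs into (time, kind, job_id) events sorted by time."""
    events = []
    for j in jobs:
        events.append((j.submit, "submit", j.job_id))
        events.append((j.start, "start", j.job_id))
        events.append((j.end, "end", j.job_id))
    # At equal timestamps, process 'end' first so that a job finishing
    # at time t does not overlap with a job starting at time t.
    order = {"end": 0, "submit": 1, "start": 2}
    events.sort(key=lambda e: (e[0], order[e[1]], e[2]))
    return events


def peak_concurrency(events: Iterable[Tuple[int, str, int]]) -> int:
    """Return the maximum number of jobs running at the same time."""
    running = peak = 0
    for _, kind, _ in events:
        if kind == "start":
            running += 1
        elif kind == "end":
            running -= 1
        peak = max(peak, running)
    return peak


# Made-up sample records, for illustration only.
SAMPLE = """\
; made-up sample, not taken from a real log
1 0 5 100 4 -1 -1 4 200 -1 1 7 1 -1 1 -1 -1 -1
2 10 0 50 2 -1 -1 2 60 -1 1 8 1 -1 1 -1 -1 -1
3 20 40 30 1 -1 -1 1 30 -1 1 7 1 -1 1 -1 -1 -1
""".splitlines()

if __name__ == "__main__":
    stream = to_event_stream(parse_swf(SAMPLE))
    print(len(stream), "events, peak concurrency:", peak_concurrency(stream))
```

Sorting end events ahead of start events at the same timestamp is a design choice of this sketch; a system that counts occupied processors rather than jobs would accumulate `procs` instead of 1.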

Key words: HPC cluster, temporal event sequence data, event stream, visual analysis, SWF log