Frontiers of Data and Computing ›› 2022, Vol. 4 ›› Issue (3): 46-65.

CSTR: 32002.14.jfdc.CN10-1649/TP.2022.03.004

doi: 10.11871/jfdc.issn.2096-742X.2022.03.004

• Special Issue: Advanced Intelligent Computing Platform and Application •

Evaluation of KPI Anomaly Detection Methods

SUN Yongqian1,2, ZHANG Ruru1, LIN Zihan1, ZHANG Shenglin1,2,3,*, TAN Zhiyuan1, ZHANG Yuzhi1,2,3

    1. College of Software, Nankai University, Tianjin 300350, China
    2. Tianjin Key Laboratory of Operating System, Tianjin 300350, China
    3. Haihe Laboratory of Information Technology Application Innovation, Tianjin 300350, China
  • Received: 2022-02-14 Online: 2022-06-20 Published: 2022-06-20
  • Contact: ZHANG Shenglin E-mail: sunyongqian@nankai.edu.cn; 1852917912@qq.com; 2120210568@mail.nankai.edu.cn; zhangsl@nankai.edu.cn; bhbean42@qq.com; zyz@nankai.edu.cn

Abstract:

[Objective] Key performance indicator (KPI) anomaly detection underpins rapid fault discovery and repair, and it is becoming increasingly critical as cloud computing services develop rapidly. Typical KPIs include page view count, page view delay, server CPU utilization, router memory utilization, switch throughput, and server disk I/O. [Coverage] We extensively surveyed recent domestic and international work on KPI anomaly detection. [Methods] We analyze in depth the KPI anomaly detection methods from each stage of the field's development and select 13 representative methods for experimental evaluation. [Results] We summarize the general problems, challenges, and frameworks of KPI anomaly detection. We evaluate the accuracy, robustness, and efficiency of the selected methods on a KPI dataset collected from three top-tier Internet companies. [Conclusions] The selected methods span statistics-based, supervised, semi-supervised, and unsupervised approaches, each with its own advantages and disadvantages. Our analysis gives future researchers a basis for quickly and accurately selecting the KPI anomaly detection method best suited to their scenarios.

Key words: Key performance indicator, anomaly detection, method evaluation, machine learning