Frontiers of Data and Computing, 2026, Vol. 8, Issue (1): 64-76.

CSTR: 32002.14.jfdc.CN10-1649/TP.2026.01.006

doi: 10.11871/jfdc.issn.2096-742X.2026.01.006

• Technology and Application •

Retrieval-Enhanced Log Question Answering System

WU Zhihui1, HUANG Shaohan2,*, ZHANG Yifei1, QI Jiaxing2, XIAO Zhiwen1, ZENG Chang2, LUAN Zhongzhi2

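  1. Department of Big Data, China Mobile Information Technology Center, Beijing 100049, China
  2. Sino-German Joint Software Institute, Beihang University, Beijing 100083, China
  • Received: 2025-03-10 Online: 2026-02-20 Published: 2026-02-02
  • Corresponding author: HUANG Shaohan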
  • About the authors: WU Zhihui holds a master's degree in software engineering. His research interests include distributed systems, big data platforms, and databases.
    In this paper, he was responsible for designing the experimental framework and carried out most of the data collection and analysis.
    Email: wuzhihui@chinamobile.com
    HUANG Shaohan holds a master's degree and is with Beihang University. His research interests include log analysis, big data, and natural language processing.
    In this paper, he was responsible for drafting key sections of the manuscript and developed the key algorithms used in the research.
    Email: huangshaohan@buaa.edu.cn
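  • Funding:
    National Key Research and Development Program of China (2023YFB4503100); National Natural Science Foundation of China (U23B2027); China Mobile Lianchuang+ (Joint Innovation+) Project (CMITYD-202300415)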

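Abstract:

[Objective] In AI for IT Operations (AIOps), log question answering is an important task that helps support teams and system administrators locate and resolve system issues efficiently. However, applying existing large language models to log question answering faces two challenges: a mismatch between their training corpora and log content, and insufficient accuracy in retrieving the log context needed to answer a question. This study proposes a new method to improve the performance and generalization capability of log question answering systems. [Coverage] The article surveys the current state of research on log question answering in AIOps, with an emphasis on the limitations of current large language models in processing system logs. [Methods] This paper proposes LogMind, a retrieval-enhanced log question answering system that jointly trains the retrieval model and the large language model through an iterative feedback mechanism, together with a stable training strategy. [Results] Experiments on 16 system log datasets across 6 domains show that the LogMind framework significantly improves the accuracy of both the retrieval model and the large language model and exhibits strong cross-model generalization. The paper also analyzes the performance of the DeepSeek reasoning model on log question answering and shows the advantages of reasoning models in this setting. [Limitations] This study evaluates the method mainly in offline scenarios; real-time responsiveness and system scalability in production environments remain to be explored. [Conclusions] The LogMind framework provides a reliable and intelligent log question answering solution for AIOps, offers important support for advanced system management, and suggests new directions for research on and application of log question answering.

Key words: AIOps, log question answering, log retrieval, large language models, question answering systems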
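
To make the retrieval-enhanced idea in the Methods summary more concrete, the following is a minimal illustrative sketch and not the LogMind implementation. It ranks log lines against a question with a simple TF-IDF-style overlap score and then assembles a prompt that a language model could answer from. The function names (`tokenize`, `retrieve`, `build_prompt`), the scoring formula, the prompt wording, and the sample log lines are all assumptions introduced here for illustration; the abstract does not specify the retriever, the prompt format, or the iterative joint training procedure, and the sketch does not attempt to reproduce them.

```python
import math
import re
from collections import Counter

# Hypothetical sample logs, invented for this sketch (not from the paper's datasets).
LOGS = [
    "2026-01-01 10:00:01 INFO service started on port 8080",
    "2026-01-01 10:00:05 ERROR disk /dev/sda1 usage at 97 percent",
    "2026-01-01 10:00:09 WARN connection to database timed out",
    "2026-01-01 10:00:12 INFO heartbeat ok",
]
QUESTION = "Which disk reported high usage?"


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def retrieve(question, log_lines, k=2):
    """Return the k log lines with the highest TF-IDF-style overlap with the question."""
    docs = [Counter(tokenize(line)) for line in log_lines]
    n = len(docs)
    # Document frequency: iterating a Counter yields each distinct token once per line.
    df = Counter(tok for doc in docs for tok in doc)
    query_tokens = set(tokenize(question))

    def score(doc):
        # Sum term frequency times a smoothed inverse document frequency
        # over the question tokens that occur in this log line.
        return sum(doc[t] * math.log(1 + n / df[t]) for t in query_tokens if t in doc)

    # sorted() is stable, so lines with equal scores keep their original order.
    ranked = sorted(range(n), key=lambda i: score(docs[i]), reverse=True)
    return [log_lines[i] for i in ranked[:k]]


def build_prompt(question, context_lines):
    """Assemble a prompt that asks a language model to answer from the retrieved lines."""
    context = "\n".join(f"- {line}" for line in context_lines)
    return (
        "Answer the question using only these log lines:\n"
        f"{context}\n"
        f"Question: {question}\n"
        "Answer:"
    )


top_lines = retrieve(QUESTION, LOGS, k=1)
print(build_prompt(QUESTION, top_lines))
```

In a system of the kind the abstract describes, the overlap score would be replaced by a trained retrieval model and the prompt would be passed to a large language model; the sketch only shows how the two stages connect.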
