Frontiers of Data and Computing, 2024, Vol. 6, Issue 4: 116-127.

CSTR: 32002.14.jfdc.CN10-1649/TP.2024.04.010

doi: 10.11871/jfdc.issn.2096-742X.2024.04.010

• Special Issue: Basic Software Stack and Systems for National Science Data Centers •

An Agricultural Policy Question Answering System Based on ChatGLM2-6B

WEI Yijin1,2, FAN Jingchao1,2,*

  1. Agriculture Information Institution of CAAS, Beijing 100081, China
  2. National Agriculture Science Data Center, Beijing 100081, China
  • Received: 2023-11-23 Online: 2024-08-20 Published: 2024-08-20
  • Corresponding author: *FAN Jingchao (E-mail: fanjingchao@caas.cn)
  • About the authors:
    WEI Yijin is a master's student at the Agriculture Information Institution of CAAS. Her research interests include agricultural policy and deep learning.
    In this paper, she is responsible for proposing the research ideas, designing the experimental scheme, building the system, and writing the paper.
    E-mail: weiyijin0816@163.com
    FAN Jingchao, Ph.D., is an associate researcher at the Agriculture Information Institution of CAAS. His research interests include agricultural big data.
    In this paper, he is responsible for designing the research scheme and for interpreting and analyzing the experimental results.
    E-mail: fanjingchao@caas.cn
  • Funding:
    National Key Research and Development Program of China, "Application Demonstration for Integrated Science Scenarios" (2021YFF0704204)

Abstract:

[Objective] To improve policy transparency, reduce information asymmetry, and give stakeholders a convenient way to obtain agricultural policy information and guidance, this paper builds an agricultural policy question answering system that combines ChatGLM2-6B with Langchain-Chatchat. [Methods] Web crawlers are used to collect the full text of the agricultural policies published by the National Rural Revitalization Administration, of guiding agricultural policies such as the No. 1 Central Document, and of the agricultural policies issued by the rural revitalization bureaus of the nine provinces along the Yellow River. An agricultural policy question answering dataset is built from these texts and used to fine-tune ChatGLM2-6B with QLoRA, after which the fine-tuned weights are merged and quantized. The resulting ChatGLM2-6B-QLoRA-int4 model is then combined with Langchain-Chatchat and a local agricultural policy knowledge base to form the question answering system. [Results] The same questions were posed to ChatGPT, ChatGLM2-6B, ChatGLM2-6B-QLoRA, and our system, and the answers were evaluated by expert scoring. In the agricultural policy domain, our system scored higher than ChatGLM2-6B and ChatGLM2-6B-QLoRA, and its overall performance was better than that of ChatGPT. [Conclusion] The question answering system built in this study performs well in the agricultural policy domain, keeps proprietary data secure, and shows that an LLM-based question answering system can be deployed locally.

Key words: large language model (LLM), agriculture, policy, question answering system, vertical domain
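
The knowledge-base step described in the abstract (retrieve relevant policy passages, then pass them to the model together with the question) can be illustrated with a minimal sketch. This is not the Langchain-Chatchat API or the authors' implementation: the functions `ngrams`, `cosine`, `retrieve`, and `build_prompt` are hypothetical names, character-bigram cosine similarity stands in for whatever embedding model a real deployment would use, and the passages are made-up placeholders rather than real policy text.

```python
import math
from collections import Counter


def ngrams(text, n=2):
    """Count character n-grams (whitespace removed), so the same code
    handles unsegmented Chinese as well as English."""
    s = "".join(text.lower().split())
    return Counter(s[i:i + n] for i in range(max(len(s) - n + 1, 1)))


def cosine(a, b):
    """Cosine similarity between two n-gram count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def retrieve(question, passages, top_k=2):
    """Return the top_k passages most similar to the question."""
    q = ngrams(question)
    ranked = sorted(passages, key=lambda p: cosine(q, ngrams(p)), reverse=True)
    return ranked[:top_k]


def build_prompt(question, passages, top_k=2):
    """Assemble the prompt that would be sent to the local LLM."""
    context = "\n".join(f"- {p}" for p in retrieve(question, passages, top_k))
    return ("Answer the question using only the policy excerpts below.\n"
            f"Excerpts:\n{context}\n"
            f"Question: {question}\nAnswer:")


# Made-up placeholder passages (assumption), not text from any real policy.
KNOWLEDGE_BASE = [
    "Placeholder excerpt A: subsidies for grain farmers are paid per hectare.",
    "Placeholder excerpt B: rural road construction is funded by county budgets.",
    "Placeholder excerpt C: grain farmers may apply for subsidies at the township office.",
]

if __name__ == "__main__":
    print(build_prompt("How do grain farmers get subsidies?", KNOWLEDGE_BASE))
```

In a deployment along the lines the abstract describes, the assembled prompt would be passed to the locally hosted fine-tuned model, so neither the question nor the knowledge base leaves the local machine.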