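A Framework for Intelligent Auditing of Agricultural Science Metadata Based on Multi-Model Collaboration and Dynamic Weighted Adjudication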

doi:10.11871/jfdc.issn.2096-742X.2026.02.013

Frontiers of Data and Computing, 2026, Vol. 8, Issue 2: 171-183.

CSTR: 32002.14.jfdc.CN10-1649/TP.2026.02.013

REN Youqiang¹,²,³; ZHAO Hui¹,²; LI Wei¹,²,³; YUAN Huan¹; FAN Jingchao¹,²,³,*; ZHANG Jianhua¹,²,³,⁴; ZHOU Guomin²,⁵,⁶

¹ Institute of Agricultural Information, Chinese Academy of Agricultural Sciences / Key Laboratory of Agricultural Big Data, Ministry of Agriculture and Rural Affairs, Beijing 100081, China
² National Agricultural Science Data Center, Beijing 100081, China
³ Nanfan Research Institute (Sanya), Chinese Academy of Agricultural Sciences, Sanya, Hainan 572024, China
⁴ Hainan Provincial Seed Industry Laboratory, Sanya, Hainan 572024, China
⁵ Nanjing Institute of Agricultural Mechanization, Ministry of Agriculture and Rural Affairs, Nanjing, Jiangsu 210014, China
⁶ Western Research Institute, Chinese Academy of Agricultural Sciences, Changji, Xinjiang 831100, China

Received: 2025-07-17 Online: 2026-04-20 Published: 2026-04-23
Corresponding author: *FAN Jingchao (E-mail: fanjingchao@caas.cn)
About the authors:
REN Youqiang is a master's student at the Agricultural Information Institute of the Chinese Academy of Agricultural Sciences (CAAS). His research focuses on agricultural information technology.
In this paper, he is responsible for proposing the research ideas, designing the experimental scheme, constructing the system, and writing the manuscript.
E-mail: 17864179721@163.com
FAN Jingchao, Ph.D., is an associate researcher at the Agricultural Information Institute of the Chinese Academy of Agricultural Sciences (CAAS). His research focuses on agricultural big data.
In this study, he is responsible for designing the research scheme and analyzing and interpreting the experimental results.
E-mail: fanjingchao@caas.cn
Funding:
National Key Research and Development Program of China (2022YFF0711800); National Key Research and Development Program of China (2022YFD1600300); Hainan Provincial Natural Science Foundation (325MS155); Sanya Yazhou Bay Science and Technology City Special Science and Technology Fund (SCKJ-JYRC-2023-45); Nanfan Special Project of the Nanfan Research Institute (Sanya), Chinese Academy of Agricultural Sciences (YBXM2430); Nanfan Special Project of the Nanfan Research Institute (Sanya), Chinese Academy of Agricultural Sciences (YBXM2508); Nanfan Special Project of the Nanfan Research Institute (Sanya), Chinese Academy of Agricultural Sciences (YBXM2509); Central Public-interest Scientific Institution Basal Research Fund (JBYW-AII-2025-05); Central Public-interest Scientific Institution Basal Research Fund (Y2025YC90); National Agricultural Science Data Center Project (NASDC2025XM11); 2025 Industrial Science and Technology Innovation "Open Competition" (Jiebang Guashuai) Joint Project of the Sanya Yazhou Bay Science and Technology City Administration and the Hainan Provincial Seed Industry Laboratory (B25H1JC14)

Abstract

[Objective] To address the low efficiency and inconsistent standards of manual review of agricultural science metadata, this study proposes an intelligent auditing framework oriented toward quality governance. [Methods] The framework adopts a two-stage hierarchical strategy. In the first stage, multiple heterogeneous large language models (LLMs) perform preliminary audits in parallel, and an "algorithmic triangulation" method evaluates the consistency of their outputs along semantic, lexical, and structural dimensions. In the second stage, a Dynamic Weighted System Disagreement (DWSD) algorithm quantifies inter-model conflict and dynamically triggers a high-performance adjudicator model so that difficult samples are audited precisely. [Results] On a real-world agricultural metadata dataset, the framework improved F1 score, precision, and recall over the best baseline model by 15.24, 14.17, and 15.00 percentage points, respectively, markedly improving audit accuracy and coverage. [Limitations] The framework was validated on a single Chinese-language agricultural dataset, so its generalization across languages and domains remains to be tested; the effectiveness of the applied bias mitigation strategies on larger-scale data also needs continued evaluation. [Conclusions] The proposed framework can make agricultural metadata auditing more intelligent, precise, and interpretable, and it offers a transferable technical path for data quality governance tasks across languages and domains.

Key words: agricultural metadata audit, multi-model collaboration, LLM-as-a-Judge, dynamic weighting, DWSD
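
The two-stage idea described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's DWSD algorithm, whose formula is not given on this page: the three similarity functions, the fixed weights, the threshold, and every function name below are assumptions chosen for illustration, and the character-count cosine only stands in for a real embedding-based semantic measure.

```python
import math
from collections import Counter
from difflib import SequenceMatcher
from itertools import combinations


def lexical_similarity(a: str, b: str) -> float:
    """Jaccard overlap of character bigrams (usable for Chinese and English)."""
    grams_a = {a[i:i + 2] for i in range(len(a) - 1)} or {a}
    grams_b = {b[i:i + 2] for i in range(len(b) - 1)} or {b}
    return len(grams_a & grams_b) / len(grams_a | grams_b)


def structural_similarity(a: str, b: str) -> float:
    """Order-sensitive match ratio, a stand-in for an edit-distance measure."""
    return SequenceMatcher(None, a, b).ratio()


def toy_semantic_similarity(a: str, b: str) -> float:
    """Cosine of character-count vectors; a placeholder for embedding cosine."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[ch] * cb[ch] for ch in ca)
    norm = (math.sqrt(sum(v * v for v in ca.values()))
            * math.sqrt(sum(v * v for v in cb.values())))
    return dot / norm if norm else 0.0


def pairwise_disagreement(outputs: dict[str, str],
                          weights=(0.5, 0.3, 0.2)) -> float:
    """Mean over model pairs of (1 - weighted similarity).

    The weights (semantic, lexical, structural) are fixed here by assumption;
    the paper describes them as dynamic.
    """
    w_sem, w_lex, w_struct = weights
    scores = []
    for (_, a), (_, b) in combinations(outputs.items(), 2):
        similarity = (w_sem * toy_semantic_similarity(a, b)
                      + w_lex * lexical_similarity(a, b)
                      + w_struct * structural_similarity(a, b))
        scores.append(1.0 - similarity)
    return sum(scores) / len(scores) if scores else 0.0


def needs_adjudication(outputs: dict[str, str], threshold: float = 0.4) -> bool:
    """Escalate to the adjudicator model when disagreement exceeds a threshold."""
    return pairwise_disagreement(outputs) > threshold
```

With this shape, only samples on which the first-stage models diverge are sent to the more expensive adjudicator, which is the cost-saving motivation the abstract describes.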

REN Youqiang, ZHAO Hui, LI Wei, YUAN Huan, FAN Jingchao, ZHANG Jianhua, ZHOU Guomin. A Framework for Intelligent Auditing of Agricultural Science Metadata Based on Multi-Model Collaboration and Dynamic Weighted Adjudication[J]. Frontiers of Data and Computing, 2026, 8(2): 171-183, https://cstr.cn/32002.14.jfdc.CN10-1649/TP.2026.02.013.

Figures/Tables (13): Figures 1-11; Tables 1-2.

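Item	Configuration
CPU	2 × Intel(R) Xeon(R) Platinum 8360Y @ 2.40 GHz
GPU	4 × A6000, 192 GB VRAM
CUDA	12.4
LLM (temperature=0.1)	qwen2.5:32b-instruct
	qwen3:32b-q8_0
	llama3.1:8b-instruct-fp16
Adjudicator LLM (temperature=0.1)	qwen3:32b-q8_0
Adjudicator LLM (temperature=0.1)	DeepSeek-R1:32b-qwen-distill-q8_0
Ollama	0.6.6
Adjudication instruction	{ "final_instruction": "Instruction: Low disagreement detected. Synthesize the strengths of the similar suggestions below into a single, most complete final decision. In 'justification', state which model's suggestion you mainly adopted and whether you refined it." },
system_message	"""You are a top-tier, rigorous quality-control expert for agricultural science metadata. The core of your work is to ensure the quality of every metadata record. Your core principles are: 1. Accuracy 2. Completeness 3. Conformance to standards 4. Conciseness. (middle section omitted) Your final output must be a JSON array in which each object corresponds to the adjudication result of one task. All six tasks, T1-T6, must be handled. Each object must contain the following four fields: - 'taskId': the task ID (e.g., T1). - 'final_decision': the content of your final ruling. - 'status': your decision type, which must be either '已修正' (corrected) or '采纳最优' (best suggestion adopted). - 'justification': an object explaining your decision, which must contain two subfields, 'critique' (a brief assessment of the original suggestions) and 'rationale' (the reasoning behind your final decision)."""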
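
The output contract stated in system_message (a JSON array covering T1-T6, four required fields, two allowed status values, and a justification object with two subfields) lends itself to a mechanical check before a verdict is accepted. The following is a minimal sketch of such a check; the function name `validate_verdicts` and the wording of the error messages are hypothetical and not taken from the paper.

```python
import json

REQUIRED_TASKS = {f"T{i}" for i in range(1, 7)}        # T1-T6, per the prompt
ALLOWED_STATUS = {"已修正", "采纳最优"}                   # literal values the prompt requires
REQUIRED_FIELDS = ("taskId", "final_decision", "status", "justification")


def validate_verdicts(raw: str) -> list[str]:
    """Return a list of problems; an empty list means the reply conforms."""
    try:
        verdicts = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(verdicts, list):
        return ["top-level value must be a JSON array"]

    problems = []
    seen = set()
    for idx, item in enumerate(verdicts):
        if not isinstance(item, dict):
            problems.append(f"item {idx}: not a JSON object")
            continue
        task = item.get("taskId")
        if isinstance(task, str):
            seen.add(task)
        label = task if isinstance(task, str) else f"item {idx}"
        for field in REQUIRED_FIELDS:
            if field not in item:
                problems.append(f"{label}: missing field '{field}'")
        status = item.get("status")
        if "status" in item and (not isinstance(status, str)
                                 or status not in ALLOWED_STATUS):
            problems.append(f"{label}: status must be one of {sorted(ALLOWED_STATUS)}")
        just = item.get("justification")
        if "justification" in item and (
                not isinstance(just, dict)
                or not {"critique", "rationale"} <= just.keys()):
            problems.append(f"{label}: justification needs 'critique' and 'rationale'")

    missing = REQUIRED_TASKS - seen
    if missing:
        problems.append(f"missing tasks: {sorted(missing)}")
    return problems
```

A caller could, for example, re-prompt the adjudicator whenever the returned list is non-empty; that retry policy is likewise an assumption rather than something the page describes.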

Model/System	Precision	Recall	F1 score
Llama 3.1 8B	64.17%	65.00%	66.20%
Qwen2.5 14B	66.25%	66.67%	66.46%
Qwen2.5 32B	68.75%	80.00%	73.96%
Qwen3-32B (adjudication)	81.25%	93.33%	86.90%
DeepSeek-32B (adjudication)	82.92%	95.00%	89.20%
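
The gains reported in the abstract can be cross-checked against this table with a few lines of Python. The sketch assumes that the three rows without an adjudication label are the baselines and that the DeepSeek-32B adjudication row is the framework's reported result; the variable names are illustrative.

```python
# (precision, recall, F1) in percent, copied from the table above.
results = {
    "Llama 3.1 8B": (64.17, 65.00, 66.20),
    "Qwen2.5 14B": (66.25, 66.67, 66.46),
    "Qwen2.5 32B": (68.75, 80.00, 73.96),
    "Qwen3-32B": (81.25, 93.33, 86.90),      # adjudication row
    "DeepSeek-32B": (82.92, 95.00, 89.20),   # adjudication row
}

# Assumption: rows without adjudication are the single-model baselines.
baselines = ["Llama 3.1 8B", "Qwen2.5 14B", "Qwen2.5 32B"]
best_baseline = max(baselines, key=lambda name: results[name][2])  # highest F1

# Absolute differences in percentage points: (precision, recall, F1).
gains = tuple(
    round(framework - baseline, 2)
    for framework, baseline in zip(results["DeepSeek-32B"], results[best_baseline])
)

print(best_baseline)  # Qwen2.5 32B
print(gains)          # (14.17, 15.0, 15.24)
```

The differences match the abstract's 14.17, 15.00, and 15.24, which indicates that those figures are absolute percentage-point gains over Qwen2.5 32B rather than relative improvements.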
