Frontiers of Data and Computing (数据与计算发展前沿), 2021, Vol. 3, Issue (2): 60-67.

doi: 10.11871/jfdc.issn.2096-742X.2021.02.007

• Special Issue on Management Decision-Making and Intelligent Applications •

Tree Model Based Prediction of Financial Reimbursement Approval

LIU Chunyu1,2,*, SHI Zhuomin1, YU Jianjun1

  1. Computer Network Information Center, Chinese Academy of Sciences, Beijing 100190, China
  2. University of Chinese Academy of Sciences, Beijing 100049, China
  • Received: 2021-03-05 Online: 2021-04-20 Published: 2021-05-18
  • Contact: LIU Chunyu
  • About the authors:
    LIU Chunyu is a master's student at the Computer Network Information Center, Chinese Academy of Sciences, and the University of Chinese Academy of Sciences. Her research interests include data mining, machine learning, and user behavior analysis. In this paper, she is responsible for data analysis, model construction, experiment design, and paper writing. E-mail: liuchunyu@cnic.cn
    SHI Zhuomin holds a master's degree and is a senior engineer at the Computer Network Information Center, Chinese Academy of Sciences. Her research interests include big data analysis, management informatization, and intelligent financial applications. In this paper, she is responsible for data analysis. E-mail: zmshi@cnic.cn
    YU Jianjun is a researcher, doctoral supervisor, and Deputy Director of the Management Informatization Department at the Computer Network Information Center, Chinese Academy of Sciences. His research interests include big data analysis, collaborative recommendation, and cloud computing. He currently works on technologies for the new-generation ARP. In this paper, he is responsible for overall editing of the manuscript. E-mail: yujj@cnic.ac.cn

Abstract:

[Objective] Irregular reimbursement submissions at research institutes of the Chinese Academy of Sciences (CAS) lead to repeated rounds of approval. This paper studies how predicting approval outcomes can improve the efficiency of reimbursement approval. [Methods] The financial reimbursement approval process is modeled at the business level to produce desensitized, machine-readable approval data. Variable features and labels are constructed according to actual business characteristics, and a random forest is used to analyze the importance of the reconstructed variables. Four classification algorithms (decision tree, random forest, gradient random tree, and XGBoost) are then used to predict approval results. [Results] The random forest importance analysis supports the reliability of the reconstructed variables for predicting approval results. Each of the four tree models induces a set of classification rules from the reconstructed training set, the rules are applied to reimbursement forms that have not yet been approved, and the best of the four models is selected according to the prediction results. [Conclusions] Based on tree models, this paper uses a random forest to help identify the key factors that affect reimbursement approval results and builds an approval prediction model with tree-based algorithms, which provides an algorithmic basis for applying tree models to reimbursement approval prediction.

Key words: variable reconstruction, data mining, tree model, approval prediction
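
The importance analysis and rule-based prediction summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the three feature names (`amount`, `missing_attachments`, `days_since_expense`), the synthetic approval rule, and all hyperparameters are assumptions made only for illustration. The code grows a small random forest in pure Python, accumulates the Gini impurity decrease per feature as an importance score, and predicts held-out forms by majority vote.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels, features):
    """Best (gain, feature, threshold) over the candidate features, or None."""
    parent, n, best = gini(labels), len(rows), None
    for f in features:
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            gain = parent - (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or gain > best[0]:
                best = (gain, f, t)
    return best

def build_tree(rows, labels, rng, importance, depth=0, max_depth=4, max_features=2):
    """Grow one tree. Leaves are labels; inner nodes are (feature, threshold, left, right)."""
    majority = Counter(labels).most_common(1)[0][0]
    if depth == max_depth or len(set(labels)) == 1:
        return majority
    candidates = rng.sample(range(len(rows[0])), max_features)  # random feature subset
    split = best_split(rows, labels, candidates)
    if split is None or split[0] <= 1e-12:
        return majority
    gain, f, t = split
    importance[f] += gain * len(rows)  # impurity decrease weighted by node size
    parts = ([(r, y) for r, y in zip(rows, labels) if r[f] <= t],
             [(r, y) for r, y in zip(rows, labels) if r[f] > t])
    left, right = (build_tree([r for r, _ in p], [y for _, y in p], rng,
                              importance, depth + 1, max_depth, max_features)
                   for p in parts)
    return (f, t, left, right)

def predict_tree(node, row):
    """Follow the learned split rules down to a leaf label."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node

def fit_forest(rows, labels, n_trees=25, seed=0):
    """Train trees on bootstrap samples; return them with normalized importances."""
    rng = random.Random(seed)
    importance = [0.0] * len(rows[0])
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        trees.append(build_tree([rows[i] for i in idx], [labels[i] for i in idx],
                                rng, importance))
    total = sum(importance) or 1.0
    return trees, [v / total for v in importance]

def predict_forest(trees, row):
    """Majority vote over all trees."""
    return Counter(predict_tree(t, row) for t in trees).most_common(1)[0][0]

def make_data(n, rng):
    """Synthetic forms (assumed rule): approved (1) iff nothing is missing and amount <= 50."""
    rows, labels = [], []
    for _ in range(n):
        amount = rng.randint(1, 100)
        missing = rng.choice([0, 0, 1, 2])
        days = rng.randint(1, 60)  # noise feature, unrelated to the label
        rows.append((amount, missing, days))
        labels.append(1 if missing == 0 and amount <= 50 else 0)
    return rows, labels

FEATURES = ["amount", "missing_attachments", "days_since_expense"]
data_rng = random.Random(42)
train_x, train_y = make_data(150, data_rng)
test_x, test_y = make_data(100, data_rng)
trees, importance = fit_forest(train_x, train_y)
accuracy = sum(predict_forest(trees, r) == y for r, y in zip(test_x, test_y)) / len(test_y)
for name, score in sorted(zip(FEATURES, importance), key=lambda p: -p[1]):
    print(f"{name}: {score:.3f}")
print(f"held-out accuracy: {accuracy:.2f}")
```

In practice one would more likely rely on library implementations, for example scikit-learn's `RandomForestClassifier` with its `feature_importances_` attribute, and compare the candidate tree models on the same held-out split.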