Frontiers of Data and Computing, 2022, Vol. 4, Issue (2): 39-49.

doi: 10.11871/jfdc.issn.2096-742X.2022.02.004

• Special Issue: Advanced Intelligent Computing Platforms and Applications

Graph Matching Text Classification Based on KG

LAN Ge, WANG Jinyu, SUN Yufei*, ZHANG Yuzhi

  1. College of Software, Nankai University, Tianjin 300350, China
  • Received: 2022-02-06 Online: 2022-04-20 Published: 2022-04-30
  • Corresponding author: SUN Yufei
  • About the authors:
    LAN Ge is currently a Ph.D. student in the College of Software, Nankai University, Tianjin, China. Her current research interests include text classification and knowledge graphs. In this paper, she wrote the abstract, classification method, and experiment sections.
    E-mail: grettelan@mail.nankai.edu.cn
    WANG Jinyu is currently a master's student in the College of Software, Nankai University, Tianjin, China. Her current research interests include knowledge graphs. In this paper, she wrote the related work and classification method sections.
    E-mail: 2120210536@mail.nankai.edu.cn
    SUN Yufei, Ph.D., is a professor at the College of Software, Nankai University. Her research interests include deep learning, heterogeneous computing, and artificial intelligence. In this paper, she was mainly responsible for the final compilation, revision, and review of the paper.
    E-mail: yufei_sun@sina.com
    ZHANG Yuzhi is a chair professor and the Dean of the College of Software at Nankai University. His research interests include artificial intelligence, pattern recognition, and natural language processing. In this paper, he was mainly responsible for the revision and supervision of the paper.
    E-mail: zyz@nankai.edu.cn
  • Funding:
    National Key Research and Development Program of China (2021YFB0300104)

Abstract:

[Objective] In natural language processing (NLP), text classification is a fundamental task that supports many downstream applications, such as article retrieval, recommendation systems, and question answering. Inspired by the role of knowledge graphs (KGs) in text reasoning, this paper explores how to apply KGs to text classification, using their reasoning capability to improve classification while reducing the dependence on annotated training data. [Methods] This paper proposes a graph matching text classification algorithm based on KGs. Specifically, a KG is constructed for each class according to the classification target. The model uses the semantic and connection (structure) information in these class KGs to reason about the relevance of a text to each class, and then classifies the text by combining the evaluations from all KGs. [Conclusions] To verify the effectiveness of the proposed method, this paper builds the KGs required for classification and conducts experiments on two datasets. The experimental results show that, when a certain proportion of inputs is allowed to be rejected, the model achieves high accuracy, which makes the method more practical to apply.

Key words: text classification, knowledge graph, graph matching, knowledge graph construction, information extraction
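
The matching-and-rejection idea summarized in the abstract can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and is not taken from the paper: the toy class KGs, the names `match_score` and `classify`, the scoring rule (matched entities plus matched edges), and the `min_score` rejection threshold. The paper's model reasons over KG semantics and connections in its own way; this sketch only shows the general shape of scoring a text against one KG per class, combining the scores, and rejecting when the evidence is weak.

```python
from typing import Dict, Optional, Set

# Hypothetical toy representation: a class KG maps each entity
# to the set of entities it is connected to.
ClassKG = Dict[str, Set[str]]


def match_score(tokens: Set[str], kg: ClassKG) -> float:
    """Score a text against one class KG (assumed rule, for illustration only)."""
    # Semantic evidence: KG entities that are mentioned in the text.
    matched = tokens & set(kg)
    # Structural evidence: KG edges whose two endpoints both occur in the text.
    edges = {frozenset((u, v)) for u in matched for v in kg[u] if v in tokens}
    return float(len(matched) + len(edges))


def classify(text: str, kgs: Dict[str, ClassKG], min_score: float = 2.0) -> Optional[str]:
    """Return the best-matching class label, or None to reject the text."""
    tokens = set(text.lower().split())
    scores = {label: match_score(tokens, kg) for label, kg in kgs.items()}
    ranked = sorted(scores.values(), reverse=True)
    # Reject when the best score is too low or tied with the runner-up.
    if ranked[0] < min_score or (len(ranked) > 1 and ranked[0] == ranked[1]):
        return None
    return max(scores, key=lambda label: scores[label])


# Hypothetical class KGs, invented for this example.
TOY_KGS: Dict[str, ClassKG] = {
    "sports": {
        "football": {"goal", "league"},
        "goal": {"football"},
        "league": {"football"},
    },
    "finance": {
        "stock": {"market"},
        "market": {"stock", "bank"},
        "bank": {"market"},
    },
}

if __name__ == "__main__":
    print(classify("the football league saw a late goal", TOY_KGS))
    print(classify("the weather is nice today", TOY_KGS))
```

Returning `None` corresponds to the "allowing some data to be rejected" setting: raising `min_score` in this sketch rejects more texts in exchange for more confident decisions on the texts that are classified.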