Frontiers of Data and Computing ›› 2023, Vol. 5 ›› Issue (4): 86-100.
CSTR: 32002.14.jfdc.CN10-1649/TP.2023.04.008
doi: 10.11871/jfdc.issn.2096-742X.2023.04.008
• Technology and Application •
LI JunFei1,2, XU LiMing1,2, WANG Yang1,2,*, WEI Xin1
Received: 2022-01-20
Online: 2023-08-20
Published: 2023-08-23
LI JunFei, XU LiMing, WANG Yang, WEI Xin. Review of Automatic Citation Classification Based on Deep Learning Technology[J]. Frontiers of Data and Computing, 2023, 5(4): 86-100, https://cstr.cn/32002.14.jfdc.CN10-1649/TP.2023.04.008.
Table 1  Citation Function Classification Schemes
Dataset | Samples | Classification labels (proportion)
---|---|---
Teufel et al. (2006b) | 2829 | Weak (3.1%), CoCoGM (3.9%), CoCoR0 (0.8%), CoCo (1.0%), CoCoXY (2.9%), PBas (1.5%), PUse (15.8%), PModi (1.6%), PMot (2.2%), PSim (3.8%), PSup (1.1%), Neut (62.7%)
Ulrich (2011) | 1768 | Idea (23.80%), Basis (7.18%), Background (65.04%), Compare (3.95%)
Li et al. (2013) | 6355 | Based on (2.8%), Corroboration (3.6%), Discover (12.3%), Positive (0.1%), Significant (0.6%), Standard (0.2%), Supply (1.2%), Contrast (0.6%), Co-citation (33.3%)
Hernandez-Alvarez et al. (2016) | 2120 | Use (49.8%), Background (37.4%), Comparison (5.3%), Critique (7.8%)
Matthew et al. (2018) | 3083 | Background (51.8%), Uses (18.5%), Compares (17.5%), Motivation (4.9%), Continuation (3.7%), Future (3.6%)
Cohan et al. (2019) | 11020 | Background (58%), Method (29%), Result (13%)
Zhu et al. (2015) | 3143 | Influential, Non-influential
Valenzuela et al. (2015) | 450 | Important, Incidental
Jha et al. (2016) | 3271 | Criticizing (16.3%), Comparison (8.1%), Use (18.0%), Substantiating (8%), Basis (5.3%), Neutral (44.3%)
Table 4  Classification Performance Based on Convolutional Neural Network Models
Model | Precision (%) | Recall (%) | F1 (%) | Dataset
---|---|---|---|---
CNN General emb | 79.9 | 68.2 | 73.6 | Jha et al. (2016)
CNN CORE emb | 80.8 | 68.8 | 74.3 |
CNN ACL emb | 76.7 | 68.4 | 72.3 |
SciBERT-BiGRU-Multi-CNN | 84.68 | 81.59 | 83.11 |
SciBERT-Multi-BiGRU-CNN-Attention | 85.58 | 82.75 | 84.14 |
SciBERT-BiGRU-Multi-CNN-Attention | 86.67 | 83.24 | 84.92 |
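The CNN models in Table 4 follow the standard sentence-classification recipe: filters of a fixed token width slide over the word-embedding matrix of a citation context, and each filter's feature map is max-pooled into a single value, yielding one feature per filter. A minimal numpy sketch of that convolution-plus-pooling step (the dimensions and random inputs are illustrative toy values, not taken from the surveyed papers):

```python
import numpy as np

def conv_maxpool(embeddings, kernel, width):
    """Slide one filter of `width` tokens across the embedding
    matrix, then max-pool the feature map into a single value."""
    n, _ = embeddings.shape
    feats = [float(embeddings[i:i + width].reshape(-1) @ kernel)
             for i in range(n - width + 1)]
    return max(feats)

rng = np.random.default_rng(0)
sentence = rng.normal(size=(10, 4))    # 10 tokens, 4-dim embeddings (toy sizes)
filters = rng.normal(size=(3, 3 * 4))  # 3 filters spanning 3 tokens each
feature_vec = np.array([conv_maxpool(sentence, f, 3) for f in filters])
print(feature_vec.shape)  # one pooled feature per filter
```

In the surveyed systems this pooled feature vector is fed to a softmax layer over the citation-function labels; varying only the embeddings (General, CORE, ACL) is what separates the first three rows of Table 4.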
Table 5  Classification Performance Based on Recurrent Neural Network Models
Model | Samples | Task type | Classification labels (proportion) | F1 (%)
---|---|---|---|---
LSTMs | 3422 | Single-label | Background (30.5%), Method (23.9%), Results/findings (45.3%), Don't know (0.1%) | 66.42
LSTMs + Global Attention | | | | 68.61
BiLSTMs | | | | 67.88
BiLSTMs + Global Attention | | | | 68.61
BiLSTM-Attn | 11020 | Single-label | Background (58%), Method (29%), Result comparison (13%) | 77.2
BiLSTM-Attn w/ ELMo | | | | 82.6
BiLSTM-Attn + section title scaffold | | | | 77.8
BiLSTM-Attn + citation worthiness scaffold | | | | 78.1
BiLSTM-Attn + both scaffolds | | | | 79.1
BiLSTM-Attn w/ ELMo + both scaffolds | | | | 84
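The "Global Attention" variants in Table 5 do not classify from the last hidden state alone; they score every BiLSTM hidden state, normalize the scores with a softmax, and use the weighted sum of all states as the sentence representation. A minimal numpy sketch of that pooling step (the scoring vector and dimensions are illustrative assumptions, not the published architectures):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def global_attention_pool(hidden, score_vec):
    """Score every hidden state, normalize with softmax, and return
    the attention-weighted sum (context vector) plus the weights."""
    alpha = softmax(hidden @ score_vec)  # one weight per time step
    return alpha @ hidden, alpha

rng = np.random.default_rng(1)
H = rng.normal(size=(8, 6))  # 8 time steps of 6-dim BiLSTM states (toy sizes)
w = rng.normal(size=6)       # learned scoring vector (here random)
context, alpha = global_attention_pool(H, w)
print(context.shape)  # same dimensionality as one hidden state
```

Because the weights sum to 1, the context vector stays in the same space as the hidden states while letting the classifier focus on the tokens most indicative of citation intent.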
Table 6  Classification Performance Based on Pre-Trained Models
Model | ACL-ARC Precision (%) | ACL-ARC Recall (%) | ACL-ARC F1 (%) | SciCite Precision (%) | SciCite Recall (%) | SciCite F1 (%)
---|---|---|---|---|---|---
BERT-KMeans | * | * | * | 81 | 82 | 81
BERT-HDBSCAN | * | * | * | 77 | 79 | 78
BASE-BERT | * | * | 63.91 | * | * | 84.85
ELMo | * | * | 67.9 | * | * | 84
ALBERT | * | * | * | * | * | 82.86
SciBERT | * | * | 70.98 | * | * | 85.49
XLNet | * | * | * | * | * | 88.93

(* = not reported)
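The Precision, Recall, and F1 columns in Tables 4-6 are per-class scores averaged over the label set (macro averaging). A self-contained sketch of how such scores are computed from gold and predicted citation labels (the toy labels below are illustrative, not results from any surveyed dataset):

```python
def precision_recall_f1(gold, pred, label):
    """Per-class precision, recall, and F1 for one citation label."""
    tp = sum(1 for g, p in zip(gold, pred) if g == p == label)
    fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
    fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
    prec = tp / (tp + fp) if tp + fp else 0.0
    rec = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
    return prec, rec, f1

def macro_f1(gold, pred):
    """Unweighted mean of per-class F1 over all gold labels."""
    labels = sorted(set(gold))
    return sum(precision_recall_f1(gold, pred, l)[2] for l in labels) / len(labels)

gold = ["Background", "Method", "Result", "Background", "Method"]
pred = ["Background", "Method", "Background", "Background", "Result"]
print(round(macro_f1(gold, pred), 3))  # prints 0.489
```

Macro averaging weights every class equally, which matters for the skewed label distributions in Table 1, where a majority class such as Background can dominate accuracy-style metrics.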
[1] | WEI X, WANG Y. Research and Practice on Evaluation System of Science and Technology Competitiveness[J]. Frontiers of Data and Computing, 2021, 3(1): 74-67. |
[2] | WANG S W, XU Y J, CHEN Y P, et al. Influence Mechanism of Code-Sharing on Paper Citations: An Empirical Analysis on Computer Science Field[J]. Frontiers of Data & Computing, 2021, 3(2): 93-102. |
[3] | HJORLAND B, NIELSEN L K. Subject Access Points in Electronic Retrieval[J]. Annual Review of Information Science and Technology (ARIST), 2001, 35: 249-298. |
[4] | HIRSCH J E. An index to quantify an individual's scientific research output[J]. Proceedings of the National Academy of Sciences of the United States of America, 2005, 102(46): 16569-16572. |
[5] | CHEN Yunwei. A Review of Metric Methods for Science and Technology Evaluation[J]. Journal of Library and Information Science in Agriculture, 2020, 32(8): 8. |
[6] | VOOS H, DAGAEV K S. Are All Citations Equal? Or, Did We Op. Cit. Your Idem?[J]. Journal of Academic Librarianship, 1976, 1(6): 19-21. |
[7] | HERLACH G. Can retrieval of information from citation indexes be simplified? Multiple mention of a reference as a characteristic of the link between cited and citing article[J]. Journal of the American Society for Information Science, 1978, 29(6): 308-310. |
[8] | SMALL H G. Cited Documents as Concept Symbols[J]. Social Studies of Science, 1978, 8(3): 327-340. doi: 10.1177/030631277800800305 |
[9] | GARFIELD E. Can citation indexing be automated[C]// Statistical association methods for mechanized documentation, symposium proceedings, 1965, 269: 189-192. |
[10] | MORAVCSIK M J, MURUGESAN P. Some results on the function and quality of citations[J]. Social Studies of Science, 1975, 5(1): 86-92. doi: 10.1177/030631277500500106 |
[11] | TEUFEL S, SIDDHARTHAN A, TIDHAR D. Automatic classification of citation function[C]// Proceedings of the 2006 conference on empirical methods in natural language processing, 2006: 103-110. |
[12] | ULRICH S. Ensemble-style Self-training on Citation Classification[C]// Proceedings of IJCNLP, 2011: 623-631. |
[13] | LI X, HE Y, MEYERS A, et al. Towards fine-grained citation function classification[C]// Proceedings of the International Conference Recent Advances in Natural Language Processing RANLP 2013, 2013: 402-407. |
[14] | HERNANDEZ-ALVAREZ M, GOMEZ J M. Survey about citation context analysis: Tasks, techniques, and resources[J]. Natural Language Engineering, 2016, 22(3): 327-349. doi: 10.1017/S1351324915000388 |
[15] | PETERS M E, NEUMANN M, IYYER M, et al. Deep contextualized word representations[J]. arXiv preprint arXiv:1802.05365, 2018. |
[16] | COHAN A, AMMAR W, VAN ZUYLEN M, et al. Structural Scaffolds for Citation Intent Classification in Scientific Publications[C]// Proceedings of NAACL-HLT, 2019: 3586-3596. |
[17] | ZHU X, TURNEY P, LEMIRE D, et al. Measuring academic influence: Not all citations are equal[J]. Journal of the Association for Information Science and Technology, 2015, 66(2): 408-427. doi: 10.1002/asi.2015.66.issue-2 |
[18] | VALENZUELA M, HA V, ETZIONI O. Identifying meaningful citations[C]// Workshops at the Twenty-Ninth AAAI Conference on Artificial Intelligence, 2015: 21-26. |
[19] | JHA R, JBARA A A, QAZVINIAN V, et al. NLP-driven citation analysis for scientometrics[J]. Natural Language Engineering, 2016, 1(1): 1-38. doi: 10.1017/S1351324900000036 |
[20] | GARZONE M A. Automated classification of citations using linguistic semantic grammars[D]. The University of Western Ontario (Canada), 1997. |
[21] | NANBA H, KANDO N, OKUMURA M. Classification of research papers using citation links and citation types: Towards automatic review article generation[J]. Advances in Classification Research Online, 2000, 11(1): 117-134. |
[22] | PHAM S B, HOFFMANN A. A new approach for scientific citation classification using cue phrases[C]// Australasian Joint Conference on Artificial Intelligence, Springer, Berlin, Heidelberg, 2003: 759-771. |
[23] | COVER T, HART P. Nearest neighbor pattern classification[J]. IEEE Transactions on Information Theory, 1967, 13(1): 21-27. doi: 10.1109/TIT.1967.1053964 |
[24] | ANGROSH M A, CRANEFIELD S, STANGER N. Context identification of sentences in related work sections using a conditional random field: towards intelligent digital libraries[C]// Proceedings of the 10th annual joint conference on Digital libraries, 2010: 293-302. |
[25] | LAFFERTY J, MCCALLUM A, PEREIRA F C N. Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data[C]// Proceedings of ICML-01, 2001: 282-289. |
[26] | YIN Li, GUO Lu, LI Xufen. A Citation Classification Model Based on Citation Function and Citation Polarity[J]. Journal of Intelligence, 2018, 37(7): 139-145. |
[27] | CORTES C, VAPNIK V. Support-Vector Networks[J]. Machine Learning, 1995, 20(3): 273-297. |
[28] | BAI Han. Research on Bayesian Classification Based on Weighted Citations[D]. Nanjing: Nanjing University, 2016. |
[29] | LECUN Y, BOTTOU L, BENGIO Y, et al. Gradient-Based Learning Applied to Document Recognition[J]. Proceedings of the IEEE, 1998, 86(11): 2278-2324. doi: 10.1109/5.726791 |
[30] | ELMAN J L. Finding Structure in Time[J]. Cognitive Science, 1990, 14(2): 179-211. doi: 10.1207/s15516709cog1402_1 |
[31] | HOCHREITER S, SCHMIDHUBER J. Long Short-Term Memory[J]. Neural Computation, 1997, 9(8): 1735-1780. doi: 10.1162/neco.1997.9.8.1735 |
[32] | VASWANI A, SHAZEER N, PARMAR N, et al. Attention Is All You Need[J]. arXiv preprint arXiv:1706.03762, 2017. doi: 10.48550/arXiv.1706.03762 |
[33] | CHEN Y. Convolutional neural network for sentence classification[D]. University of Waterloo, 2015. |
[34] | GRAVE E. fastText[EB/OL]. https://github.com/facebookresearch/fastText. |
[35] | JOULIN A, GRAVE E, BOJANOWSKI P, et al. Bag of Tricks for Efficient Text Classification[C]// Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers, 2017: 427-431. |
[36] | JOULIN A, GRAVE E, BOJANOWSKI P, et al. FastText.zip: Compressing text classification models[J]. arXiv preprint arXiv:1612.03651, 2016. |
[37] | YIN W, KANN K, YU M, et al. Comparative study of CNN and RNN for natural language processing[J]. arXiv preprint arXiv:1702.01923, 2017. |
[38] | LAUSCHER A, GLAVAŠ G, PONZETTO S P, et al. Investigating convolutional networks and domain-specific embeddings for semantic classification of citations[C]// Proceedings of the 6th international workshop on mining scientific publications, 2017: 24-28. |
[39] | ZHOU Wenyuan, WANG Mingyang, JING Yu. Automatic Classification of Citation Sentiment and Citation Purpose Based on the AttentionSBGMC Model[J]. Data Analysis and Knowledge Discovery, 2021, 5(12): 12. |
[40] | CHO K, VAN MERRIËNBOER B, BAHDANAU D, et al. On the Properties of Neural Machine Translation: Encoder-Decoder Approaches[C]// Proceedings of SSST-8, Eighth Workshop on Syntax, Semantics and Structure in Statistical Translation, 2014: 103-111. |
[41] | SUTSKEVER I, VINYALS O, LE Q V. Sequence to Sequence Learning with Neural Networks[C]// Advances in Neural Information Processing Systems (NIPS), 2014: 3104-3112. |
[42] | BOWMAN S, ANGELI G, POTTS C, et al. A large annotated corpus for learning natural language inference[C]// Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, 2015: 632-642. |
[43] | MUNKHDALAI T, LALOR J P, YU H. Citation analysis with neural attention models[C]// Proceedings of the Seventh International Workshop on Health Text Mining and Information Analysis, 2016: 69-77. |
[44] | HASSAN S U, IMRAN M, IQBAL S, et al. Deep context of citations using machine-learning models in scholarly full-text articles[J]. Scientometrics, 2018, 117(3): 1645-1662. doi: 10.1007/s11192-018-2944-y |
[45] | BREIMAN L. Random forests[J]. Machine Learning, 2001, 45(1): 5-32. doi: 10.1023/A:1010933404324 |
[46] | PRESTER J, WAGNER G, SCHRYEN G, et al. Classifying the ideational impact of information systems review articles: A content-enriched deep learning approach[J]. Decision Support Systems, 2021, 140: 113432. doi: 10.1016/j.dss.2020.113432 |
[47] | PENNINGTON J, SOCHER R, MANNING C D. Glove: Global vectors for word representation[C]// Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP), 2014: 1532-1543. |
[48] | JURGENS D, KUMAR S, HOOVER R, et al. Measuring the Evolution of a Scientific Field through Citation Frames[J]. Transactions of the Association for Computational Linguistics, 2018, 6: 391-406. |
[49] | NICHOLSON J M, MORDAUNT M, LOPEZ P, et al. scite: a smart citation index that displays the context of citations and classifies their intent using deep learning[J]. Quantitative Science Studies, 2021, 2(3): 882-898. doi: 10.1162/qss_a_00146 |
[50] | BELTAGY I, LO K, COHAN A. SciBERT: A Pretrained Language Model for Scientific Text[C]// Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), 2019: 3615-3620. |
[51] | GE Y, DINH L, LIU X, et al. BACO: A Background Knowledge- and Content-Based Framework for Citing Sentence Generation[C]// Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), 2021: 1466-1478. |
[52] | LAN Z, CHEN M, GOODMAN S, et al. ALBERT: A lite BERT for self-supervised learning of language representations[J]. arXiv preprint arXiv:1909.11942, 2019. |
[53] | YANG Z, DAI Z, YANG Y, et al. XLNet: generalized autoregressive pretraining for language understanding[C]// Proceedings of the 33rd International Conference on Neural Information Processing Systems, 2019: 5753-5763. |
[54] | ZHUANG L, WAYNE L, YA S, et al. A Robustly Optimized BERT Pre-training Approach with Post-training[C]// Proceedings of the 20th Chinese National Conference on Computational Linguistics, 2021: 1218-1227. |
[55] | LIU G H, YANG J Y. Image retrieval based on the texton co-occurrence matrix[J]. Pattern Recognition, 2008, 41(12): 3521-3527. doi: 10.1016/j.patcog.2008.06.010 |
[56] | ESTER M, KRIEGEL H P, SANDER J, et al. A density-based algorithm for discovering clusters in large spatial databases with noise[C]// KDD, 1996, 96(34): 226-231. |
[57] | DEVLIN J, CHANG M W, LEE K, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding[C]// Proceedings of NAACL-HLT, 2019: 4171-4186. |
[58] | LI B, ZHU Z, THOMAS G, et al. How is BERT surprised? Layerwise detection of linguistic anomalies[C]// Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), 2021: 4215-4228. |
[59] | TUAROB S, KANG S W, WETTAYAKORN P, et al. Automatic classification of algorithm citation functions in scientific literature[J]. IEEE Transactions on Knowledge and Data Engineering, 2019, 32(10): 1881-1896. |
[60] | MERCIER D, RIZVI S T R, RAJASHEKAR V, et al. ImpactCite: An XLNet-based method for Citation Impact Analysis[J]. arXiv preprint arXiv:2005.06611, 2020. |
[61] | ROMAN M, SHAHID A, KHAN S, et al. Citation intent classification using word embedding[J]. IEEE Access, 2021, 9: 9982-9995. doi: 10.1109/ACCESS.2021.3050547 |
[62] | CHEN H, NGUYEN H. Fine-tuning Pre-trained Contextual Embeddings for Citation Content Analysis in Scholarly Publication[J]. arXiv preprint arXiv:2009.05836, 2020. |
[63] | LEI T. When Attention Meets Fast Recurrence: Training Language Models with Reduced Compute[C]// Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021: 7633-7648. |
[64] | ULRICH S. Ensemble-style Self-training on Citation Classification[C]// Proceedings of the 5th International Joint Conference on Natural Language Processing, 2011: 623-631. |
[65] | JHA R, JBARA A A, QAZVINIAN V, et al. NLP-driven citation analysis for scientometrics[J]. Natural Language Engineering, 2017, 23(1): 93-130. doi: 10.1017/S1351324915000443 |
[66] | LAUSCHER A, KO B, KUEHL B, et al. MultiCite: Modeling realistic citations requires moving beyond the single-sentence single-label setting[J]. arXiv preprint arXiv:2107.00414, 2021. |
[67] | XIE Y, SUN Y, BERTINO E. Learning domain semantics and cross-domain correlations for paper recommendation[C]// Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2021: 706-715. |