Frontiers of Data and Computing ›› 2022, Vol. 4 ›› Issue (2): 63-73.

doi: 10.11871/jfdc.issn.2096-742X.2022.02.006

• Special Issue: Advanced Intelligent Computing Platform and Application •

Research on Chinese Name Recognition Based on Deep Learning and Coreference Resolution

CHEN Yu, XUAN Yuhang, ZHANG Yuzhi*

  1. School of Software, Nankai University, Tianjin 300450, China
  • Received: 2022-02-13 Online: 2022-04-20 Published: 2022-04-30
  • Contact: ZHANG Yuzhi E-mail: 2320200003@mail.nankai.edu.cn; xyh2575179890@163.com; zyz@nankai.edu.cn

Abstract:

[Objective] Named entity recognition (NER) is a basic task in natural language processing. Entities include person names, place names, and organization names. Unlike other entity types, person names are tied to job titles, job changes, and personal pronouns. In person name recognition, the two main difficulties are the incompleteness of person-name corpora and ambiguous references to persons. Based on this observation, this paper proposes a sequence tagging method that integrates coreference resolution to improve name recognition. The method alleviates the problem of incomplete name corpora, resolves ambiguous personal pronouns, and reduces the manual labor required. [Methods] First, data augmentation based on job changes is used to address the shortage of labeled data in practical applications. Then, to better learn contextual features, the approach combines the pre-trained language model BERT with a bidirectional long short-term memory network (BiLSTM) and uses a conditional random field (CRF) to model the dependencies between labels in the sequence. Finally, a coreference resolution algorithm is applied to the personal pronouns in the text to further improve name recognition. [Results] Experimental results on both public datasets and the datasets proposed in this paper demonstrate the effectiveness of the proposed method.

Key words: named entity recognition, coreference resolution, BERT, long short-term memory network
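
Illustrative sketch: the abstract describes a CRF layer that labels the sequence on top of BERT and BiLSTM features. The Python snippet below is a minimal sketch of how such a layer could decode a tag sequence with the Viterbi algorithm and then read person names off the resulting tags. It is not the authors' implementation. The BIO tag set (`O`, `B-PER`, `I-PER`), the function names `viterbi` and `extract_names`, and all scores are assumptions made for illustration; in a real system the emission scores would come from the neural encoder and the transition scores would be learned.

```python
from typing import Dict, List, Tuple

# Assumed BIO tag set for person names (not taken from the paper).
TAGS = ["O", "B-PER", "I-PER"]


def viterbi(emissions: List[Dict[str, float]],
            transitions: Dict[Tuple[str, str], float]) -> List[str]:
    """Return the highest-scoring tag path for a linear-chain CRF.

    emissions[t][tag] is the score of `tag` at position t;
    transitions[(prev, cur)] is the score of moving from prev to cur
    (missing pairs count as 0.0).
    """
    if not emissions:
        return []
    # best[t][tag]: best score of any path that ends in `tag` at position t.
    best = [{tag: emissions[0][tag] for tag in TAGS}]
    back = []  # back[t - 1][tag]: best previous tag for `tag` at position t
    for t in range(1, len(emissions)):
        scores, pointers = {}, {}
        for cur in TAGS:
            prev = max(TAGS, key=lambda p: best[-1][p] + transitions.get((p, cur), 0.0))
            scores[cur] = best[-1][prev] + transitions.get((prev, cur), 0.0) + emissions[t][cur]
            pointers[cur] = prev
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final tag.
    path = [max(TAGS, key=lambda tag: best[-1][tag])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


def extract_names(chars: List[str], tags: List[str]) -> List[str]:
    """Collect character spans tagged B-PER followed by I-PER."""
    names, current = [], []
    for ch, tag in zip(chars, tags):
        if tag == "B-PER":
            if current:
                names.append("".join(current))
            current = [ch]
        elif tag == "I-PER" and current:
            current.append(ch)
        else:
            if current:
                names.append("".join(current))
            current = []
    if current:
        names.append("".join(current))
    return names


# Made-up scores for the sentence "张三任校长" ("Zhang San serves as principal").
chars = list("张三任校长")
emissions = [
    {"O": 0.1, "B-PER": 2.0, "I-PER": 0.5},
    {"O": 0.5, "B-PER": 0.3, "I-PER": 1.5},
    {"O": 2.0, "B-PER": 0.1, "I-PER": 0.4},
    {"O": 1.8, "B-PER": 0.2, "I-PER": 0.1},
    {"O": 1.9, "B-PER": 0.1, "I-PER": 0.3},
]
# Penalize the invalid move O -> I-PER; all other transitions score 0.0.
transitions = {("O", "I-PER"): -10.0}

path = viterbi(emissions, transitions)
names = extract_names(chars, path)
print(path, names)
```

With these hand-picked scores the decoded path tags the first two characters as a person name. The transition penalty shows what the CRF contributes beyond per-character classification: it keeps an `I-PER` tag from following `O` even when the emission score alone would favor it.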