Frontiers of Data and Computing, 2021, Vol. 3, Issue (2): 103-111. doi: 10.11871/jfdc.issn.2096-742X.2021.02.012

• Technology and Application •

Automatic Summarization of e-Government Documents Based on Sentence Vector Representation and Fuzzy C-Means

QI Rongling1,2, JIAO Wenbin1,*, WANG Yang1

  1. Computer Network Information Center, Chinese Academy of Sciences, Beijing 100190, China
  2. University of Chinese Academy of Sciences, Beijing 100049, China
  • Received: 2020-10-23 Online: 2021-04-20 Published: 2021-05-18
  • Contact: JIAO Wenbin E-mail: qirongling@cnic.cn; wbjiao@cnic.cn; wangyang@cnic.cn

Abstract:

[Objective] With the development of "Internet + E-Government", China has paid increasing attention to the development of electronic information technology. Government decision-makers, managers, information workers and researchers urgently need to obtain large amounts of e-government information quickly and effectively to guide informatization evaluation and decision-making. This paper studies an automatic summarization algorithm for e-government documents. [Methods] Based on the characteristics of e-government texts, this paper proposes an algorithm that combines Doc2Vec sentence vector representation with fuzzy c-means clustering to automatically generate summaries of e-government documents. The algorithm not only considers the correlation between sentences, but also assigns each sentence a weight, derived from the characteristics of the article, that expresses its importance as a candidate summary sentence. [Results] Experiments show that, compared with the commonly used k-means algorithm and more complex deep learning algorithms, the proposed algorithm achieves better results in automatically generating summaries of e-government documents. [Conclusions] The proposed algorithm is effective for automatic document summarization in the field of e-government.

Key words: automatic abstract, e-government, Doc2Vec, fuzzy c-means algorithm, informatization evaluation
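
The pipeline outlined in the abstract (sentence vectors, fuzzy c-means clustering, weighted sentence selection) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the sentence vectors have already been produced (for example by a Doc2Vec model, which is not trained here), uses the standard fuzzy c-means update rules, and picks one sentence per cluster by the product of sentence weight and cluster membership. The function names `fuzzy_c_means` and `select_summary`, the `weights` argument, and the selection rule are all hypothetical choices made for illustration; the abstract does not specify how the paper computes sentence weights or selects sentences.

```python
import math
import random


def fuzzy_c_means(vectors, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means on a list of equal-length vectors.

    Returns (centers, memberships); memberships[i][j] is the degree to
    which vector i belongs to cluster j, and each row sums to 1.
    """
    rng = random.Random(seed)
    n, dim = len(vectors), len(vectors[0])
    # Random initial membership rows, each normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])
    centers = []
    for _ in range(n_iter):
        # Cluster centers: means weighted by membership raised to m.
        centers = []
        for j in range(n_clusters):
            w = [u[i][j] ** m for i in range(n)]
            total = sum(w)
            centers.append(
                [sum(w[i] * vectors[i][d] for i in range(n)) / total
                 for d in range(dim)]
            )
        # Memberships: proportional to distance ** (-2 / (m - 1)).
        new_u = []
        for x in vectors:
            dists = [max(math.dist(x, c), 1e-12) for c in centers]
            inv = [d ** (-2.0 / (m - 1.0)) for d in dists]
            total = sum(inv)
            new_u.append([v / total for v in inv])
        shift = max(
            abs(new_u[i][j] - u[i][j])
            for i in range(n) for j in range(n_clusters)
        )
        u = new_u
        if shift < tol:
            break
    return centers, u


def select_summary(sentences, vectors, n_clusters, weights=None):
    """Pick one sentence per cluster, kept in original document order.

    Assumption: a sentence's score for a cluster is weight * membership.
    This rule is illustrative only and is not taken from the paper.
    """
    if weights is None:
        weights = [1.0] * len(sentences)
    _, u = fuzzy_c_means(vectors, n_clusters)
    chosen = set()
    for j in range(n_clusters):
        best = max(range(len(sentences)), key=lambda i: weights[i] * u[i][j])
        chosen.add(best)
    return [sentences[i] for i in sorted(chosen)]


# Toy stand-ins for Doc2Vec sentence vectors: two well-separated groups.
toy_sentences = ["s0", "s1", "s2", "s3", "s4", "s5"]
toy_vectors = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
               [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]]
summary = select_summary(toy_sentences, toy_vectors, n_clusters=2)
```

Unlike k-means, which assigns each sentence to exactly one cluster, fuzzy c-means gives every sentence a graded membership in every cluster, so a sentence that touches several topics can still score well for more than one of them. In a real setting the toy vectors would be replaced by vectors inferred from a trained sentence embedding model, and `weights` by whatever article-specific importance scores are available.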