Frontiers of Data and Computing (数据与计算发展前沿) ›› 2022, Vol. 4 ›› Issue (5): 129-137.

CSTR: 32002.14.jfdc.CN10-1649/TP.2022.05.014

doi: 10.11871/jfdc.issn.2096-742X.2022.05.014

• Technology and Application •

A Sentiment Analysis Method of Hot Events Based on Automatically Labeled Corpus and Its Application

YI Hanbing (易寒冰)*, LIU Qian (刘倩)

  1. First Research Institute of the Ministry of Public Security of PRC, Beijing 100048, China
  • Received: 2021-11-18 Online: 2022-10-20 Published: 2022-10-27
  • Corresponding author: YI Hanbing
  • About the author: YI Hanbing holds a master's degree and is an engineer at the First Research Institute of the Ministry of Public Security of PRC. Her research interests include big data mining and analysis and natural language processing.
    In this paper, she is responsible for writing the paper and conducting the experiments.
    E-mail: ayhb@ruc.edu.cn

Abstract:

[Objective/Significance] With the rapid rise of we-media, social media platforms at home and abroad have become an important channel for the rapid spread of news events and an important place for netizens to express opinions and obtain information. Accordingly, analyzing the sentiment of what netizens post about hot events has become an active research topic: effective sentiment analysis can quickly reveal important information such as event trends and public opinion. [Methods/Processes] This paper takes posts about hot events on overseas social media platforms as its data source. It first designs a preprocessing method for network text that is informal, unstructured, and heavy in emoticons, including regularization, language detection, traditional-to-simplified Chinese conversion, and word segmentation. It then analyzes the sentiment polarity of the text with a PMI+SKEP model, where PMI stands for pointwise mutual information, and finally studies how the sentiment analysis results can be applied. [Results/Conclusions] The method addresses the difficulty that business data in practical applications lack labeled data and would otherwise require extensive manual annotation. The accuracy of the model is 3.17% higher than that of the ERNIE model. In addition, predicting the sentiment polarity of user posts yields the trend of an event over time and the key users who spread negative posts as the event develops, and the results have been applied in an operational system.

Key words: social media, PMI, SKEP, sentiment analysis, application
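
The abstract does not describe how PMI is used to label the corpus automatically, nor how it is combined with SKEP. As a rough illustration only, the Python sketch below shows one common PMI-based approach: scoring each word by its pointwise mutual information with a few positive and negative seed words (often called semantic orientation), then labeling a text by the sum of its word scores. The seed words, the toy corpus, the function names, and the choice to treat unseen word pairs as neutral are all assumptions made for this example and are not taken from the paper.

```python
import math
from collections import Counter
from itertools import combinations


def build_pmi(docs):
    """Build a pmi(x, y) function from document-level co-occurrence counts."""
    n_docs = len(docs)
    word_df = Counter()   # number of documents containing each word
    pair_df = Counter()   # number of documents containing both words of a pair
    for tokens in docs:
        vocab = set(tokens)
        word_df.update(vocab)
        pair_df.update(frozenset(p) for p in combinations(sorted(vocab), 2))

    def pmi(x, y):
        joint = pair_df[frozenset((x, y))]
        if joint == 0:
            # Assumption: a pair never seen together is treated as neutral
            # (0.0) instead of negative infinity.
            return 0.0
        # PMI(x, y) = log2( p(x, y) / (p(x) * p(y)) )
        return math.log2(joint * n_docs / (word_df[x] * word_df[y]))

    return pmi


def orientation(word, pos_seeds, neg_seeds, pmi):
    """Semantic orientation: PMI with positive seeds minus PMI with negative seeds."""
    pos = sum(pmi(word, s) for s in pos_seeds)
    neg = sum(pmi(word, s) for s in neg_seeds)
    return pos - neg


def auto_label(tokens, pos_seeds, neg_seeds, pmi):
    """Label a tokenized text as 'positive', 'negative', or None if undecided."""
    score = sum(orientation(t, pos_seeds, neg_seeds, pmi) for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return None


# Hypothetical, already tokenized toy corpus and seed words.
DOCS = [
    ["great", "service", "happy"],
    ["happy", "love", "great"],
    ["terrible", "delay", "angry"],
    ["angry", "hate", "delay"],
    ["service", "delay", "angry"],
    ["love", "service", "great"],
]
POS_SEEDS = ["happy", "love"]
NEG_SEEDS = ["angry", "hate"]

if __name__ == "__main__":
    pmi = build_pmi(DOCS)
    for text in (["great", "service"], ["terrible", "delay"]):
        print(text, auto_label(text, POS_SEEDS, NEG_SEEDS, pmi))
```

Labels produced this way could serve as automatically generated training data for a downstream sentiment classifier, which is one plausible reading of the PMI+SKEP pipeline. On real social media text, the preprocessing steps listed in the abstract would need to run first so that tokens are clean and consistently segmented.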