Frontiers of Data and Computing, 2023, Vol. 5, Issue (2): 136-149.

CSTR: 32002.14.jfdc.CN10-1649/TP.2023.02.011

doi: 10.11871/jfdc.issn.2096-742X.2023.02.011

• Technology and Application •

An Empirical Study of the Phenomenon and Effect of “Content Convergence Gravity” in Social Networks: Data Mining of Sina Microblog Based on Word2vec

XU Xiang*, ZHANG Lingyuan, WANG Yuchen

  1. Research Center for Big Data and Computational Communication, College of Arts and Media, Tongji University, Shanghai 201804, China
  • Received: 2022-02-14 Online: 2023-04-20 Published: 2023-04-24
  • Contact: XU Xiang


[Objective] This study proposes the phenomenon and analytical dimension of “content convergence gravity” to analyze how information converges in social media, as represented by Sina Microblog. [Methods] We captured 14,111,274 valid sample posts from Sina Microblog and divided the content into “spectrums” of different popularity. Using Word2vec and other text mining methods, we then examined the similarity between each information level and all other levels, and the relationship between these similarities and the popularity of each level in a specific period of time. [Results] The content similarity between any two information units G1 and G2 is proportional to the sum of their popularity (H1+H2). The effect of “content convergence gravity” holds at scales ranging from a single post to multiple posts, and from microscopic, fine-grained levels to large-scale level groups. [Limitations] A more specific and in-depth analysis of the structural consequences and evolution laws of content convergence in social networks is still lacking. [Conclusions] The perspective of “content convergence gravity” expands the theoretical possibilities and practical predictability for microblog information circulation. It also carries the potential information risks of an “extreme public opinion” context and an “anti-public sphere”.
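
The relationship reported in [Results] can be illustrated with a minimal sketch. This is not the authors' pipeline: the abstract does not state which similarity measure was used, so cosine similarity between averaged word vectors is an assumption, and the tiny `embeddings` table is a hypothetical stand-in for vectors trained with Word2vec. The group names, popularity values, and helper functions `group_vector` and `cosine_similarity` are likewise invented for illustration.

```python
from itertools import combinations

import numpy as np


def group_vector(posts, embeddings):
    """Average the word vectors of every known token in a group of posts."""
    vectors = [embeddings[tok] for post in posts for tok in post if tok in embeddings]
    return np.mean(vectors, axis=0)


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))


# Hypothetical 2-dimensional embeddings standing in for trained Word2vec vectors.
embeddings = {
    "news": np.array([1.0, 0.0]),
    "star": np.array([0.8, 0.6]),
    "cat": np.array([0.0, 1.0]),
}

# Hypothetical information units (tokenized posts) and their popularity H.
groups = {"G1": [["news", "star"]], "G2": [["star"]], "G3": [["cat"]]}
popularity = {"G1": 100.0, "G2": 80.0, "G3": 5.0}

# For every pair of units, record the popularity sum and the content similarity.
pair_popularity = []
pair_similarity = []
for a, b in combinations(groups, 2):
    pair_popularity.append(popularity[a] + popularity[b])
    pair_similarity.append(
        cosine_similarity(
            group_vector(groups[a], embeddings),
            group_vector(groups[b], embeddings),
        )
    )

# Pearson correlation between (H1 + H2) and similarity across all pairs.
r = float(np.corrcoef(pair_popularity, pair_similarity)[0, 1])
print(f"correlation between H1+H2 and similarity: {r:.3f}")
```

On real data, a strongly positive correlation across many pairs would be consistent with the proportionality the abstract describes; three toy pairs are far too few to show anything beyond how the calculation is set up.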

Key words: social media, content convergence, user-generated content, homogeneity, text mining