Frontiers of Data and Computing ›› 2022, Vol. 4 ›› Issue (3): 124-130.

CSTR: 32002.14.jfdc.CN10-1649/TP.2022.03.009

doi: 10.11871/jfdc.issn.2096-742X.2022.03.009

• Technology and Application •

Predictive Model of the Revisit Behavior of Cloud Service Site Users

WEI Ting, ZHANG Honghai, LIN Xiaoli, ZHANG Leilei, WANG Yan, JIA Jinfeng

  1. Computer Network Information Center, Chinese Academy of Sciences, Beijing 100083, China
  • Received: 2021-07-26 Online: 2022-06-20 Published: 2022-06-20
  • Contact: WEI Ting
  • About the authors:
    WEI Ting, Ph.D., works at the Computer Network Information Center, Chinese Academy of Sciences. Her main research interests include data analysis and cloud resource scheduling algorithms. In this paper she is responsible for writing the manuscript and for the analysis and modeling of CSTCloud user behavior data. E-mail: weiting@cnic.cn
    ZHANG Honghai holds a master's degree and is an associate researcher at the Computer Network Information Center, Chinese Academy of Sciences, and Director of the Cloud Service Software Research and Development Business Department of the Science and Technology Cloud Development Department. His main research interests include the unified scheduling of cloud resources and the research and development of cloud service platforms. In this paper he is responsible for the final compilation and for the design and application of the CSTCloud user behavior analysis system. E-mail: zhh@cnic.cn
    LIN Xiaoli holds a master's degree and works at the Computer Network Information Center, Chinese Academy of Sciences. She currently works mainly on system deployment, data acquisition, and database construction. In this paper she is responsible for the data acquisition part of the user behavior analysis system. E-mail: linxiaoli@cnic.cn
    ZHANG Leilei holds a master's degree and works at the Computer Network Information Center, Chinese Academy of Sciences. She currently works mainly on front-end development. In this paper she is responsible for the development of the user behavior analysis system. E-mail: zhangleilei@cnic.cn
    WANG Yan holds a master's degree and is an engineer at the Computer Network Information Center, Chinese Academy of Sciences. She currently works mainly on front-end development. In this paper she is responsible for the development of the user behavior analysis system. E-mail: wangyan@cnic.cn
    JIA Jinfeng holds a master's degree and works at the Computer Network Information Center, Chinese Academy of Sciences. He currently works mainly on system development. In this paper he is responsible for the development of the user behavior analysis system. E-mail: jiajinfeng@cnic.cn

Abstract:

[Objective] Predicting user behavior from browsing and operation data is valuable for understanding users' interests and needs and for improving online recommendation and website operation. [Methods] User behavior data from the China Science and Technology Cloud (CSTCloud) website are collected, features are selected, and the data are mined and analyzed. A logistic regression (LR) model and an XGBoost decision-tree model are then trained to predict users' revisit behavior, and both models are evaluated numerically along multiple dimensions on real user behavior data from the website. [Results] The LR model achieves both a better fit and higher accuracy, in contrast to the many earlier results that found the XGBoost model superior. The difference is attributed to the structural characteristics of the behavior data. [Conclusions] This work supports predicting the revisit behavior of CSTCloud website users, so that personalized operation decisions can be made for potentially valuable users and the user experience can be improved.

Key words: user behavior, prediction, machine learning, logistic regression, decision tree
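
The LR side of the comparison described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the three feature names, the synthetic data generator, and the training settings are hypothetical, and the XGBoost model is omitted so that the example runs with the Python standard library alone.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def make_synthetic_users(n, rng):
    """Generate hypothetical (features, revisited) pairs; NOT the paper's data."""
    rows = []
    for _ in range(n):
        page_views = rng.uniform(0, 1)  # assumed feature: scaled page views
        operations = rng.uniform(0, 1)  # assumed feature: scaled operation count
        days_idle = rng.uniform(0, 1)   # assumed feature: scaled days since last visit
        score = 3.0 * page_views + 2.0 * operations - 4.0 * days_idle
        label = 1 if rng.random() < sigmoid(score) else 0
        rows.append(([page_views, operations, days_idle], label))
    return rows


def train_lr(rows, epochs=300, lr=0.5):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n_features = len(rows[0][0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in rows:
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
            grad_b += err
        w = [wi - lr * g / len(rows) for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / len(rows)
    return w, b


def predict(w, b, x):
    # Predict a revisit (1) when the estimated probability is at least 0.5.
    return 1 if sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) >= 0.5 else 0


def accuracy(w, b, rows):
    return sum(predict(w, b, x) == y for x, y in rows) / len(rows)


rng = random.Random(0)  # fixed seed so the run is reproducible
data = make_synthetic_users(1000, rng)
train, holdout = data[:800], data[800:]
weights, bias = train_lr(train)
print(f"held-out accuracy: {accuracy(weights, bias, holdout):.3f}")
```

A tree-based model such as XGBoost would be trained on the same split and scored with the same `accuracy` function to reproduce the kind of side-by-side evaluation the abstract reports. The numbers produced by this synthetic sketch say nothing about the paper's actual results.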