Frontiers of Data and Computing (数据与计算发展前沿), 2024, Vol. 6, Issue (5): 102-110.

CSTR: 32002.14.jfdc.CN10-1649/TP.2024.05.010

doi: 10.11871/jfdc.issn.2096-742X.2024.05.010

A Distributed Recommender System Based on Graph Partition

YANG Jinguang1, XIONG Fei1,*, GU Junyu2,3, XI Weiting4

  1. Beijing Jiaotong University, Beijing 100044, China
  2. Computer Network Information Center, Chinese Academy of Sciences, Beijing 100083, China
  3. University of Chinese Academy of Sciences, Beijing 100049, China
  4. North China Electric Power University, Beijing 100096, China
  • Received: 2023-01-15 Online: 2024-10-20 Published: 2024-10-21
  • Corresponding author: * XIONG Fei (E-mail: xiongf@bjtu.edu.cn)
  • About the authors:
    YANG Jinguang is a master’s student at Beijing Jiaotong University. His main research interests are artificial intelligence and recommender systems.
    In this paper, he is mainly responsible for model design and implementation of the model algorithm.
    E-mail: yangjg@bjtu.edu.cn
    XIONG Fei is a Ph.D. supervisor at Beijing Jiaotong University. His main research interests are artificial intelligence, network content security, and recommender systems.
    In this paper, he is mainly responsible for guiding model optimization and model design.
    E-mail: xiongf@bjtu.edu.cn
  • Funding:
    National Natural Science Foundation of China (61872033); National Natural Science Foundation of China (72004009); National Key Research and Development Program of China (2018YFC0832304); Beijing Nova Program (Z201100006820015)

Abstract:

[Objective] Designing a recommender system that processes data efficiently is of great significance. [Methods] The user preference relationships in a recommender system are modeled as a graph, which is then processed by a graph partitioning algorithm so that the information in the data can be mined more deeply. The resulting load-balanced subgraphs serve as the input to a distributed system, and fusion through an adaptive aggregation module completes the distributed recommender system. [Results] The system improves the efficiency with which the recommendation algorithm processes large-scale data. With no loss in prediction accuracy, training on a cluster of 16 CPUs improves efficiency by a factor of 6.4 compared with training on a single CPU. [Conclusions] The experimental results demonstrate the effectiveness of the system in terms of recommendation efficiency.

Keywords: recommender system, graph partition, load balancing, distributed system
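
The abstract does not specify the partitioning algorithm, so the following is only a minimal illustrative sketch of what "load-balanced subgraphs" could mean for a user-item interaction graph. It assumes a simple greedy heuristic (heaviest user first, assigned to the currently lightest partition) and is not the authors' method; the names `partition_users` and `parallel_efficiency` are hypothetical. The second helper assumes the reported 6.4-fold gain can be read as a speedup, in which case dividing by the number of CPUs gives the parallel efficiency.

```python
from collections import defaultdict


def partition_users(interactions, num_parts):
    """Split (user, item) edges into num_parts subgraphs with similar edge counts.

    Assumed heuristic (not taken from the paper): users are sorted by
    number of interactions, heaviest first, and each user is assigned
    to the partition that currently holds the fewest edges.
    """
    # Count how many edges each user contributes.
    degree = defaultdict(int)
    for user, _item in interactions:
        degree[user] += 1

    loads = [0] * num_parts  # current edge count per partition
    assignment = {}          # user -> partition index
    for user in sorted(degree, key=lambda u: (-degree[u], u)):
        target = min(range(num_parts), key=lambda p: loads[p])
        assignment[user] = target
        loads[target] += degree[user]

    # Materialize the subgraphs: every edge follows its user.
    subgraphs = [[] for _ in range(num_parts)]
    for user, item in interactions:
        subgraphs[assignment[user]].append((user, item))
    return subgraphs, loads


def parallel_efficiency(speedup, num_workers):
    """Fraction of ideal linear speedup achieved (speedup / workers)."""
    return speedup / num_workers


# Toy data, invented purely for illustration.
toy_interactions = [
    ("u1", "a"), ("u1", "b"), ("u1", "c"),
    ("u2", "a"), ("u2", "b"),
    ("u3", "c"),
    ("u4", "a"),
]
toy_subgraphs, toy_loads = partition_users(toy_interactions, 2)
```

Under the speedup reading, `parallel_efficiency(6.4, 16)` evaluates to 0.4, meaning the 16-CPU cluster would achieve about 40% of ideal linear scaling. Keeping all of a user's edges in one partition is a design choice of this sketch, made so that each worker sees complete user histories.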