An Improved BMUF Training Framework and Implementation of Federated Learning System

doi:10.11871/jfdc.issn.2096-742X.2022.06.010

Frontiers of Data and Computing, 2022, Vol. 4, Issue (6): 105-117.

CSTR: 32002.14.jfdc.CN10-1649/TP.2022.06.010

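ZHAO Xinbo^1,2, DAI Chuangchuang^1,2, LU Zhonghua^1,*

1. Computer Network Information Center, Chinese Academy of Sciences, Beijing 100083, China
2. University of Chinese Academy of Sciences, Beijing 100049, China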

Received: 2021-12-21 Online: 2022-12-20 Published: 2022-12-20
Corresponding author: LU Zhonghua
About the authors:
ZHAO Xinbo is currently a master's student at the Computer Network Information Center, Chinese Academy of Sciences, China. His main research interests are blockchain and federated learning. In this paper, he is responsible for system design and implementation, algorithm design and implementation, experimental verification, and paper writing. E-mail: zhaoxinbo@cnic.cn
LU Zhonghua is currently a Professor at the Computer Network Information Center, Chinese Academy of Sciences, China. Her current research interests include high-performance computing technology and its applications in computational finance. In this paper, she is responsible for the overall direction and framework of the paper. E-mail: zhlu@cnic.cn
Funding: National Natural Science Foundation of China (61873254)

Abstract

[Objective] As privacy protection requirements grow more stringent, federated learning is widely used to address the "data silo" and data privacy problems. However, the traditional federated learning architecture is centralized, which introduces additional privacy risks and costs. Blockchain-based decentralized federated learning architectures have therefore attracted increasing attention and research. [Methods] This paper improves the BMUF training framework so that it performs well in federated learning when data volumes are unbalanced across clients and when data are not independent and identically distributed (non-IID). A differential privacy mechanism is added to each client's local training to protect local privacy. A Byzantine-detection robust aggregation algorithm based on the global update gradient is proposed, which enables the aggregator to detect Byzantine clients in the system and complete robust aggregation. [Results] Several groups of experiments are conducted on these three points. The results show that the improved BMUF training framework aggregates better than the FedAvg algorithm in unbalanced and non-IID scenarios; that the model still converges and reaches relatively high accuracy when the differential privacy mechanism is added to local training; and that, under Byzantine attack, the aggregator effectively excludes Byzantine clients and completes robust aggregation. [Conclusions] This paper improves the BMUF training framework and implements a blockchain-based federated learning system that, under a decentralized architecture and across different data distribution scenarios, protects client privacy, resists Byzantine attacks, and trains the model efficiently.

Keywords: blockchain, federated learning, BMUF framework, differential privacy, Byzantine attack
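The abstract names two aggregator-side steps: a BMUF-style model update and a screen that excludes Byzantine clients before aggregation. The sketch below illustrates both under stated assumptions and is not the authors' algorithm. `bmuf_step` follows the classic block-momentum update (the global update is filtered as Delta(t) = momentum * Delta(t-1) + lr * G(t)), with client models weighted by local sample count as an assumed way to handle unbalanced data. `screen_clients` is a hypothetical stand-in for the detection step: it keeps clients whose update has positive cosine similarity with a reference global update. All function names, parameters, and default values are illustrative.

```python
import math
from typing import List, Sequence, Tuple

Vector = List[float]


def weighted_average(models: Sequence[Vector], num_samples: Sequence[int]) -> Vector:
    """Average client models, weighting each by its local sample count (assumption)."""
    total = float(sum(num_samples))
    dim = len(models[0])
    return [sum(m[j] * n for m, n in zip(models, num_samples)) / total
            for j in range(dim)]


def bmuf_step(global_model: Vector, prev_delta: Vector,
              client_models: Sequence[Vector], num_samples: Sequence[int],
              block_momentum: float = 0.5, block_lr: float = 1.0
              ) -> Tuple[Vector, Vector]:
    """One classic BMUF round: filter the aggregated update with block momentum."""
    averaged = weighted_average(client_models, num_samples)
    # G(t): aggregated model update relative to the current global model.
    update = [a - w for a, w in zip(averaged, global_model)]
    # Delta(t) = block_momentum * Delta(t-1) + block_lr * G(t)
    delta = [block_momentum * d + block_lr * g for d, g in zip(prev_delta, update)]
    new_global = [w + d for w, d in zip(global_model, delta)]
    return new_global, delta


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(u, v)) / (norm_u * norm_v)


def screen_clients(global_model: Vector, reference_update: Vector,
                   client_models: Sequence[Vector],
                   threshold: float = 0.0) -> List[int]:
    """Hypothetical screen: keep clients whose update aligns with reference_update."""
    kept = []
    for i, model in enumerate(client_models):
        client_update = [a - w for a, w in zip(model, global_model)]
        if cosine(client_update, reference_update) > threshold:
            kept.append(i)
    return kept


if __name__ == "__main__":
    w, delta = [0.0, 0.0], [0.0, 0.0]
    clients = [[1.0, 1.0], [0.8, 1.2], [-5.0, -5.0]]  # last client is adversarial
    sizes = [100, 300, 200]
    honest = screen_clients(w, [1.0, 1.0], clients)
    w, delta = bmuf_step(w, delta,
                         [clients[i] for i in honest],
                         [sizes[i] for i in honest])
    print(honest, w)
```

Screening before aggregation keeps a rejected client from influencing either the average or the momentum term carried into later rounds. The zero threshold and the choice of reference update are placeholders that a real system would need to tune.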

ZHAO Xinbo, DAI Chuangchuang, LU Zhonghua. An Improved BMUF Training Framework and Implementation of Federated Learning System[J]. Frontiers of Data and Computing, 2022, 4(6): 105-117. https://cstr.cn/32002.14.jfdc.CN10-1649/TP.2022.06.010.

Table 1
Layer (type)	Output shape	Param #
Conv2d-1	[-1, 32, 24, 24]	832
MaxPool2d-2	[-1, 32, 12, 12]	0
Conv2d-3	[-1, 64, 8, 8]	51264
MaxPool2d-4	[-1, 64, 4, 4]	0
Dropout-5	[-1, 64, 4, 4]	0
Linear-6	[-1, 512]	524800
Linear-7	[-1, 256]	131328
Linear-8	[-1, 10]	2570
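The parameter counts in this table can be checked by hand. The sketch below reproduces them under assumptions that the table itself does not state: a single-channel 28×28 input, 5×5 convolution kernels with stride 1 and no padding, and 2×2 max pooling. These values are inferred from the output shapes and counts; the helper names are illustrative.

```python
def conv2d_params(in_channels: int, out_channels: int, kernel: int) -> int:
    """Weights plus one bias per output channel, for a square kernel."""
    return in_channels * out_channels * kernel * kernel + out_channels


def linear_params(in_features: int, out_features: int) -> int:
    """Weights plus one bias per output feature."""
    return in_features * out_features + out_features


def conv_output_size(size: int, kernel: int) -> int:
    """Spatial size after a stride-1 convolution without padding."""
    return size - kernel + 1


# Assumed hyperparameters (inferred, not stated in the table):
# 1x28x28 input, 5x5 kernels, 2x2 max pooling.
side = conv_output_size(28, 5)    # 24, matches Conv2d-1
side //= 2                        # 12, matches MaxPool2d-2
side = conv_output_size(side, 5)  # 8, matches Conv2d-3
side //= 2                        # 4, matches MaxPool2d-4
flat_features = 64 * side * side  # 1024 inputs to the first linear layer

counts = {
    "Conv2d-1": conv2d_params(1, 32, 5),
    "Conv2d-3": conv2d_params(32, 64, 5),
    "Linear-6": linear_params(flat_features, 512),
    "Linear-7": linear_params(512, 256),
    "Linear-8": linear_params(256, 10),
}
total = sum(counts.values())

if __name__ == "__main__":
    for name, value in counts.items():
        print(f"{name}: {value}")
    print("total:", total)
```

Pooling and dropout layers contribute no trainable parameters, so the five entries above account for the whole model.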

Table 2
Number of participants	Privacy budget	Test accuracy
10	ε=1	0.9225
10	ε=2	0.9278
10	ε=∞	0.9603
15	ε=1	0.9174
15	ε=2	0.9235
15	ε=∞	0.9578
20	ε=1	0.9141
20	ε=2	0.9204
20	ε=∞	0.9566
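A smaller privacy budget ε means more noise in local training, which is consistent with the lower accuracy in the ε=1 rows; ε=∞ corresponds to training without noise. This excerpt does not describe the exact mechanism, so the sketch below shows one common approach as an assumption: clip each per-example gradient to a maximum L2 norm, sum, add Gaussian noise scaled by that norm, and average (the DP-SGD pattern). The names `max_norm` and `noise_multiplier` are illustrative, and the privacy accounting that maps a noise level to a value of ε is omitted.

```python
import math
import random
from typing import List, Sequence

Vector = List[float]


def clip(grad: Vector, max_norm: float) -> Vector:
    """Scale grad down so that its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_norm / norm) if norm > 0.0 else 1.0
    return [g * scale for g in grad]


def private_mean_gradient(per_example_grads: Sequence[Vector],
                          max_norm: float,
                          noise_multiplier: float,
                          rng: random.Random) -> Vector:
    """Clip each gradient, sum, add Gaussian noise, then average."""
    clipped = [clip(g, max_norm) for g in per_example_grads]
    dim = len(clipped[0])
    summed = [sum(g[j] for g in clipped) for j in range(dim)]
    # Noise standard deviation is proportional to the clipping norm, which
    # bounds how much any single example can change the sum.
    sigma = noise_multiplier * max_norm
    noisy = [s + rng.gauss(0.0, sigma) for s in summed]
    return [x / len(clipped) for x in noisy]


if __name__ == "__main__":
    rng = random.Random(0)
    grads = [[3.0, 4.0], [0.3, 0.4]]
    print(private_mean_gradient(grads, max_norm=1.0, noise_multiplier=1.0, rng=rng))
```

Setting `noise_multiplier` to 0 disables the noise but keeps the clipping, so it does not exactly reproduce ordinary non-private training.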
