Review of Research on Secure Inference in Machine Learning

doi:10.11871/jfdc.issn.2096-742X.2024.05.001

Frontiers of Data and Computing, 2024, Vol. 6, Issue (5): 1-12.

CSTR: 32002.14.jfdc.CN10-1649/TP.2024.05.001

LONG Chun¹,*, LI Lisha¹,², LI Jing¹, YANG Fan¹, WEI Jinxia¹, FU Yuhao¹

1. Computer Network Information Center, Chinese Academy of Sciences, Beijing 100083, China
2. University of Chinese Academy of Sciences, Beijing 100190, China

Received: 2024-08-13 Online: 2024-10-20 Published: 2024-10-21
Corresponding author: * LONG Chun (E-mail: anquanip@cnic.cn)
About the author: LONG Chun is a professor-level senior engineer at the Computer Network Information Center, Chinese Academy of Sciences, and a Ph.D. supervisor at the University of Chinese Academy of Sciences. He is a member of the Security Committee of the Computer Society and a young expert at the China Internet Association. His research covers intelligent network security protection, security big data mining, and in-depth analysis. He has won the second prize of the Science and Technology Progress Award of the Beijing Municipal Science and Technology Awards. In this paper, he is responsible for designing the framework and analyzing the literature.
Funding: National Key Research and Development Program of China (2023YFC3304704); Network Security and Informatization Special Project of the Chinese Academy of Sciences (CAS-WX2022GC-04); Youth Innovation Promotion Association of the Chinese Academy of Sciences (2022170)

Abstract

[Objective] This paper analyzes existing research on secure inference in machine learning and outlines future research directions. [Methods] Taking the security assumptions of the different schemes as the basis for classification, it analyzes and compares secure inference techniques that combine different technologies and target different machine learning scenarios. [Results] Current schemes can achieve secure inference for machine learning, but they remain limited in computational efficiency, strength of security protection, scalability, and adaptability to practical application scenarios. [Limitations] Because of the limited material available, the analyzed schemes were not tested or compared experimentally under a common benchmark. [Conclusions] Designing secure inference schemes around the application scenario, and improving usability while reducing overhead without compromising security, will be a long-term development direction for this field.

Key words: privacy-preserving machine learning, machine learning, data privacy, secure multi-party computation

LONG Chun, LI Lisha, LI Jing, YANG Fan, WEI Jinxia, FU Yuhao. Review of Research on Secure Inference in Machine Learning[J]. Frontiers of Data and Computing, 2024, 6(5): 1-12, https://cstr.cn/32002.14.jfdc.CN10-1649/TP.2024.05.001.

Figures and tables (4): Figure 1, Table 1, Table 2, Figure 2

References (69)

[1]	YAO A C. Protocols for secure computations[C]// 23rd annual symposium on foundations of computer science (sfcs 1982). IEEE, 1982: 160-164.
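[2]	SONG L, LIN G, WANG J, et al. SoK: Training machine learning models over multiple sources with privacy preservation[J]. arXiv preprint arXiv:2012.03386, 2020.
[3]	GUO Juanjuan, WANG Qiongxiao, XU Xin, et al. Secure multi-party computation and its applications in machine learning[J]. Journal of Computer Research and Development, 2021, 58(10): 2163-2186. (in Chinese)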
[4]	HAO Y, QIN B, SUN Y. Privacy-preserving decision-tree evaluation with low complexity for communication[J]. Sensors, 2023, 23(5): 2624.
[5]	JI K, ZHANG B, LU T, et al. UC Secure private branching program and decision tree evaluation[J]. IEEE Transactions on Dependable and Secure Computing, 2022, 20(4): 2836-2848.
[6]	CHEN X, CHEN X, DONG Y, et al. Roger: A round optimized gpu-friendly secure inference framework[C]// ICC 2024-IEEE International Conference on Communications. IEEE, 2024: 61-66.
[7]	FAN T, CHEN X, DONG Y, et al. Comet: Communication-efficient batch secure three-party neural network inference with client-aiding[C]// ICC 2024-IEEE International Conference on Communications. IEEE, 2024: 752-757.
[8]	FENG Q, HE D, LIU Z, et al. SecureNLP: A system for multi-party privacy-preserving natural language processing[J]. IEEE Transactions on Information Forensics and Security, 2020, 15: 3709-3721.
[9]	HUANG Z, LU W, HONG C, et al. Cheetah: Lean and fast secure two-party deep neural network inference[C]// 31st USENIX Security Symposium (USENIX Security 22). 2022: 809-826.
[10]	DONG Y, CHEN X, JING W, et al. Meteor: improved secure 3-party neural network inference with reducing online communication costs[C]// Proceedings of the ACM Web Conference 2023. 2023: 2087-2098.
[11]	DONG Y, CHEN X, SONG X, et al. FLEXBNN: fast private binary neural network inference with flexible bit-width[J]. IEEE Transactions on Information Forensics and Security, 2023, 18: 2382-2397.
[12]	LU Y, ZHANG B, REN K. Maliciously secure MPC from semi-honest 2PC in the server-aided model[J]. IEEE Transactions on Dependable and Secure Computing, 2024 (4): 3109-3125.
[13]	LI Y, XU W. PrivPy: General and scalable privacy-preserving data mining[C]// Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. 2019: 1299-1307.
[14]	SONG L, WANG J, WANG Z, et al. Pmpl: A robust multi-party learning framework with a privileged party[C]// Proceedings of the 2022 ACM SIGSAC Conference on Computer and Communications Security. 2022: 2689-2703.
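[15]	MA J, ZHENG Y, FENG J, et al. SecretFlow-SPU: A performant and user-friendly framework for privacy-preserving machine learning[C]// 2023 USENIX Annual Technical Conference (USENIX ATC 23). 2023: 17-33.
[16]	TAN Zuowen, ZHANG Lianfu. Survey of research on privacy protection in machine learning[J]. Journal of Software, 2020, 31(7): 2127-2156. (in Chinese)
[17]	HAZAY C, VENKITASUBRAMANIAM M, WEISS M. The price of active security in cryptographic protocols[C]// Annual International Conference on the Theory and Applications of Cryptographic Techniques. Cham: Springer International Publishing, 2020: 184-215.
[18]	RABIN M O. How to exchange secrets with oblivious transfer[J]. IACR Cryptol. ePrint Arch., 2005 (2005): 187.
[19]	QU Yadong, HOU Zifeng, WEI Wei. A contract signing protocol based on oblivious transfer[J]. Journal of Computer Research and Development, 2003, (4): 615-619. (in Chinese)
[20]	CHEN Xiaohong. Research on the application of electronic voting systems based on secure multi-party computation[D]. Nanjing University of Science and Technology, 2010. (in Chinese)
[21]	ZHA Jun. Research on the application of secure multi-party computation in key agreement[D]. PLA Information Engineering University, 2012. (in Chinese)
[22]	LI Zongyu, GUI Xiaolin, GU Yingjie, et al. Homomorphic encryption technology and its application in cloud computing privacy protection[J]. Journal of Software, 2018, 29(7): 1830-1851. (in Chinese)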
[23]	ACAR A, AKSU H, ULUAGAC A S, et al. A survey on homomorphic encryption schemes: Theory and implementation[J]. ACM Computing Surveys (Csur), 2018, 51(4): 1-35.
[24]	RIVEST R L, SHAMIR A, ADLEMAN L. A method for obtaining digital signatures and public-key cryptosystems[J]. Communications of the ACM, 1978, 21(2): 120-126.
[25]	ELGAMAL T. A public key cryptosystem and a signature scheme based on discrete logarithms[J]. IEEE Transactions on Information Theory, 1985, 31(4): 469-472.
[26]	PAILLIER P. Public-key cryptosystems based on composite degree residuosity classes[C]// International conference on the theory and applications of cryptographic techniques. Berlin, Heidelberg: Springer Berlin Heidelberg, 1999: 223-238.
[27]	BONEH D, GOH E J, NISSIM K. Evaluating 2-DNF formulas on ciphertexts[C]// Theory of Cryptography: Second Theory of Cryptography Conference, TCC 2005, Cambridge, MA, USA, February 10-12, 2005. Proceedings 2. Springer Berlin Heidelberg, 2005: 325-341.
[28]	GENTRY C. Fully homomorphic encryption using ideal lattices[C]// Proceedings of the forty-first annual ACM symposium on Theory of computing. 2009: 169-178.
[29]	SMART N P, VERCAUTEREN F. Fully homomorphic encryption with relatively small key and ciphertext sizes[C]// International Workshop on Public Key Cryptography. Berlin, Heidelberg: Springer Berlin Heidelberg, 2010: 420-443.
[30]	VAN DIJK M, GENTRY C, HALEVI S, et al. Fully homomorphic encryption over the integers[C]// Advances in Cryptology-EUROCRYPT 2010: 29th Annual International Conference on the Theory and Applications of Cryptographic Techniques, French Riviera, May 30-June 3, 2010. Proceedings 29. Springer Berlin Heidelberg, 2010: 24-43.
[31]	CORON J S, MANDAL A, NACCACHE D, et al. Fully homomorphic encryption over the integers with shorter public keys[C]// Annual Cryptology Conference. Berlin, Heidelberg: Springer Berlin Heidelberg, 2011: 487-504.
[32]	SHAMIR A. How to share a secret[J]. Communications of the ACM, 1979, 22(11): 612-613.
[33]	BEAVER D. Efficient multiparty protocols using circuit randomization[C]// Advances in Cryptology-CRYPTO’91: Proceedings 11. Springer Berlin Heidelberg, 1992: 420-432.
[34]	BLAKLEY G R. Safeguarding cryptographic keys[C]// Managing requirements knowledge, international workshop on. IEEE Computer Society, 1979: 313-313.
[35]	GALTON F. Regression towards mediocrity in hereditary stature[J]. The Journal of the Anthropological Institute of Great Britain and Ireland, 1886, 15: 246-263.
[36]	CORTES C, VAPNIK V. Support-vector networks[J]. Machine Learning, 1995, 20(3): 273-297.
[37]	QUINLAN J R. C4.5: Programs for machine learning[M]. Morgan Kaufmann, 1993.
[38]	LECUN Y, BOTTOU L, BENGIO Y, et al. Gradient-based learning applied to document recognition[J]. Proceedings of the IEEE, 1998, 86(11): 2278-2324.
[39]	NIKOLAENKO V, WEINSBERG U, IOANNIDIS S, et al. Privacy-preserving ridge regression on hundreds of millions of records[C]// 2013 IEEE symposium on security and privacy. IEEE, 2013: 334-348.
[40]	GASCÓN A, SCHOPPMANN P, BALLE B, et al. Secure linear regression on vertically partitioned datasets[J]. IACR Cryptol. ePrint Arch., 2016 (2016): 892.
[41]	RAHULAMATHAVAN Y, PHAN R C W, VELURU S, et al. Privacy-preserving multi-class support vector machine for outsourcing the data classification in cloud[J]. IEEE Transactions on Dependable and Secure Computing, 2013, 11(5): 467-479.
[42]	RAHULAMATHAVAN Y, VELURU S, PHAN R C W, et al. Privacy-preserving clinical decision support system using gaussian kernel-based classification[J]. IEEE journal of biomedical and health informatics, 2013, 18(1): 56-66.
[43]	LIU X, LU R, MA J, et al. Privacy-preserving patient-centric clinical decision support system on naive Bayesian classification[J]. IEEE journal of biomedical and health informatics, 2015, 20(2): 655-668.
[44]	BOST R, POPA R A, TU S, et al. Machine learning classification over encrypted data[C]// Network and Distributed System Security Symposium. 2014. DOI:10.14722/ndss.2015.23241.
[45]	WU D J, FENG T, NAEHRIG M, et al. Privately evaluating decision trees and random forests[J]. Proceedings on Privacy Enhancing Technologies, 2016, (4): 335-355.
[46]	BACKES M, BERRANG P, BIEG M, et al. Identifying personal DNA methylation profiles by genotype inference[C]// 2017 IEEE symposium on security and privacy (SP). IEEE, 2017: 957-976.
[47]	DE COCK M, DOWSLEY R, HORST C, et al. Efficient and private scoring of decision trees, support vector machines and logistic regression models based on pre-computation[J]. IEEE Transactions on Dependable and Secure Computing, 2017, 16(2): 217-230.
[48]	KISS Á, NADERPOUR M, LIU J, et al. SoK: Modular and efficient private decision tree evaluation[J]. Proceedings on Privacy Enhancing Technologies, 2019, (2): 187-208.
[49]	TUENO A, KERSCHBAUM F, KATZENBEISSER S. Private evaluation of decision trees using sublinear cost[J]. Proceedings on Privacy Enhancing Technologies, 2019(1): 266-286.
[50]	MA J P K, TAI R K H, ZHAO Y, et al. Let’s stride blindfolded in a forest: sublinear multi-client decision trees evaluation[C]// Proceedings 2021 Network and Distributed System Security Symposium, 2021. DOI:10.14722/ndss.2021.23166.
[51]	XIE P, BILENKO M, FINLEY T, et al. Crypto-nets: Neural networks over encrypted data[J]. arXiv preprint arXiv:1412.6181, 2014.
[52]	GILAD-BACHRACH R, DOWLIN N, LAINE K, et al. Cryptonets: Applying neural networks to encrypted data with high throughput and accuracy[C]// International conference on machine learning. PMLR, 2016: 201-210.
[53]	LIU J, JUUTI M, LU Y, et al. Oblivious neural network predictions via minionn transformations[C]// Proceedings of the 2017 ACM SIGSAC conference on computer and communications security. 2017: 619-631.
[54]	RIAZI M S, WEINERT C, TKACHENKO O, et al. Chameleon: A hybrid secure computation framework for machine learning applications[C]// Proceedings of the 2018 on Asia conference on computer and communications security. 2018: 707-721.
[55]	JUVEKAR C, VAIKUNTANATHAN V, CHANDRAKASAN A. GAZELLE: A low latency framework for secure neural network inference[C]// 27th USENIX security symposium (USENIX security 18). 2018: 1651-1669.
[56]	MISHRA P, LEHMKUHL R, SRINIVASAN A, et al. Delphi: A Cryptographic Inference Service for Neural Networks[C]// USENIX Security Symposium, 2020: 2505-2522.
[57]	RATHEE D, RATHEE M, KUMAR N, et al. Cryptflow2: Practical 2-party secure inference[C]// Proceedings of the 2020 ACM SIGSAC Conference on Computer and Communications Security. 2020: 325-342.
[58]	BAI J, SONG X, ZHANG X, et al. Mostree: malicious secure private decision tree evaluation with sublinear communication[C]// Proceedings of the 39th Annual Computer Security Applications Conference. 2023: 799-813.
[59]	HAZAY C, ISHAI Y, MARCEDONE A, et al. LevioSA: Lightweight secure arithmetic computation[C]// Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security. 2019: 327-344.
[60]	LEHMKUHL R, MISHRA P, SRINIVASAN A, et al. Muse: Secure inference resilient to malicious clients[C]// 30th USENIX Security Symposium (USENIX Security 21). 2021: 2201-2218.
[61]	CHANDRAN N, GUPTA D, OBBATTU S L B, et al. SIMC: ML inference secure against malicious clients at semi-honest cost[C]// 31st USENIX Security Symposium (USENIX Security 22). 2022: 1361-1378.
[62]	DONG C, WENG J, LIU J N, et al. Fusion: Efficient and secure inference resilient to malicious servers[C]// Proceedings 2023 Network and Distributed System Security Symposium, 2023. DOI:10.14722/ndss.2023.23199.
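[63]	JIA Xuan, BAI Yuzhen, MA Zhihua. A survey of application scenarios of privacy-preserving computation[J]. Information and Communications Technology and Policy, 2022, (5): 45-52. (in Chinese)
[64]	HU Hao. Development of the privacy-preserving computation industry and its applications in the financial sector[J]. The Chinese Banker, 2023, (3): 108-110. (in Chinese)
[65]	ZHENG Hao. An analysis of the application of privacy-preserving computation in data fusion scenarios in the financial industry[J]. Financial Computer of China, 2022, (6): 90-91. (in Chinese)
[66]	MA Long, CHEN Yibo. Technology-based governance: the value and path of privacy-preserving computation in empowering open government data[J]. Chinese Public Administration, 2023, 39(9): 105-113. (in Chinese)
[67]	FAN Hang, XU Wei, FAN Xiaoyu, et al. Application analysis and prospects of privacy-preserving computation in new power systems[J]. Automation of Electric Power Systems, 2023, 47(19): 187-199. (in Chinese)
[68]	XIAO Xia. Research on drug-drug interaction prediction methods based on privacy-preserving computation[D]. Hunan University, 2023. (in Chinese)
[69]	XIN Junyi, CHEN Rufan, WANG Lin, et al. Privacy-preserving computation in biomedical big data[J]. Journal of Medical Informatics, 2022, 43(10): 2-7. (in Chinese)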

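Scheme	Machine learning method	Cryptographic techniques	Computation time	Communication overhead	Extensible to the malicious security assumption
[39]	Ridge regression	GC+HE	Very high	Very high	Yes
[40]	Linear regression	GC+HE	Very high	—	Yes
[41]	Support vector machine	HE	Relatively high	—
[42]	Support vector machine	HE	Relatively high	—
[43]	Naive Bayes classification	HE	Relatively low	—
[44]	Hyperplane decision, naive Bayes, decision tree	HE	Relatively high	Relatively high
[45]	Decision tree, random forest	HE+OT	Relatively low	Relatively low	Yes
[46]	Random forest	HE	Relatively high	Relatively high
[47]	Decision tree	SS+OT	Relatively low	Relatively high
[48]	Decision tree	HE/GC/OT	—	—
[49]	Decision tree	SS+OT+GC	Relatively low	Relatively high
[50]	Decision tree	SS+OT+GC	Relatively low	Relatively high	Yes
[4]	Decision tree	HE	Relatively high	Relatively low
[5]	Decision tree	OT+SS	Relatively low	Relatively low
Note: GC = garbled circuits; HE = homomorphic encryption; SS = secret sharing; OT = oblivious transfer. "Yes" corresponds to a check mark in the original table; a blank cell means no mark was given.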

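Scheme	Neural network scale	Dataset	Cryptographic techniques	Communication overhead	Computation time
crypto-nets [51]	Small	—	Leveled-FHE	—	—
CryptoNets [52]	Small	MNIST	Leveled-FHE	Very high	Very high
MiniONN [53]	Small	MNIST	HE+GC+SS	Relatively high	Relatively high
Chameleon [54]	Small	MNIST, CIFAR-10	GC+SS	Relatively high	Relatively high
Gazelle [55]	Small	MNIST, CIFAR-10	HE+GC+SS	Relatively low	Relatively low
Delphi [56]	ResNet32	CIFAR-10, CIFAR-100	HE+GC+SS+OT	Relatively low	Relatively low
CrypTFlow2 [57]	SqNet, RN50, DNet121	ImageNet	SS+HE+OT	Relatively low	Relatively low
Cheetah [9]	SqNet, RN50, DNet121	ImageNet	SS+HE+OT	Relatively low	Relatively low
Note: Leveled-FHE = leveled fully homomorphic encryption; SqNet = SqueezeNet; RN50 = ResNet50; DNet121 = DenseNet121.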
