Frontiers of Data and Computing ›› 2025, Vol. 7 ›› Issue (3): 48-66.

CSTR: 32002.14.jfdc.CN10-1649/TP.2025.03.005

doi: 10.11871/jfdc.issn.2096-742X.2025.03.005

• Special Issue: 30th Anniversary of the Computer Network Information Center, Chinese Academy of Sciences •

A Review of the Research and Application of Privacy Risk Assessment Techniques for Generative Models Based on Inference Attacks

ZHANG Ninghui1,2, LONG Chun1,*, WAN Wei1, LI Jing1, YANG Fan1, WEI Jinxia1, FU Yuhao1

  1. Computer Network Information Center, Chinese Academy of Sciences, Beijing 100083, China
  2. University of Chinese Academy of Sciences, Beijing 100190, China
  • Received: 2025-04-28 Online: 2025-06-20 Published: 2025-06-25

Abstract:

[Objective] This paper systematically reviews the research progress and application status of privacy risk assessment techniques for generative models based on inference attacks. [Literature Scope] The review covers more than 70 publications from mainstream conferences and journals between 2015 and 2024. [Methods] On the technical dimension, attacks are classified primarily by whether they assume black-box or white-box access; under each assumption, the attack methods are further categorized and summarized by type of generative model. On the application dimension, the focus is a comparison of privacy risk assessment frameworks for synthetic data. [Results] Existing research on attack techniques is relatively complete. However, the attacks are tightly coupled to the model type and their accuracy is limited in black-box scenarios, which in turn limits the universality and accuracy of privacy risk assessment frameworks for synthetic data in practical applications. [Conclusion] Compared with existing reviews in the same research direction, this paper is the first to summarize the latest results on membership inference attacks against large language models, and it also provides a comparative analysis of the latest privacy risk assessment frameworks for synthetic data. By summarizing and analyzing both the technical and application dimensions, it offers valuable reference and guidance for researchers in this direction.

Key words: generative model, membership inference attack, attribute inference attack, privacy risk assessment