[1] |
ZHANG J, YI Q, SANG J. Towards adversarial attack on vision-language pre-training models[C]// Proceedings of the ACM International Conference on Multimedia, 2022: 5005-5013.
|
[2] |
LU D, WANG Z, WANG T, et al. Set-level guidance attack: Boosting adversarial transferability of vision-language pre-training models[C]// Proceedings of the IEEE International Conference on Computer Vision, 2023: 102-111.
|
[3] |
REN S, HE K, GIRSHICK R, et al. Faster R-CNN: Towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2016, 39(6): 1137-1149.
|
[4] |
JIANG H Z, MISRA I, ROHRBACH M, et al. In defense of grid features for visual question answering[C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020: 10267-10276.
|
[5] |
ZHOU L, PALANGI H, ZHANG L, et al. Unified vision-language pre-training for image captioning and VQA[C]// Proceedings of the AAAI Conference on Artificial Intelligence, 2020, 34(7): 13041-13049.
|
[6] |
DOU Z Y, XU Y, GAN Z, et al. An empirical study of training end-to-end vision-and-language transformers[C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2022: 18166-18176.
|
[7] |
TAYLOR W L. “Cloze procedure”: A new tool for measuring readability[J]. Journalism Quarterly, 1953, 30(4): 415-433.
|
[8] |
CHEN Y C, LI L, YU L, et al. UNITER: Universal image-text representation learning[C]// Proceedings of the European Conference on Computer Vision, 2020: 104-120.
|
[9] |
LI J, SELVARAJU R, GOTMARE A, et al. Align before fuse: Vision and language representation learning with momentum distillation[J]. Advances in Neural Information Processing Systems, 2021, 34: 9694-9705.
|
[10] |
WANG Z, YU J, YU A W, et al. SimVLM: Simple visual language model pretraining with weak supervision[J]. arXiv preprint arXiv, 2021: 2108.10904.
|
[11] |
LIN T Y, MAIRE M, BELONGIE S, et al. Microsoft COCO: Common objects in context[C]// Proceedings of the European Conference on Computer Vision, 2014: 740-755.
|
[12] |
KRISHNA R, ZHU Y, GROTH O, et al. Visual genome: Connecting language and vision using crowdsourced dense image annotations[J]. International Journal of Computer Vision, 2017, 123: 32-73.
|
[13] |
SHARMA P, DING N, GOODMAN S, et al. Conceptual captions: A cleaned, hypernymed, image alt-text dataset for automatic image captioning[C]// Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2018: 2556-2565.
|
[14] |
CHANGPINGYO S, SHARMA P, DING N, et al. Conceptual 12m: Pushing web-scale image-text pre-training to recognize long-tail visual concepts[C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021: 3558-3568.
|
[15] |
KAY W, CARREIRA J, SIMONYAN K, et al. The kinetics human action video dataset[J]. arXiv preprint arXiv, 2017: 1705.06950.
|
[16] |
LU J, BATRA D, PARIKH D, et al. ViLBERT: Pretraining task-agnostic visiolinguistic representations for vision-and-language tasks[J]. Advances in Neural Information Processing Systems, 2019, 32.
|
[17] |
LI X J, YIN X, LI C Y, et al. OSCAR: Object-semantics aligned pretraining for vision-language tasks[C]// Proceedings of the European Conference on Computer Vision, 2020: 121-137.
|
[18] |
SZEGEDY C, ZAREMBA W, SUTSKEVER I, et al. Intriguing properties of neural networks[J]. arXiv preprint arXiv, 2013: 1312.6199.
|
[19] |
DONG Y, PANG T, SU H, et al. Evading defenses to transferable adversarial examples by translation-invariant attacks[C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019: 4312-4321.
|
[20] |
LIN J, SONG C, HE K, et al. Nesterov accelerated gradient and scale invariance for adversarial attacks[J]. arXiv preprint arXiv, 2019: 1908.06281.
|
[21] |
WU W B, SU Y X, LYU M R, et al. Improving the transferability of adversarial samples with adversarial transformations[C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021: 9024-9033.
|
[22] |
LI Y, BAI S, XIE C, et al. Regional homogeneity: Towards learning transferable universal adversarial perturbations against defenses[C]// Proceedings of the European Conference on Computer Vision, 2020: 795-813.
|
[23] |
BYUN J, CHO S, KWON M J, et al. Improving the transferability of targeted adversarial examples through object-based diverse input[C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2022: 15244-15253.
|
[24] |
WANG X, HE X, WANG J, et al. Admix: Enhancing the transferability of adversarial attacks[C]// Proceedings of the IEEE International Conference on Computer Vision, 2021: 16158-16167.
|
[25] |
DONG Y, LIAO F, PANG T, et al. Boosting adversarial attacks with momentum[C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018: 9185-9193.
|
[26] |
XIONG Y, LIN J, ZHANG M, et al. Stochastic variance reduced ensemble adversarial attack for boosting the adversarial transferability[C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2022: 14983-14992.
|
[27] |
ZHU H, REN Y, SUI X, et al. Boosting adversarial transferability via gradient relevance attack[C]// Proceedings of the IEEE International Conference on Computer Vision, 2023: 4741-4750.
|
[28] |
LI M, DENG C, LI T, et al. Towards transferable targeted attack[C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2020: 641-649.
|
[29] |
QIN Z, FAN Y, LIU Y, et al. Boosting the transferability of adversarial attacks with reverse adversarial perturbation[J]. Advances in Neural Information Processing Systems, 2022, 35: 29845-29858.
|
[30] |
MA W, LI Y, JIA X, et al. Transferable adversarial attack for both vision transformers and convolutional networks via momentum integrated gradients[C]// Proceedings of the IEEE International Conference on Computer Vision, 2023: 4630-4639.
|
[31] |
ZHANG C, BENZ P, KARJAUV A, et al. Investigating top-k white-box and transferable black-box attack[C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2022: 15085-15094.
|
[32] |
XIAO Z, GAO X, FU C, et al. Improving transferability of adversarial patches on face recognition with generative models[C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021: 11845-11854.
|
[33] |
LI Q Z, GUO Y W, ZUO W M, et al. Making substitute models more bayesian can enhance transferability of adversarial examples[J]. arXiv preprint arXiv, 2023: 2302.05086.
|
[34] |
ZHAO Z, LIU Z, LARSON M. On success and simplicity: A second look at transferable targeted attacks[J]. Advances in Neural Information Processing Systems, 2021, 34: 6115-6128.
|
[35] |
FANG S, LI J, LIN X, et al. Learning to learn transferable attack[C]// Proceedings of the AAAI Conference on Artificial Intelligence, 2022, 36(1): 571-579.
|
[36] |
XU X, ZHANG J Y, MA E, et al. Adversarially robust models may not transfer better: Sufficient conditions for domain transferability from the view of regularization[C]// International Conference on Machine Learning, 2022: 24770-24802.
|
[37] |
QIAN Y, HE S, ZHAO C, et al. Lea2: A lightweight ensemble adversarial attack via non-overlapping vulnerable frequency regions[C]// Proceedings of the IEEE International Conference on Computer Vision, 2023: 4510-4521.
|
[38] |
CHEN B, YIN J, CHEN S, et al. An adaptive model ensemble adversarial attack for boosting adversarial transferability[C]// Proceedings of the IEEE International Conference on Computer Vision, 2023: 4489-4498.
|
[39] |
HUANG Q, KATSMAN I, HE H, et al. Enhancing adversarial example transferability with an intermediate level attack[C]// Proceedings of the IEEE International Conference on Computer Vision, 2019: 4733-4742.
|
[40] |
GUO Y, LI Q, CHEN H. Backpropagating linearly improves transferability of adversarial examples[J]. Advances in Neural Information Processing Systems, 2020, 33: 85-95.
|
[41] |
GUBRI M, CORDY M, PAPADAKIS M, et al. LGV: Boosting adversarial example transferability from large geometric vicinity[C]// Proceedings of the European Conference on Computer Vision, 2022: 603-618.
|
[42] |
POURSAEED O, KATSMAN I, GAO B, et al. Generative adversarial perturbations[C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018: 4422-4431.
|
[43] |
NASEER M, KHAN S, HAYAT M, et al. On generating transferable targeted perturbations[C]// Proceedings of the IEEE International Conference on Computer Vision, 2021: 7708-7717.
|
[44] |
KIM W J, HONG S, YOON S E. Diverse generative perturbations on attention space for transferable adversarial attacks[C]// IEEE International Conference on Image Processing, 2022: 281-285.
|
[45] |
FENG Y, WU B, FAN Y, et al. Boosting black-box attack with partially transferred conditional adversarial distribution[C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2022: 15095-15104.
|
[46] |
ZHAO A, CHU T, LIU Y, et al. Minimizing maximum model discrepancy for transferable black-box targeted attacks[C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2023: 8153-8162.
|
[47] |
YANG X, DONG Y, PANG T, et al. Boosting transferability of targeted adversarial examples via hierarchical generative networks[C]// Proceedings of the European Conference on Computer Vision, 2022: 725-742.
|
[48] |
PLUMMER B A, WANG L, CERVANTES C M, et al. Flickr30k entities: Collecting region-to-phrase correspondences for richer image-to-sentence models[C]// Proceedings of the IEEE International Conference on Computer Vision, 2015: 2641-2649.
|
[49] |
WANG K, HE X, WANG W, et al. Boosting adversarial transferability by block shuffle and rotation[C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024: 24336-24346.
|