Li Baoping, Xia Yuhao. Adversarial example detection method combining image generation and image reconstruction [J]. High Technology Letters (Chinese), 2025, 35(5): 490-501
Adversarial example detection method based on image generation and image reconstruction technology
|
DOI: 10.3772/j.issn.1002-0470.2025.05.005
Keywords: adversarial example; image classification; Swin-Transformer; image reconstruction; convolutional neural network (CNN)
Funding:
Authors: Li Baoping, Xia Yuhao (School of Physics and Electronic Information, Henan Polytechnic University, Jiaozuo 454000)
|
Abstract: Adversarial example attacks are among the main security threats to recognition networks. To address the low detection accuracy caused by blurred classification boundaries and the slow model convergence caused by training on large numbers of adversarial examples, this paper proposes an adversarial example detection method that combines image reconstruction and image generation. First, an image reconstruction network built from convolutional layers and Swin-Transformer blocks is designed to restore the semantic information of the input image and remove adversarial perturbations. Then, a conditional generative adversarial network generates an image of the class indicated by the classification label. Finally, the reconstructed image and the generated image are fed into a convolutional recognition network, and the consistency of their classification results determines whether the input is an adversarial example. The method recasts adversarial example detection as an image classification problem: no adversarial examples are needed for training, and no prior knowledge of the attacker's attack type or of the attacked model's structure and parameters is required. Experiments on the VGG-16, ResNet-18, and GoogLeNet classification networks with the MNIST and GTSRB data sets show that, compared with other classical detection methods, the proposed method improves average detection accuracy by 4.75%-22.86% and F1 score by 3.40%-13.64%, demonstrating its superiority.
|
|
|