Author: | Shang, Jingxiang |
Title: | Attention-guided adversarial example attack on image classification |
Degree: | M.Sc. |
Year: | 2023 |
Department: | Department of Electrical and Electronic Engineering |
Pages: | 51 pages : color illustrations |
Language: | English |
Abstract: | In recent years, deep learning has gradually penetrated scenarios such as the military, identity recognition, medical care, and autonomous driving, and has become the standard technology for solving computer vision tasks. However, recent research shows that by adding carefully designed, slight perturbations to an input image, a deep neural network (DNN) can be made to give a wrong answer with very high confidence. This problem exposes the security risks of deep learning models in practical applications. Image adversarial attacks aim to evaluate the vulnerability of DNNs and guide the construction of more secure network models. Many scholars have studied image adversarial attacks to enhance model robustness, mostly through global perturbations of the input image. However, both the interpretability and the attention mechanisms of deep learning indicate that a DNN focuses only on specific areas of an image when making decisions, which means that global perturbations introduce redundant noise into the generated adversarial examples and thereby degrade their quality. Meanwhile, the transferability of existing methods is limited. To address these problems, this dissertation improves adversarial attacks for image classification. The specific research contents are as follows. 1) A hard-label black-box attack method based on gradient-free optimization is proposed. Existing hard-label black-box attack methods require a large number of queries for two reasons: a) the randomly generated global initial perturbation ignores the different contributions of image regions to the classification result, and b) many queries are wasted on gradient estimation. The proposed method therefore first uses the interpretability theory of neural networks, combining Grad-CAM and the two-dimensional discrete wavelet transform (2D-DWT), to generate initial perturbations closer to the decision boundary. Second, by redefining the distance between an adversarial example and the decision boundary, the black-box attack is transformed into an optimization problem solved with gradient-free optimization, which effectively reduces the number of queries to the original model. In experiments, the algorithm achieves an attack success rate of 85% on ImageNet with a reduced number of queries, outperforming the baseline models on both metrics. 2) A high-transferability attack method (Edge Attack) based on the attention mechanism is proposed. Because most existing methods attack every pixel of the image indiscriminately, they ignore the significant contribution of key image regions to the classification result, and their transferability is low. Edge Attack therefore combines the channel and spatial attention information of the model to preferentially destroy the regions (objects and their contours) that have the greatest impact on the classification result, and it exploits the high similarity of attention across models to achieve high transferability. Experiments show that the white-box attack success rate of this method is close to 100%, on par with C&W, currently the strongest white-box attack, while its transferability between different models exceeds 70%, also better than the baseline methods. (Illustrative code sketches of both ideas are given after the access details below.) |
Rights: | All rights reserved |
Access: | restricted access |
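
As a rough illustration of the first contribution described in the abstract, the following is a minimal sketch of an attention-weighted, low-frequency initial perturbation, assuming PyTorch and PyWavelets. Because Grad-CAM needs gradients, the sketch assumes a white-box surrogate model (the thesis targets a hard-label black-box setting, so this is only an approximation), and names such as `grad_cam`, `initial_perturbation`, `last_conv`, and `epsilon` are illustrative, not the dissertation's actual implementation.

```python
# Sketch: Grad-CAM-weighted, low-frequency (2D-DWT) initial perturbation.
# Assumptions: PyTorch, PyWavelets, a surrogate model, and its last conv layer.
import numpy as np
import pywt
import torch
import torch.nn.functional as F

def grad_cam(model, last_conv, x, class_idx):
    """Standard Grad-CAM heat map for one image tensor x of shape (1, 3, H, W)."""
    feats, grads = [], []
    h1 = last_conv.register_forward_hook(lambda m, i, o: feats.append(o))
    h2 = last_conv.register_full_backward_hook(lambda m, gi, go: grads.append(go[0]))
    model.zero_grad()
    score = model(x)[0, class_idx]
    score.backward()
    h1.remove(); h2.remove()
    w = grads[0].mean(dim=(2, 3), keepdim=True)            # channel importance weights
    cam = F.relu((w * feats[0]).sum(dim=1, keepdim=True))  # weighted sum of feature maps
    cam = F.interpolate(cam, size=x.shape[-2:], mode="bilinear", align_corners=False)
    cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)
    return cam.squeeze().detach().cpu().numpy()            # (H, W) map in [0, 1]

def initial_perturbation(x_np, cam, epsilon=0.1):
    """Random noise kept only in the low-frequency DWT band, scaled by the Grad-CAM map."""
    noise = np.random.randn(*x_np.shape).astype(np.float32)  # x_np: (3, H, W)
    filtered = np.empty_like(noise)
    for c in range(noise.shape[0]):                         # per colour channel
        ll, (lh, hl, hh) = pywt.dwt2(noise[c], "haar")
        # keep only the approximation (LL) band; detail bands are treated as zeros
        rec = pywt.idwt2((ll, (None, None, None)), "haar")
        filtered[c] = rec[: noise.shape[1], : noise.shape[2]]
    return epsilon * cam[None, :, :] * filtered             # concentrate noise on salient regions
```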
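
The second contribution can similarly be sketched as an attention-masked iterative FGSM: the update is weighted toward the salient object and its contour. The Sobel-based contour term, the combination rule, and all hyper-parameters below are assumptions for illustration and do not reproduce the thesis's Edge Attack; `attention_mask` could, for example, come from the `grad_cam()` helper above.

```python
# Sketch: attention- and contour-weighted iterative FGSM ("Edge Attack"-style idea).
import torch
import torch.nn.functional as F

def sobel_edges(mask):
    """Edge magnitude of a (H, W) attention map, normalised to [0, 1]."""
    kx = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]]).view(1, 1, 3, 3)
    ky = kx.transpose(2, 3)
    m = mask.view(1, 1, *mask.shape)
    g = (F.conv2d(m, kx, padding=1) ** 2 + F.conv2d(m, ky, padding=1) ** 2).sqrt().squeeze()
    return g / (g.max() + 1e-8)

def masked_ifgsm(model, x, label, attention_mask, eps=8 / 255, alpha=2 / 255, steps=10):
    """Iterative FGSM whose step is spatially weighted by object-plus-contour attention."""
    weight = torch.clamp(attention_mask + sobel_edges(attention_mask), 0.0, 1.0)
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), label)
        grad = torch.autograd.grad(loss, x_adv)[0]
        # larger perturbation steps on salient regions and their contours
        x_adv = x_adv.detach() + alpha * weight * grad.sign()
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0.0, 1.0)
    return x_adv
```

In an actual evaluation one would check misclassification of `x_adv` on the source model and then feed it to other models to measure transferability, as described in the abstract.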
Files in This Item:
File | Description | Size | Format
---|---|---|---
8269.pdf | For All Users (off-campus access for PolyU Staff & Students only) | 2.49 MB | Adobe PDF
Copyright Undertaking
As a bona fide Library user, I declare that:
- I will abide by the rules and legal ordinances governing copyright regarding the use of the Database.
- I will use the Database for the purpose of my research or private study only and not for circulation or further reproduction or any other purpose.
- I agree to indemnify and hold the University harmless from and against any loss, damage, cost, liability or expenses arising from copyright infringement or unauthorized usage.
By downloading any item(s) listed above, you acknowledge that you have read and understood the copyright undertaking as stated above, and agree to be bound by all of its terms.
Please use this identifier to cite or link to this item:
https://theses.lib.polyu.edu.hk/handle/200/13859