Author: | Zhang, Rui |
Title: | Unveiling the shadows : investigating privacy risks of gradient leakage in federated learning |
Advisors: | Li, Ping (COMP); Guo, Song (COMP) |
Degree: | Ph.D. |
Year: | 2025 |
Department: | Department of Computing |
Pages: | xvii, 132 pages : color illustrations |
Language: | English |
Abstract: | Driven by growing concerns about data privacy and the need for powerful machine learning models trained on diverse datasets, Federated Learning (FL) has emerged as a promising solution. This decentralized approach enables the training of global models without compromising individual user privacy. In FL, participating clients collaboratively train a shared model by uploading gradients calculated on their local private data, so the raw data remains confined to the clients' devices. However, this seemingly secure system is not impervious to attack. Adversaries can exploit the shared gradients through Gradient Inversion Attacks (GIAs) to disclose sensitive information about the training data, thereby threatening the privacy of FL clients.

Traditional GIAs employ an optimization-based approach to recover the private training data used in FL. The process begins by creating synthetic data points, with both the input features and the corresponding labels sampled from a Gaussian distribution. The attacker feeds this synthetic data into the shared FL model and computes the gradients it induces on the model's parameters. Through iterative optimization, the synthetic data is continuously refined: the distance between the gradients produced by the synthetic data and the actual gradients obtained from FL clients measures reconstruction quality and guides adjustments that minimize the difference. When the synthetic and actual gradients become nearly indistinguishable, the reconstructed synthetic data approximates the private training data used in the FL collaboration.

Building upon existing research on GIAs, this thesis examines the limitations of current methods from an adversarial perspective. We identify three critical challenges hindering the development of effective and efficient data reconstruction mechanisms.
(1) Label Information Leakage: Access to accurate labels is essential for reconstructing training samples. Existing methods either assume adversaries possess this knowledge or rely on restrictive assumptions (e.g., non-negative activations) to circumvent the issue.
(2) Layer-wise Gradient Exploitation: While some studies utilize gradients from fully connected layers for direct input recovery, the broader potential of exploiting layer-wise gradient relationships remains largely unexplored. Deeper exploration in this area is crucial for advancing attack effectiveness.
(3) Targeted Data Reconstruction: Most GIAs focus on reconstructing comprehensive data from deep networks and large batches. The threat escalates when an adversary can selectively reconstruct specific data samples from gradients; this precision enables more targeted privacy leakage and increases the potential for misuse.

In this thesis, we aim to advance the field of FL security and privacy by developing novel gradient-induced attacks that address the aforementioned challenges. Our main contributions are threefold, detailed in the following three parts.

In the first part, we establish the fundamental connection between training labels and gradients. Building upon this discovery, we introduce a versatile framework for label recovery applicable to diverse classification tasks. Our framework leverages the posterior probabilities produced by functions such as Sigmoid or Softmax to infer labels in binary, multi-class, and even imbalanced scenarios.
In particular, exploiting the globally shared model in FL, an adversary can estimate these posterior probabilities for the training samples using auxiliary data. By incorporating these estimates into the derived relationship between labels and gradients, we can effectively recover the batch training labels from the shared gradients.

In the second part, we propose the Gradient Bridge (GDBR) attack to expose privacy leakage through correlated layer-wise gradients in specific gradient-sharing scenarios. GDBR begins by deriving theoretical relationships between gradients across different layer types: input-output and weight-output relationships for fully connected and convolutional layers, respectively, and output relationships with respect to activation functions. By meticulously tracking the gradient flow across these layers, we formulate a recursive procedure that reconstructs the gradient of the model's output logits. By associating the reconstructed logit gradients with observable variables (e.g., hidden features, Softmax probabilities), GDBR can recover label information from training samples even when only the gradients of a single layer at the bottom of the model are shared.

In the third part, we focus on reconstructing training data from sensitive or specified classes by devising a targeted attack called Gradient Filtering (GradFilt). GradFilt operates under the assumption of a malicious adversary capable of manipulating both the parameters and the structure of the white-box FL model. The attack commences by strategically modifying the weights and biases of the final fully connected layer, effectively zeroing out gradients corresponding to non-target data while preserving those associated with the target class. This grants GradFilt control over the model's output probabilities, enabling it to recover labels for the entire batch and determine the number of instances in the target category. Finally, GradFilt reconstructs the target data by applying an optimization-based or analytical approach, depending on the number of target samples included in the training batch.

In summary, this thesis illuminates the inherent vulnerabilities within gradient-sharing mechanisms, underscoring the critical need for ongoing vigilance in protecting client privacy within FL systems. We firmly believe that our findings will significantly contribute to the development of more robust and trustworthy FL solutions, ultimately unlocking the full potential of collaborative machine learning across a diverse range of applications. |
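To make the optimization-based procedure described in the abstract concrete, below is a minimal, illustrative sketch of a DLG-style gradient inversion loop in PyTorch. It is not the thesis's implementation; the model, the `shared_grads` tuple (the parameter gradients uploaded by a client), and all hyperparameters are assumptions made for illustration.

```python
import torch
import torch.nn.functional as F

def gradient_inversion(model, shared_grads, input_shape, num_classes,
                       steps=300, lr=0.1):
    """Illustrative DLG-style loop: refine Gaussian-initialised synthetic data
    until the gradients it induces match the gradients shared by an FL client."""
    x_syn = torch.randn(input_shape, requires_grad=True)                  # synthetic inputs
    y_syn = torch.randn(input_shape[0], num_classes, requires_grad=True)  # soft labels
    optimizer = torch.optim.Adam([x_syn, y_syn], lr=lr)
    params = [p for p in model.parameters() if p.requires_grad]

    for _ in range(steps):
        optimizer.zero_grad()
        logits = model(x_syn)
        loss = F.cross_entropy(logits, F.softmax(y_syn, dim=-1))
        # Gradients the synthetic batch induces on the shared model's parameters.
        syn_grads = torch.autograd.grad(loss, params, create_graph=True)
        # Gradient-matching distance guides the refinement of the synthetic data.
        dist = sum(((sg - cg) ** 2).sum() for sg, cg in zip(syn_grads, shared_grads))
        dist.backward()
        optimizer.step()
    return x_syn.detach(), F.softmax(y_syn, dim=-1).detach()
```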
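The link between labels and gradients exploited in the first part can be illustrated with a standard identity: for a softmax classifier trained with batch-averaged cross-entropy, the gradient of the loss with respect to the final layer's bias equals the mean over the batch of (posterior minus one-hot label). If the mean posterior is approximated with auxiliary data passed through the shared model, per-class label counts follow directly. The sketch below shows only this identity under those assumptions (`bias_grad`, `aux_loader`, and the batch-mean reduction are hypothetical inputs), not the thesis's full recovery framework.

```python
import torch
import torch.nn.functional as F

def estimate_label_counts(bias_grad, model, aux_loader, batch_size):
    """Estimate per-class label counts in a client's batch from the gradient of
    the final-layer bias, using grad_b = mean_i(p_i - y_i) for softmax + CE loss.
    `aux_loader` yields auxiliary inputs used to approximate the mean posterior."""
    probs = []
    with torch.no_grad():
        for x_aux, _ in aux_loader:
            probs.append(F.softmax(model(x_aux), dim=-1))
    mean_posterior = torch.cat(probs).mean(dim=0)   # approximates mean_i p_i
    # grad_b = mean_i(p_i) - mean_i(y_i)  =>  counts = B * (mean_posterior - grad_b)
    counts = batch_size * (mean_posterior - bias_grad)
    return counts.round().clamp(min=0).long()
```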
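Finally, the "analytical approach" mentioned for the targeted setting can be illustrated with a well-known observation from the GIA literature: for a fully connected layer z = Wx + b with a scalar loss, the weight gradient is the outer product of the logit gradient and the input, so when the shared gradient is dominated by a single sample the layer's input can be read off directly from the weight and bias gradients. The helper below is a hypothetical illustration of that step only, not GradFilt itself.

```python
import torch

def recover_fc_input(weight_grad, bias_grad, eps=1e-8):
    """For z = W x + b and scalar loss L: dL/dW = (dL/dz) x^T and dL/db = dL/dz,
    so for a single effective sample x = (dL/dW)[j] / (dL/db)[j] for any row j
    with a non-negligible bias gradient (illustrative only)."""
    j = torch.argmax(bias_grad.abs())        # pick the most informative row
    if bias_grad[j].abs() < eps:
        raise ValueError("bias gradient too small to recover the input")
    return weight_grad[j] / bias_grad[j]
```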
Rights: | All rights reserved |
Access: | open access |
Please use this identifier to cite or link to this item:
https://theses.lib.polyu.edu.hk/handle/200/13803