Author: Li, Zijian
Title: Federated learning with GAN-based data synthesis for non-IID clients
Advisors: Zhang, Jun (EIE)
Mao, Yuyi (EIE)
Degree: M.Sc.
Year: 2022
Subject: Machine learning
Federated database systems
Application software
Hong Kong Polytechnic University -- Dissertations
Department: Department of Electronic and Information Engineering
Pages: iv, 28 pages : color illustrations
Language: English
Abstract: Federated learning (FL) has recently emerged as a popular privacy-preserving collaborative learning paradigm. However, its performance suffers when the data across clients are non-IID (not independent and identically distributed). In this paper, we propose a novel framework, namely Synthetic Data Aided Federated Learning (SDA-FL), to resolve the non-IID issue by sharing differentially private synthetic data. Specifically, each client pretrains a local generative adversarial network (GAN) to generate synthetic data, which are uploaded to the parameter server (PS) to construct a global shared synthetic dataset. Before each global aggregation, the PS generates and updates high-quality labels for the global dataset via pseudo labeling with a confidence threshold. Combining the local private dataset with the labeled synthetic dataset leads to nearly identical data distributions among clients, which improves the consistency among local models and benefits the global aggregation. To ensure privacy, the local GANs are trained with differential privacy by adding artificial noise to the local model gradients before being uploaded to the PS. Extensive experiments show that the proposed framework outperforms the baseline methods by a large margin on several benchmark datasets under both the supervised and semi-supervised settings.
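The pseudo-labeling step mentioned in the abstract can be illustrated with a minimal sketch. It is not the dissertation's implementation: the function name `pseudo_label`, the threshold value 0.9, and the example probabilities are all hypothetical, and the abstract does not say which model produces the class probabilities. The sketch only shows the general idea of keeping a predicted label when the top class probability reaches a confidence threshold.

```python
from typing import List, Optional, Sequence


def pseudo_label(
    probs: Sequence[Sequence[float]], threshold: float = 0.9
) -> List[Optional[int]]:
    """Assign a label to each sample only if the prediction is confident.

    probs: one class-probability vector per synthetic sample (assumed to
           come from some classifier; the source does not specify which).
    threshold: hypothetical confidence cutoff, not a value from the source.
    Returns the predicted class index, or None for low-confidence samples.
    """
    labels: List[Optional[int]] = []
    for p in probs:
        # Index of the most probable class for this sample.
        best = max(range(len(p)), key=lambda k: p[k])
        # Keep the label only when its probability reaches the threshold.
        labels.append(best if p[best] >= threshold else None)
    return labels


# Hypothetical softmax outputs for three synthetic samples, three classes.
example_probs = [
    [0.95, 0.03, 0.02],  # confident: class 0
    [0.40, 0.35, 0.25],  # not confident: left unlabeled
    [0.05, 0.05, 0.90],  # exactly at the threshold: class 2
]
example_labels = pseudo_label(example_probs, threshold=0.9)
print(example_labels)
```

In this sketch, unlabeled samples are marked `None` so a caller can drop them or revisit them after the next labeling round; how SDA-FL treats such samples is not stated in the abstract.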
