Abstract
Standard centralized machine learning applications require the participants to upload
their personal data to a central cloud for model training, which significantly compromises
the users' privacy. Federated learning is an emerging privacy-preserving machine
learning paradigm proposed to alleviate this issue. It is inherently a distributed machine
learning framework that enables multiple users to collaboratively train a global model
without sharing their local training data, thus preventing individuals' data from being
revealed.
However, federated learning often consumes far more communication resources than
centralized learning, since model parameters must be repeatedly uploaded and downloaded between the clients and the server during training. To mitigate this issue,
we propose a multi-objective evolutionary federated learning framework that reduces the
communicated model size without a noticeable performance drop, thereby lowering the communication costs of federated learning. In addition, a modified sparse evolutionary training (SET) algorithm is
adopted to improve the encoding scalability and further reduce the model size. This
approach is intrinsically an offline evolutionary optimization framework, however: the models re-initialized at each generation suffer a dramatic performance degradation, which makes
offline optimization infeasible for real-time federated learning systems.
Therefore, we extend our previous offline method into a novel real-time federated evolutionary neural architecture search (NAS) framework named RT-FedEvoNAS.
By real-time (online) federated evolutionary NAS, we mean that the neural network
models are already in use during the search process and that, with the help of the
proposed double-sampling strategy, each client trains only one sub-model per generation. In addition, all the searched neural network models of the final generation are
already well trained and need not be re-trained from scratch, keeping the
communication costs minimal.
Another challenge is that federated learning cannot fully guarantee local privacy:
the training data are still at risk of being disclosed through the uploaded model parameters. To address this concern, homomorphic encryption (HE) is commonly
applied in federated learning by encrypting the model parameters on each client before
sending them to the server. However, most HE-based FL systems are not efficient enough and require a trusted third party for key-pair generation. Therefore, we propose
a distributed additive encryption and quantization federated deep learning framework,
in which the key pairs are generated jointly by the server and the clients without the help of
an extra trusted third party. In addition, a ternary gradient quantization and an aggregation approximation strategy are adopted to simultaneously reduce the communication
and local computational costs.
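The ternary gradient quantization mentioned above can be illustrated with a minimal TernGrad-style sketch. This is an editor's illustration under assumptions, not the thesis implementation: the function name and the unbiased stochastic-rounding variant are assumed, and a real system would quantize per layer and pack the ternary values into two bits each.

```python
import numpy as np

def ternary_quantize(grad, rng=None):
    """Stochastically map each gradient entry to {-s, 0, +s}, where
    s = max|grad|, so that the result equals grad in expectation."""
    rng = np.random.default_rng() if rng is None else rng
    s = np.max(np.abs(grad))
    if s == 0.0:
        return np.zeros_like(grad)
    # Keep entry i with probability |g_i| / s; otherwise send 0.
    keep = rng.random(grad.shape) < np.abs(grad) / s
    return s * np.sign(grad) * keep

g = np.array([0.9, -0.2, 0.05, -0.9])
q = ternary_quantize(g, np.random.default_rng(1))
# Every entry of q lies in {-0.9, 0.0, +0.9}; only the scalar s and
# the ternary signs need to be communicated to the server.
```

Because each client uploads one float plus two bits per parameter instead of a full 32-bit value, the upload volume shrinks by roughly 16x at the cost of added gradient variance.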
Different from the aforementioned methods, which focus on training parametric models in horizontal federated learning, our last work addresses learning non-parametric models in
vertical federated learning. Non-parametric models such as gradient boosting decision trees
(GBDTs) are commonly used in previous work on federated learning over vertically
partitioned data. However, these approaches assume that all the training data labels are
stored on a single (guest) client, which rarely holds in real-world applications.
Therefore, we propose a secure vertical federated learning framework that trains GBDTs
with data labels distributed over multiple devices. A novel secure protocol is proposed
that designates a source client and a split client for each node split, thus preventing both the data features and the labels from being disclosed. Moreover, a partial differential privacy scheme
is introduced that adds Gaussian noise to the leaf weights before sending them to the
source clients, protecting the predictions.
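The noise addition on the leaf weights follows the standard Gaussian mechanism of differential privacy; a minimal sketch is given below. The helper name, its signature, and the use of the classical (epsilon, delta) calibration are the editor's assumptions for illustration, not the exact scheme of the thesis.

```python
import numpy as np

def privatize_leaf_weights(weights, sensitivity, epsilon, delta, rng=None):
    """Gaussian mechanism: perturb GBDT leaf weights before they leave
    the client, giving (epsilon, delta)-differential privacy for the
    released values (hypothetical helper, not the thesis code)."""
    rng = np.random.default_rng() if rng is None else rng
    # Classical calibration of the noise scale to the L2 sensitivity.
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    return weights + rng.normal(0.0, sigma, size=np.shape(weights))

leaf_weights = np.array([0.31, -0.12, 0.07])
noisy = privatize_leaf_weights(leaf_weights, sensitivity=1.0,
                               epsilon=1.0, delta=1e-5,
                               rng=np.random.default_rng(0))
```

The source clients then aggregate the noisy leaf weights for prediction, so an honest-but-curious recipient cannot recover the exact label statistics behind any single leaf.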
All the proposed methods are empirically validated on both benchmark datasets and
real-world datasets, and the experimental results show that the approaches introduced
in this thesis are effective for building a secure and efficient federated learning system.