PROJECT 1

Federated Learning for Natural Language Processing

Abstract: Increasing concerns and regulations about data privacy necessitate the study of privacy-preserving methods for natural language processing (NLP) applications. Federated learning (FL) offers a promising way for a large number of clients (e.g., personal devices or organizations) to collaboratively learn a shared global model that benefits all clients, while keeping each client's data stored locally. To facilitate FL research in NLP, we aim to develop FedNLP, a research toolkit that serves as both a standard benchmarking platform and a research library for developing new FL methods for NLP tasks, supporting popular Transformer-based models. On top of FedNLP, we will study two research questions: 1) analyzing and mitigating privacy leakage during FL for NLP models, and 2) unifying client models with heterogeneous architectures. Both questions are central to applying FL in realistic scenarios and to improving the trustworthiness of AI applications. We believe this research will open intriguing future directions for developing FL methods suited to NLP tasks. A sketch of the core training loop follows below.
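
Since the abstract centers on clients collaboratively learning a shared global model while keeping data local, the following minimal sketch of the standard federated averaging (FedAvg) aggregation step may make the setup concrete. This is an illustrative sketch, not FedNLP's actual API; the names fedavg, federated_round, and local_update are hypothetical helpers.

import numpy as np

def fedavg(client_params, client_sizes):
    """Average client model parameters, weighted by local dataset size."""
    total = sum(client_sizes)
    # Each entry of client_params is a list of numpy arrays (one per layer);
    # zip(*client_params) groups the clients' copies of each layer together.
    return [
        sum(w * (n / total) for w, n in zip(layer_ws, client_sizes))
        for layer_ws in zip(*client_params)
    ]

def federated_round(global_params, clients, local_update):
    """One communication round: broadcast the global model, let each client
    train locally on its private data, then aggregate the updates."""
    updates, sizes = [], []
    for client in clients:
        # local_update runs on-device; raw data never leaves the client.
        params, n_examples = local_update(global_params, client)
        updates.append(params)
        sizes.append(n_examples)
    return fedavg(updates, sizes)

# Example: two clients, a one-layer "model" as a single weight vector.
w_a = [np.array([1.0, 2.0])]  # client A's locally trained weights (100 examples)
w_b = [np.array([3.0, 4.0])]  # client B's locally trained weights (300 examples)
print(fedavg([w_a, w_b], [100, 300]))  # -> [array([2.5, 3.5])]

The weighting by dataset size means clients with more local examples contribute proportionally more to the shared model, which is the standard FedAvg design choice.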

 