PROJECT 2

Advancing Continual and Federated Learning with Self and Mixed Supervision

Project Leader: Salman Avestimehr, Professor in the ECE and CS Departments, Director of the USC-Amazon Center on Trusted AI

Abstract: Federated learning (FL) is an efficient framework for distributed machine learning when data cannot be shared with a centralized server. Recent advances in FL often assume that clients possess data annotations and can therefore participate in supervised training. However, data annotation is resource-intensive: it requires human annotators and considerable time. In practice, not all clients can provide labeled data. For instance, hospitals in rural areas may hold valuable data but may be unable to afford expert radiologists to annotate their medical images. Excluding such clients from training would leave their data underutilized. Hence, our primary focus is the challenge of label unavailability in FL. More specifically, we study the following research problem: how can a server leverage unannotated clients/silos, which have no labeled data, along with a few labeled clients/silos in a realistic non-independent and identically distributed (non-IID) FL regime to improve global model performance (as depicted in Figure (a))? Given the heterogeneity of data distributions among clients (as illustrated in Figure (b)), combining state-of-the-art semi-supervised techniques with FL is challenging. The domain of the labeled clients may differ from that of the unlabeled clients, leading to noisy or inaccurate soft labels during training on the unlabeled clients. Additionally, the source-domain data held by labeled clients is inaccessible to unlabeled clients, which makes unsupervised domain adaptation difficult. To address these issues, we aim to leverage unsupervised learning-based personalization methods in this semi-supervised FL setup to achieve source-free domain adaptation for unlabeled clients. We plan to evaluate the proposed method on naturally partitioned federated computer vision datasets.
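
To make the setting concrete, the following is a minimal sketch of the naive combination that the abstract describes as challenging: an unlabeled client assigns itself soft labels using the current global model, and the server averages client models weighted by sample count. It is not the project's proposed method. The linear model, the function names (`pseudo_label`, `fed_avg`), and the confidence threshold of 0.9 are all assumptions chosen for illustration.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the class axis.
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def pseudo_label(weights, features, threshold=0.9):
    """Hypothetical helper: label unlabeled samples with a linear global model.

    Returns the predicted classes of the samples whose top soft-label
    probability reaches the threshold, plus the boolean mask of kept samples.
    Under domain shift these predictions can be confidently wrong, which is
    the noise problem described in the abstract.
    """
    probs = softmax(features @ weights)
    keep = probs.max(axis=1) >= threshold
    return probs.argmax(axis=1)[keep], keep

def fed_avg(client_weights, client_sizes):
    """Hypothetical helper: sample-size-weighted average of client weights."""
    sizes = np.asarray(client_sizes, dtype=float)
    stacked = np.stack(client_weights)
    # Contract the client axis against the normalized sample counts.
    return np.tensordot(sizes / sizes.sum(), stacked, axes=1)

if __name__ == "__main__":
    global_weights = 10.0 * np.eye(2)        # toy 2-feature, 2-class model
    unlabeled = np.array([[1.0, 0.0],        # clearly class 0
                          [0.5, 0.5]])       # ambiguous sample
    labels, kept = pseudo_label(global_weights, unlabeled)
    print(labels, kept)
    print(fed_avg([np.zeros((2, 2)), np.ones((2, 2))], [1, 3]))
```

In this toy run the ambiguous sample falls below the threshold and is discarded. A fixed threshold does not help when the global model is confidently wrong on a client from a different domain, which is one reason to adapt the model on each unlabeled client instead.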
