Local Superior Soups: A Catalyst for Model Merging in Cross-Silo Federated Learning

By Minghui Chen, Meirui Jiang, Xin Zhang, Qi Dou, Zehua Wang, Xiaoxiao Li in NeurIPS

October 15, 2024


Published in: The Thirty-Eighth Conference on Neural Information Processing Systems (NeurIPS 2024)

Abstract

Federated learning (FL) is a learning paradigm that enables collaborative training of models on decentralized data. Recently, initializing FL models with pre-trained weights has been shown to improve performance. However, the growing size of modern pre-trained models sharply increases the communication cost of the many rounds needed to adapt them in FL. To reduce this cost and improve the performance of pre-trained model adaptation in FL, we propose an innovative model interpolation-based local training technique called "Local Superior Soups." Our method enhances local training across different clients, encouraging the exploration of a connected low-loss basin within a few communication rounds through regularized model interpolation. This approach acts as a catalyst for the seamless adaptation of pre-trained models in FL. We demonstrate its effectiveness and efficiency on diverse, widely used FL datasets.
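The abstract describes local training built on model interpolation, i.e., averaging models in weight space. The full algorithm is not given here, so the snippet below is only a minimal PyTorch sketch of the underlying interpolation step; the helper name `interpolate_state_dicts` and the uniform coefficients are illustrative assumptions, not the paper's Local Superior Soups procedure or its regularizers.

```python
from torch import nn

def interpolate_state_dicts(models, coeffs):
    """Hypothetical helper (not from the paper): average the parameters of
    several same-architecture models in weight space, i.e., a model 'soup'."""
    assert abs(sum(coeffs) - 1.0) < 1e-6, "interpolation coefficients should sum to 1"
    states = [m.state_dict() for m in models]
    soup = {}
    for key, value in states[0].items():
        if value.is_floating_point():
            # Weighted average of the corresponding parameter across models.
            soup[key] = sum(c * s[key] for c, s in zip(coeffs, states))
        else:
            # Integer buffers (e.g., BatchNorm step counters) are copied as-is.
            soup[key] = value.clone()
    return soup

# Usage: merge two fine-tuned copies of the same architecture.
m1, m2 = nn.Linear(4, 2), nn.Linear(4, 2)
merged = nn.Linear(4, 2)
merged.load_state_dict(interpolate_state_dicts([m1, m2], [0.5, 0.5]))
```

In the paper's setting, such interpolation happens during each client's local training so that the merged models stay in a shared low-loss basin, cutting the number of communication rounds needed.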

Categories:
NeurIPS
Tags:
Federated Learning, Machine Learning, Pre-trained Models, Model Interpolation, Transfer Learning