Staleness aware semi-asynchronous federated learning

Yu, Miri; Choi, Jiheon; Lee, Jaehyun; Oh, Sangyoon

Publication Year: 2024-11-01

Publisher: Academic Press Inc.

Citation: Journal of Parallel and Distributed Computing, Vol.193

Keyword: Federated learning; Semi-asynchronous; Staleness

Mesh Keyword: Asynchronous protocols; Federated learning; Global models; High-accuracy; Local model; Protocol cans; Semi-asynchronous; Staleness; Synchronous protocols; Training efficiency

All Science Classification Codes (ASJC): Software; Theoretical Computer Science; Hardware and Architecture; Computer Networks and Communications; Artificial Intelligence

Abstract: As attempts to distribute deep learning using personal data have increased, so has the importance of federated learning (FL). Synchronous and asynchronous protocols have been used to address the core challenges of FL (i.e., statistical and system heterogeneity). However, stragglers reduce training efficiency in both cases: latency suffers under the synchronous protocol and accuracy under the asynchronous protocol. To solve straggler issues, a semi-asynchronous protocol that combines the two protocols can be applied to FL; however, effectively handling the staleness of local models is difficult. We propose SASAFL to solve the training inefficiency caused by staleness in semi-asynchronous FL. SASAFL enables stable training by considering the quality of the global model when synchronising the servers and clients. In addition, it achieves high accuracy and low latency by adjusting the number of participating clients in response to changes in global loss and by immediately processing clients that did not participate in the previous round. An evaluation was conducted under various conditions to verify the effectiveness of SASAFL. SASAFL achieved 19.69%p higher accuracy than the baseline, 2.32 times higher round-to-accuracy, and 2.24 times higher latency-to-accuracy. Additionally, SASAFL always achieved target accuracies that the baseline could not reach.

ISSN: 0743-7315

Language: eng

URI: https://dspace.ajou.ac.kr/dev/handle/2018.oak/34304

DOI: https://doi.org/10.1016/j.jpdc.2024.104950

Type: Article

Funding: The authors declare the following financial interests/personal relationships which may be considered as potential competing interests: Sangyoon Oh reports financial support was provided by Institute of Information and Communications Technology Planning and Evaluation (IITP). Miri Yu reports equipment, drugs, or supplies were provided by Korea Institute of Science and Technology Information (KISTI). Sangyoon Oh reports a relationship with National Science Foundation of Korea (NRF-Korea) that includes: consulting or advisory and funding grants. If there are other authors, they declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper. This work was jointly supported by the Korea Institute of Science and Technology Information (KISTI) (KSC2022-CRE-0406), NRF-Korea grant funded by the Korea government (MSIT) (RS-2023-00283799), and Artificial Intelligence Convergence Innovation Human Resources Development by the Institute of Information and Communications Technology Planning and Evaluation (IITP-2023-No.RS-2023-00255968). The authors would like to thank Editage (www.editage.co.kr) for English language editing.

Related Researcher

Oh, Sangyoon (오상윤): Department of Software and Computer Engineering
