Ajou University repository

SAGE: toward on-the-fly gradient compression ratio scaling
Citations (SCOPUS): 0

DC Field: Value
dc.contributor.author: Yoon, Daegun
dc.contributor.author: Jeong, Minjoong
dc.contributor.author: Oh, Sangyoon
dc.date.issued: 2023-07-01
dc.identifier.uri: https://dspace.ajou.ac.kr/dev/handle/2018.oak/33259
dc.description.abstract: Gradient sparsification is widely adopted in distributed training; however, it suffers from a trade-off between computation and communication. The prevalent Top-k sparsifier has a hard constraint on computational overhead while achieving the desired gradient compression ratio. Conversely, the hard-threshold sparsifier eliminates computational constraints but fails to achieve the targeted compression ratio. Motivated by this trade-off, we designed a novel threshold-based sparsifier called SAGE, which achieves a compression ratio close to that of the Top-k sparsifier with negligible computational overhead. SAGE scales the compression ratio by deriving an adjustable threshold based on each iteration's heuristics. Experimental results show that SAGE achieves a compression ratio closer to the desired ratio than the hard-threshold sparsifier without degrading the accuracy of model training. In terms of computation time for gradient selection, SAGE achieves a speedup of up to 23.62× over the Top-k sparsifier.
dc.description.sponsorship: This work was jointly supported by the BK21 FOUR program (NRF5199991014091), the Basic Science Research Program (2022R1F1A1062779) of National Research Foundation (NRF) of Korea, the Korea Institute of Science and Technology Information (KISTI) (TS-2022-RE-0019), and (KSC-2022-CRE-0406).
dc.language.iso: eng
dc.publisher: Springer
dc.subject.mesh: Communication optimization
dc.subject.mesh: Compression ratio scaling
dc.subject.mesh: Computational constraints
dc.subject.mesh: Computational overheads
dc.subject.mesh: Distributed deep learning
dc.subject.mesh: Gradient sparsification
dc.subject.mesh: Hard constraints
dc.subject.mesh: Scalings
dc.subject.mesh: Sparsification
dc.subject.mesh: Trade off
dc.title: SAGE: toward on-the-fly gradient compression ratio scaling
dc.type: Article
dc.citation.endPage: 11409
dc.citation.startPage: 11387
dc.citation.title: Journal of Supercomputing
dc.citation.volume: 79
dc.identifier.bibliographicCitation: Journal of Supercomputing, Vol.79, pp.11387-11409
dc.identifier.doi: 10.1007/s11227-023-05120-7
dc.identifier.scopusid: 2-s2.0-85148905546
dc.identifier.url: https://www.springer.com/journal/11227
dc.subject.keyword: Communication optimization
dc.subject.keyword: Compression ratio scaling
dc.subject.keyword: Distributed deep learning
dc.subject.keyword: Gradient sparsification
dc.description.isoa: false
dc.subject.subarea: Theoretical Computer Science
dc.subject.subarea: Software
dc.subject.subarea: Information Systems
dc.subject.subarea: Hardware and Architecture
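
The abstract above contrasts exact Top-k selection, which hits the desired compression ratio at a high selection cost, with hard-threshold selection, which is cheap but cannot control the achieved ratio, and describes SAGE as a threshold-based sparsifier whose threshold is adjusted from per-iteration heuristics. The record does not give the actual update rule, so the following is only a minimal sketch of that general idea, assuming PyTorch tensors; the class names, the multiplicative feedback rule, and the `gain` parameter are hypothetical illustrations, not the published SAGE algorithm.

```python
# Illustrative sketch only: contrasts exact Top-k selection with a
# threshold-based selection whose threshold is rescaled each iteration
# toward a target density. The update rule is a placeholder heuristic,
# NOT the SAGE algorithm from the paper.
import torch


def topk_sparsify(grad: torch.Tensor, density: float):
    """Exact Top-k: hits the target density but pays for a costly selection."""
    k = max(1, int(grad.numel() * density))
    _, indices = torch.topk(grad.abs().flatten(), k)
    return indices, grad.flatten()[indices]


class AdaptiveThresholdSparsifier:
    """Threshold-based selection with a per-iteration threshold rescaling (hypothetical)."""

    def __init__(self, init_threshold: float, target_density: float, gain: float = 0.5):
        self.threshold = init_threshold
        self.target_density = target_density
        self.gain = gain  # how aggressively the threshold follows the density error

    def sparsify(self, grad: torch.Tensor):
        flat = grad.flatten()
        mask = flat.abs() >= self.threshold            # O(n) scan, no sort/selection
        indices = mask.nonzero(as_tuple=False).flatten()
        achieved = indices.numel() / flat.numel()      # density actually achieved

        # Placeholder feedback rule: raise the threshold if too many elements
        # were kept, lower it if too few (multiplicative correction).
        ratio = achieved / self.target_density if self.target_density > 0 else 1.0
        self.threshold *= (1.0 + self.gain * (ratio - 1.0))
        return indices, flat[indices]


if __name__ == "__main__":
    sparsifier = AdaptiveThresholdSparsifier(init_threshold=1e-3, target_density=0.01)
    for _ in range(5):
        grad = torch.randn(1_000_000)
        idx, vals = sparsifier.sparsify(grad)
        print(f"kept {idx.numel()} / {grad.numel()} elements, "
              f"next threshold = {sparsifier.threshold:.2e}")
```

The point of the sketch is the cost contrast the abstract describes: the threshold path only scans the gradient once per iteration, while Top-k pays for a selection over all elements to hit the ratio exactly.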


Related Researcher

Oh, Sangyoon (오상윤)
Department of Software and Computer Engineering

Total Views & Downloads

File Download

  • There are no files associated with this item.