Ajou University repository

Accelerating ML/DL Applications With Hierarchical Caching on Deduplication Storage Clusters
Citations (SCOPUS): 6

DC Field | Value | Language
dc.contributor.author | Hamandawana, Prince | -
dc.contributor.author | Khan, Awais | -
dc.contributor.author | Kim, Jongik | -
dc.contributor.author | Chung, Tae Sun | -
dc.date.issued | 2022-12-01 | -
dc.identifier.issn | 2332-7790 | -
dc.identifier.uri | https://dspace.ajou.ac.kr/dev/handle/2018.oak/32215 | -
dc.description.abstract | Large-scale machine learning (ML) and deep learning (DL) platforms face challenges when integrated with deduplication-enabled storage clusters. In the quest to achieve smart and efficient storage utilization, removal of duplicate data introduces bottlenecks, since deduplication alters the I/O transaction layout of the storage system. Therefore, it is critical to address such deduplication overhead to accelerate ML/DL computation in deduplication storage. Existing state-of-the-art ML/DL storage solutions such as Alluxio and AutoCache adopt non-deduplication-aware caching mechanisms, which lack the much-needed performance boost when adopted in deduplication-enabled ML/DL clusters. In this paper, we introduce Redup, which eliminates the performance drop caused by enabling deduplication in ML/DL storage clusters. At its core is a Redup Caching Manager (RDCM), composed of a 2-tier deduplication layout-aware caching mechanism. The RDCM provides an abstraction of the underlying deduplication storage layout to ML/DL applications and provisions a decoupled acceleration of object reconstruction during ML/DL read operations. Our Redup evaluation shows a negligible performance drop in ML/DL training performance compared to a cluster without deduplication, whilst significantly outperforming Alluxio and AutoCache in terms of various performance metrics. | -
dc.language.iso | eng | -
dc.publisher | Institute of Electrical and Electronics Engineers Inc. | -
dc.subject.mesh | Caching mechanism | -
dc.subject.mesh | Deduplication | -
dc.subject.mesh | Deep learning | -
dc.subject.mesh | Large-scale machine learning | -
dc.subject.mesh | Learning platform | -
dc.subject.mesh | Machine-learning | -
dc.subject.mesh | Performance | -
dc.subject.mesh | State of the art | -
dc.subject.mesh | Storage systems | -
dc.subject.mesh | Storage utilization | -
dc.title | Accelerating ML/DL Applications With Hierarchical Caching on Deduplication Storage Clusters | -
dc.type | Article | -
dc.citation.endPage | 1636 | -
dc.citation.startPage | 1622 | -
dc.citation.title | IEEE Transactions on Big Data | -
dc.citation.volume | 8 | -
dc.identifier.bibliographicCitation | IEEE Transactions on Big Data, Vol.8, pp.1622-1636 | -
dc.identifier.doi | 10.1109/tbdata.2021.3106345 | -
dc.identifier.scopusid | 2-s2.0-85113281987 | -
dc.identifier.url | https://www.ieee.org/membership-catalog/productdetail/showProductDetailPage.html?product=PER472-ELE | -
dc.subject.keyword | big data | -
dc.subject.keyword | deduplication | -
dc.subject.keyword | deep learning | -
dc.subject.keyword | Machine learning | -
dc.subject.keyword | storage management | -
dc.description.isoa | true | -
dc.subject.subarea | Information Systems | -
dc.subject.subarea | Information Systems and Management | -
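
The abstract describes a 2-tier, deduplication layout-aware cache that speeds up object reconstruction on reads. The record gives no implementation detail, so the following is only a minimal illustrative sketch under stated assumptions, not the Redup design: it assumes fixed-size chunking, SHA-256 fingerprints, one tier caching object recipes (the layout) and a second tier caching chunk contents. All class and parameter names (`LRU`, `DedupStore`, `TwoTierCache`, `recipe_slots`, `chunk_slots`) are hypothetical.

```python
import hashlib
from collections import OrderedDict


class LRU:
    """Tiny LRU cache built on OrderedDict; counts hits and misses."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, key, load):
        if key in self.items:
            self.items.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self.items[key]
        self.misses += 1
        value = load(key)  # fall through to the backing store
        self.items[key] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict least recently used
        return value


class DedupStore:
    """Toy deduplicated backend (assumption): unique chunks keyed by fingerprint."""

    def __init__(self, chunk_size=4):
        self.chunk_size = chunk_size
        self.chunks = {}   # fingerprint -> chunk bytes
        self.recipes = {}  # object name -> ordered list of fingerprints

    def put(self, name, data):
        recipe = []
        for i in range(0, len(data), self.chunk_size):
            chunk = data[i:i + self.chunk_size]
            fp = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(fp, chunk)  # store each unique chunk once
            recipe.append(fp)
        self.recipes[name] = recipe


class TwoTierCache:
    """Hypothetical two-tier cache: tier 1 holds recipes, tier 2 holds chunks."""

    def __init__(self, store, recipe_slots=8, chunk_slots=64):
        self.store = store
        self.recipe_tier = LRU(recipe_slots)
        self.chunk_tier = LRU(chunk_slots)

    def read(self, name):
        # Tier 1: resolve the object's chunk layout.
        recipe = self.recipe_tier.get(name, self.store.recipes.__getitem__)
        # Tier 2: fetch each chunk; chunks shared across objects hit the cache.
        parts = [self.chunk_tier.get(fp, self.store.chunks.__getitem__)
                 for fp in recipe]
        return b"".join(parts)


store = DedupStore(chunk_size=4)
store.put("a", b"AAAABBBBCCCC")
store.put("b", b"AAAABBBBDDDD")  # shares two chunks with "a"
cache = TwoTierCache(store)
data_a = cache.read("a")
data_b = cache.read("b")
```

In this toy setup, reading `b` after `a` reuses the two chunks the objects share, so only the unique chunk is fetched from the backend. That is the intuition behind making the cache aware of the deduplicated layout rather than caching whole objects.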

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.

Related Researcher

HAMANDAWANA, PRINCE
Department of Software and Computer Engineering

File Download

  • There are no files associated with this item.