Ajou University repository

Crocus: Enabling Computing Resource Orchestration for Inline Cluster-Wide Deduplication on Scalable Storage Systems

DC Field | Value | Language
dc.contributor.author | Hamandawana, Prince | -
dc.contributor.author | Khan, Awais | -
dc.contributor.author | Lee, Chang Gyu | -
dc.contributor.author | Park, Sungyong | -
dc.contributor.author | Kim, Youngjae | -
dc.date.issued | 2020-08-01 | -
dc.identifier.uri | https://dspace.ajou.ac.kr/dev/handle/2018.oak/31218 | -
dc.description.abstract | Inline deduplication dramatically improves storage space utilization. However, it degrades I/O throughput because of compute-intensive deduplication operations in the I/O path, such as chunking, fingerprinting (hashing) of chunk content, and redundant lookup I/Os over the network. In particular, fingerprint (hash) generation is computationally expensive and contributes largely to the degraded I/O throughput. In this article, we propose Crocus, a framework that enables compute resource orchestration to enhance cluster-wide deduplication performance. Crocus takes into account all compute resources, such as local and remote {CPU, GPU}, by managing decentralized compute pools. An opportunistic Load-Aware Fingerprint Scheduler (LAFS) distributes and offloads compute-intensive deduplication operations to the compute pools in a load-aware fashion. Crocus is highly generic and can be adopted in both inline and offline deduplication with different storage tier configurations. We implemented Crocus in the Ceph scale-out storage system. Our extensive evaluation shows that Crocus reduces the fingerprinting overhead by 86 percent with a 4 KB chunk size compared to Ceph with baseline deduplication, while maintaining high disk-space savings. Our proposed LAFS scheduler, when tested in different internal and external contention scenarios, also showed a 54 percent improvement over a fixed or static scheduling approach. | -
dc.description.sponsorship | This work was supported by the National Research Foundation of Korea (NRF) Grant funded by the Korea Government (Ministry of Science and ICT) under Grant NRF-2018R1A1A1A05079398. | -
dc.language.iso | eng | -
dc.publisher | IEEE Computer Society | -
dc.subject.mesh | Compute resources | -
dc.subject.mesh | Computing resource | -
dc.subject.mesh | De duplications | -
dc.subject.mesh | Distributed file systems | -
dc.subject.mesh | Scalable storage systems | -
dc.subject.mesh | Static scheduling | -
dc.subject.mesh | Storage spaces | -
dc.subject.mesh | Storage systems | -
dc.title | Crocus: Enabling Computing Resource Orchestration for Inline Cluster-Wide Deduplication on Scalable Storage Systems | -
dc.type | Article | -
dc.citation.endPage | 1753 | -
dc.citation.startPage | 1740 | -
dc.citation.title | IEEE Transactions on Parallel and Distributed Systems | -
dc.citation.volume | 31 | -
dc.identifier.bibliographicCitation | IEEE Transactions on Parallel and Distributed Systems, Vol.31, pp.1740-1753 | -
dc.identifier.doi | 10.1109/tpds.2020.2972882 | -
dc.identifier.scopusid | 2-s2.0-85082067433 | -
dc.identifier.url | http://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=71 | -
dc.subject.keyword | Distributed file systems | -
dc.subject.keyword | scheduling | -
dc.subject.keyword | storage management | -
dc.description.isoa | false | -
dc.subject.subarea | Signal Processing | -
dc.subject.subarea | Hardware and Architecture | -
dc.subject.subarea | Computational Theory and Mathematics | -
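
Illustrative note: the abstract describes splitting data into chunks, fingerprinting each chunk, and handing the fingerprinting work to compute pools according to their load. The Python sketch below shows that general idea only. It assumes fixed-size 4 KB chunks, SHA-256 fingerprints, and a simple "least assigned work" choice between pools; the names `ComputePool`, `pick_pool`, and `dedup_write` are hypothetical and are not taken from Crocus, LAFS, or Ceph.

```python
import hashlib
from dataclasses import dataclass

CHUNK_SIZE = 4096  # 4 KB, the chunk size quoted in the abstract


@dataclass
class ComputePool:
    """Hypothetical stand-in for a local or remote CPU/GPU pool."""
    name: str
    assigned_bytes: int = 0  # crude load metric: bytes handed to this pool

    def fingerprint(self, chunk: bytes) -> str:
        # A real pool would hash asynchronously on its own hardware; this
        # sketch hashes inline and only records how much work was assigned.
        self.assigned_bytes += len(chunk)
        return hashlib.sha256(chunk).hexdigest()


def pick_pool(pools):
    """Load-aware choice (assumed policy): the pool with the least work."""
    return min(pools, key=lambda p: p.assigned_bytes)


def dedup_write(data: bytes, pools, index: dict) -> list:
    """Chunk the data, fingerprint each chunk, and store only unseen chunks.

    Returns the list of fingerprints needed to rebuild the data.
    """
    recipe = []
    for offset in range(0, len(data), CHUNK_SIZE):
        chunk = data[offset:offset + CHUNK_SIZE]
        fp = pick_pool(pools).fingerprint(chunk)
        index.setdefault(fp, chunk)  # a duplicate chunk is stored only once
        recipe.append(fp)
    return recipe
```

Calling `dedup_write(data, [ComputePool("local-cpu"), ComputePool("remote-gpu")], {})` on data with repeated 4 KB blocks would store each distinct block once while spreading the hashing across both pools. The scheduler described in the abstract reacts to internal and external contention, which this static byte counter does not model.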

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.

Related Researcher

Hamandawana, Prince
Department of Software and Computer Engineering
