DC Field | Value | Language |
---|---|---|
dc.contributor.author | Hamandawana, Prince | - |
dc.contributor.author | Khan, Awais | - |
dc.contributor.author | Lee, Chang Gyu | - |
dc.contributor.author | Park, Sungyong | - |
dc.contributor.author | Kim, Youngjae | - |
dc.date.issued | 2020-08-01 | - |
dc.identifier.uri | https://dspace.ajou.ac.kr/dev/handle/2018.oak/31218 | - |
dc.description.abstract | Inline deduplication dramatically improves storage space utilization. However, it degrades I/O throughput due to compute-intensive deduplication operations such as chunking, fingerprinting or hashing of chunk content, and redundant lookup I/Os over the network in the I/O path. In particular, the fingerprint or hash generation of content is computationally expensive and contributes largely to the degraded I/O throughput. In this article, we propose Crocus, a framework that enables compute resource orchestration to enhance cluster-wide deduplication performance. In particular, Crocus takes into account all compute resources, such as local and remote {CPU, GPU}, by managing decentralized compute pools. An opportunistic Load-Aware Fingerprint Scheduler (LAFS) distributes and offloads compute-intensive deduplication operations to compute pools in a load-aware fashion. Crocus is highly generic and can be adopted in both inline and offline deduplication with different storage tier configurations. We implemented Crocus in the Ceph scale-out storage system. Our extensive evaluation shows that Crocus reduces the fingerprinting overhead by 86 percent with a 4KB chunk size compared to Ceph with baseline deduplication, while maintaining high disk-space savings. Our proposed LAFS scheduler, when tested in different internal and external contention scenarios, also showed a 54 percent improvement over a fixed or static scheduling approach. | - |
dc.description.sponsorship | This work was supported by the National Research Foundation of Korea (NRF) Grant funded by the Korea Government (Ministry of Science and ICT) under Grant NRF-2018R1A1A1A05079398. | - |
dc.language.iso | eng | - |
dc.publisher | IEEE Computer Society | - |
dc.subject.mesh | Compute resources | - |
dc.subject.mesh | Computing resource | - |
dc.subject.mesh | De duplications | - |
dc.subject.mesh | Distributed file systems | - |
dc.subject.mesh | Scalable storage systems | - |
dc.subject.mesh | Static scheduling | - |
dc.subject.mesh | Storage spaces | - |
dc.subject.mesh | Storage systems | - |
dc.title | Crocus: Enabling Computing Resource Orchestration for Inline Cluster-Wide Deduplication on Scalable Storage Systems | - |
dc.type | Article | - |
dc.citation.endPage | 1753 | - |
dc.citation.startPage | 1740 | - |
dc.citation.title | IEEE Transactions on Parallel and Distributed Systems | - |
dc.citation.volume | 31 | - |
dc.identifier.bibliographicCitation | IEEE Transactions on Parallel and Distributed Systems, Vol.31, pp.1740-1753 | - |
dc.identifier.doi | 10.1109/tpds.2020.2972882 | - |
dc.identifier.scopusid | 2-s2.0-85082067433 | - |
dc.identifier.url | http://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=71 | - |
dc.subject.keyword | Distributed file systems | - |
dc.subject.keyword | scheduling | - |
dc.subject.keyword | storage management | - |
dc.description.isoa | false | - |
dc.subject.subarea | Signal Processing | - |
dc.subject.subarea | Hardware and Architecture | - |
dc.subject.subarea | Computational Theory and Mathematics | - |