Ajou University repository

Crocus: Enabling Computing Resource Orchestration for Inline Cluster-Wide Deduplication on Scalable Storage Systems
Citations

SCOPUS

13

Publication Year
2020-08-01
Publisher
IEEE Computer Society
Citation
IEEE Transactions on Parallel and Distributed Systems, Vol.31, pp.1740-1753
Keyword
Distributed file systems; scheduling; storage management
Mesh Keyword
Compute resources; Computing resource; De duplications; Distributed file systems; Scalable storage systems; Static scheduling; Storage spaces; Storage systems
All Science Classification Codes (ASJC)
Signal Processing; Hardware and Architecture; Computational Theory and Mathematics
Abstract
Inline deduplication dramatically improves storage space utilization. However, it degrades I/O throughput because of compute-intensive deduplication operations, such as chunking, fingerprinting or hashing of chunk content, and redundant lookup I/Os over the network in the I/O path. In particular, fingerprint or hash generation is computationally expensive and contributes heavily to the degraded I/O throughput. In this article, we propose Crocus, a framework that enables compute resource orchestration to enhance cluster-wide deduplication performance. Crocus takes into account all compute resources, such as local and remote {CPU, GPU}, by managing decentralized compute pools. An opportunistic Load-Aware Fingerprint Scheduler (LAFS) distributes and offloads compute-intensive deduplication operations to the compute pools in a load-aware fashion. Crocus is highly generic and can be adopted in both inline and offline deduplication with different storage tier configurations. We implemented Crocus in the Ceph scale-out storage system. Our extensive evaluation shows that Crocus reduces the fingerprinting overhead by 86 percent with a 4 KB chunk size compared to Ceph with baseline deduplication, while maintaining high disk-space savings. Our proposed LAFS scheduler, when tested in different internal and external contention scenarios, also showed a 54 percent improvement over a fixed or static scheduling approach.
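The pipeline named in the abstract (chunking, fingerprinting, duplicate lookup, and load-aware dispatch to compute pools) can be illustrated with a minimal sketch. This is not the authors' implementation or any Ceph API: the `ComputePool` class, the queued-bytes load metric, the choice of SHA-256 as the fingerprint, and the in-memory `index` are all assumptions made for illustration. Only the 4 KB chunk size comes from the abstract.

```python
import hashlib
from dataclasses import dataclass

CHUNK_SIZE = 4096  # 4 KB, the chunk size quoted in the abstract


@dataclass
class ComputePool:
    """Hypothetical stand-in for a local or remote CPU/GPU pool."""
    name: str
    queued_bytes: int = 0  # assumed load metric: outstanding fingerprint work


def chunk(data: bytes, size: int = CHUNK_SIZE):
    """Split data into fixed-size chunks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def fingerprint(chunk_bytes: bytes) -> str:
    """Content fingerprint; SHA-256 is an assumption, not named in the abstract."""
    return hashlib.sha256(chunk_bytes).hexdigest()


def pick_pool(pools):
    """Load-aware choice: the pool with the least queued work."""
    return min(pools, key=lambda p: p.queued_bytes)


def dedup_write(data: bytes, pools, index: dict):
    """Deduplicate one write. Returns (recipe, newly_stored_bytes)."""
    recipe, stored = [], 0
    for c in chunk(data):
        pool = pick_pool(pools)
        pool.queued_bytes += len(c)  # a real scheduler would decrement on completion
        fp = fingerprint(c)          # computed locally here; stands in for work on `pool`
        recipe.append(fp)
        if fp not in index:          # duplicate lookup
            index[fp] = c
            stored += len(c)
    return recipe, stored


pools = [ComputePool("local-cpu"), ComputePool("remote-gpu")]
index = {}
data = b"A" * (3 * CHUNK_SIZE) + b"B" * CHUNK_SIZE
recipe, stored = dedup_write(data, pools, index)
```

In this sketch the three identical `A` chunks share one fingerprint, so only two of the four chunks are stored, and the fingerprinting work is spread across both pools. A static scheduler, by contrast, would send every chunk to one fixed pool regardless of its load.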
Language
eng
URI
https://dspace.ajou.ac.kr/dev/handle/2018.oak/31218
DOI
https://doi.org/10.1109/tpds.2020.2972882
Type
Article
Funding
This work was supported by the National Research Foundation of Korea (NRF) Grant funded by the Korea Government (Ministry of Science and ICT) under Grant NRF-2018R1A1A1A05079398.
Related Researcher

HAMANDAWANA, PRINCE
Department of Software and Computer Engineering