Conventional deduplication systems face critical challenges such as excessive write amplification, high read/write latency, and sub-optimal storage utilization. These limitations often undermine the performance benefits of deduplication by slowing down I/O acknowledgements due to amplified deduplication I/Os, excessive data chunk replication, and strict consistency requirements. To address these issues, we present Speed-Dedup, a novel deduplication framework that employs a deduplicated primary–semi-deduplicated replica object approach. This strategy reduces write amplification by restricting deduplication to the primary object while maintaining a semi-deduplicated replica object used for immediate read/write acknowledgements, thus enhancing I/O latency and storage efficiency. Speed-Dedup also replaces traditional strong consistency models with eventual consistency, allowing for non-blocking read operations and improving overall system throughput. Experimental results demonstrate that Speed-Dedup significantly outperforms traditional methods like GRATE and CAO, showing up to 21% improvement in I/O performance under low deduplication ratios and maintaining 14% or more gains under higher ratios. Additionally, write amplification is substantially reduced and latency improves by over 100% with faster recovery times during system failures. These findings highlight the effectiveness of Speed-Dedup as a scalable and efficient solution.
This work was supported by Institute of Information and Communications Technology Planning and Evaluation (IITP) under the Artificial Intelligence Convergence Innovation Human Resources Development (IITP-2024-RS-2023-00255968) grant and the ITRC (Information Technology Research Center) support program (IITP-2021-0-02051) funded by the Korea government (MSIT). Additionally, this work was supported by the BK21 FOUR program of the National Research Foundation of Korea funded by the Ministry of Education (NRF5199991014091).