A single Active Namenode (ANN) in the Hadoop Distributed File System (HDFS) becomes a bottleneck for workloads that require high-throughput read operations, such as large-scale data analysis. Various namenode schemes, including asynchronous checkpointing schemes, have recently been proposed to address this ANN bottleneck. Although asynchronous schemes offer high-throughput read operations, they suffer from the stale read problem, in which a read is not guaranteed to return the latest data. In this paper, we propose a novel metadata replication scheme that writes OpCodes synchronously to achieve namenode multiplexing while avoiding the stale read problem. To reduce synchronization overhead, the proposed scheme replicates only metadata updates, such as write requests, using quasi-byte-level metadata operation codes. We conducted empirical experiments to verify the effectiveness of the proposed scheme. The results show that our method reduces the average required number of namenodes (NNs) by 50.95% when the number of NNs for read-only operations is 100.
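The idea of synchronous OpCode replication can be illustrated with a minimal sketch. It is not the implementation evaluated in this paper: the names `OpCode`, `Namenode`, and `Cluster`, the three operation types, and the round-robin read policy are all assumptions made for illustration. It shows only why a write that is acknowledged after every replica has applied the operation code cannot be followed by a stale read.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class OpCode:
    """Hypothetical compact record of one metadata update."""
    op: str    # assumed operation types: "CREATE", "MKDIR", "DELETE"
    path: str


@dataclass
class Namenode:
    """Toy namenode whose namespace is just a set of paths."""
    name: str
    namespace: set = field(default_factory=set)

    def apply(self, opcode: OpCode) -> None:
        # Replay a single metadata operation against the local namespace.
        if opcode.op in ("CREATE", "MKDIR"):
            self.namespace.add(opcode.path)
        elif opcode.op == "DELETE":
            self.namespace.discard(opcode.path)
        else:
            raise ValueError(f"unknown op: {opcode.op}")

    def exists(self, path: str) -> bool:
        return path in self.namespace


class Cluster:
    """One active namenode plus read-only replicas (illustrative only)."""

    def __init__(self, n_replicas: int) -> None:
        self.active = Namenode("ann")
        self.replicas = [Namenode(f"nn{i}") for i in range(n_replicas)]
        self._next = 0

    def write(self, opcode: OpCode) -> bool:
        # Synchronous replication: the write is acknowledged only after
        # the active namenode and every replica have applied the OpCode.
        self.active.apply(opcode)
        for replica in self.replicas:
            replica.apply(opcode)
        return True

    def read(self, path: str) -> bool:
        # Reads are spread over the replicas (round-robin is an assumption).
        node = self.replicas[self._next % len(self.replicas)]
        self._next += 1
        return node.exists(path)


cluster = Cluster(n_replicas=3)
cluster.write(OpCode("CREATE", "/data/a"))
reads_after_create = [cluster.read("/data/a") for _ in range(3)]
cluster.write(OpCode("DELETE", "/data/a"))
reads_after_delete = [cluster.read("/data/a") for _ in range(3)]
```

In this sketch only updates are replicated; read requests never generate replication traffic, which mirrors the motivation for restricting replication to metadata updates.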
This work was supported by the Future Combat System Network Technology Research Center program of the Defense Acquisition Program Administration and the Agency for Defense Development (UD190033ED).