The increasing volume of live content poses new challenges for publish/subscribe services at cloud scale. Providing an efficient publish/subscribe service for live content is difficult because most subscriptions occupy only a small portion of the entire subscription space, i.e., they target a limited range of the live content. As a result, the real-world workload of such a service is skewed and the distribution of subscriptions is seriously imbalanced, which makes event processing inefficient. We present a correlation-based balanced content space partitioning technique for content-based publish/subscribe services. The technique reduces the imbalance caused by a skewed subscription workload by using the correlation coefficient between attributes to build dimension groups: attributes with low correlation are assigned to the same dimension group so that the subscription workload is balanced. We also analyze how varying the partitioning granularity affects load balance for efficient message processing. We conducted empirical experiments to evaluate the effectiveness of the partitioning technique and to measure the impact of varying the partitioning granularity. The results show that the proposed technique distributes subscriptions among brokers more evenly than conventional partitioning techniques. They also show that load balance can be improved by increasing the partitioning granularity through two parameters, the segment degree and the dimension group degree.
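The grouping idea described above (attributes with low mutual correlation share a dimension group) can be illustrated with a minimal sketch. This is not the paper's algorithm: it is a simple greedy heuristic written under stated assumptions. The function names `pearson` and `build_dimension_groups`, the `group_count` parameter, the equal-capacity rule, and the sample attribute values are all hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # Assumption: a constant attribute is treated as uncorrelated.
        return 0.0
    return cov / sqrt(var_x * var_y)


def build_dimension_groups(samples, group_count):
    """Greedily place each attribute in the group it is least correlated with.

    samples: dict mapping attribute name -> list of values observed in
             subscriptions (hypothetical input format).
    group_count: number of dimension groups to build (hypothetical parameter).
    """
    names = sorted(samples)
    # Absolute pairwise correlation; the sign is ignored because strong
    # negative correlation couples two attributes as much as positive does.
    corr = {
        frozenset(pair): abs(pearson(samples[pair[0]], samples[pair[1]]))
        for pair in combinations(names, 2)
    }
    groups = [[] for _ in range(group_count)]
    capacity = -(-len(names) // group_count)  # ceiling division
    for name in names:
        # Only groups with free slots are candidates (assumed equal sizes).
        candidates = [g for g in groups if len(g) < capacity]
        # Pick the group whose members are least correlated with this attribute.
        best = min(
            candidates,
            key=lambda g: sum(corr[frozenset((name, m))] for m in g),
        )
        best.append(name)
    return groups


if __name__ == "__main__":
    # Hypothetical attribute samples: "a"/"b" and "c"/"d" are strongly
    # correlated pairs, while the cross pairs are only weakly correlated.
    samples = {
        "a": [1, 2, 3, 4, 5, 6, 7, 8],
        "b": [2, 4, 6, 8, 10, 12, 14, 16],
        "c": [1, -1, 1, -1, 1, -1, 1, -1],
        "d": [-1, 1, -1, 1, -1, 1, -1, 1],
    }
    print(build_dimension_groups(samples, group_count=2))
```

In this sketch the strongly correlated pairs end up in different groups, so each group combines attributes whose values vary largely independently. A real partitioner would also have to choose the number of groups and segments, which corresponds to the granularity parameters discussed in the abstract.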
This research was jointly supported by the MSIT (Ministry of Science and ICT), Korea, under the ITRC (Information Technology Research Center) support program (IITP-2020-2018-0-01431) supervised by the IITP (Institute for Information & communications Technology Promotion); by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (NRF-2018R1D1A1B07043858); and by the National Supercomputing Center with supercomputing resources including technical support (KSC-2019-CRE-0105).