Reinforcement learning (RL) is used in a wide range of real-world applications, most of which rely on a single agent. However, many practical tasks require multiple agents that cooperate in the control process. Multi-agent RL is considerably harder to design, and many design alternatives must be weighed before it is practically useful. We propose two RL implementations for a Message Queuing Telemetry Transport (MQTT) protocol system, both aimed at improving the communication efficiency of MQTT: (i) a single-broker-agent implementation and (ii) a multiple-publisher-agents implementation. In each implementation, we focus on handling messages with different priorities in a dynamic environment. The single-broker-agent implementation improves communication efficiency by adjusting the loop cycle time of the broker, and the multiple-publisher-agents implementation does so by learning message importance. The proposed MQTT control scheme improves the battery efficiency of Internet-of-Things (IoT) devices with limited battery capacity.
This work was supported by the Institute for Information & Communications Technology Promotion (IITP) grant funded by the Korea government (MSIP) (No. 2016-0-00160, Versatile Network System Architecture for Multi-Dimensional Diversity). This work was also supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. NRF-2017R1A2B1009709).