In this paper, we propose a Gaussian Random Trajectory guided Hierarchical Reinforcement Learning (GRT-HL) method for autonomous furniture assembly. The furniture assembly problem is formulated as a comprehensive, human-like, long-horizon manipulation task that requires long-term planning and sophisticated control. Our proposed model, GRT-HL, draws inspiration from semi-supervised adversarial autoencoders and learns latent representations of end-effector position trajectories. The high-level policy generates an optimal trajectory for furniture assembly, taking into account the structural limitations of the robotic agent. Given the trajectory drawn from the high-level policy, the low-level policy makes a plan and controls the end-effector. We first evaluate the performance of GRT-HL against state-of-the-art reinforcement learning methods on furniture assembly tasks. We demonstrate that GRT-HL successfully solves this long-horizon problem with extremely sparse rewards by generating trajectories for planning.
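To make the hierarchical structure described above concrete, the following minimal sketch (not the authors' implementation) shows a high-level policy that decodes a latent sampled from a Gaussian prior into an end-effector position trajectory, and a low-level policy that tracks the resulting waypoints. All class names, parameters, and the linear decoder are illustrative assumptions standing in for the learned components.

```python
import numpy as np

class HighLevelPolicy:
    """Decodes a latent z ~ N(0, I) into a sequence of 3-D waypoints.

    The linear decoder below is a stand-in; in the paper the trajectory
    representation is learned in the style of an adversarial autoencoder.
    """
    def __init__(self, latent_dim=8, horizon=20, seed=0):
        self.rng = np.random.default_rng(seed)
        self.latent_dim = latent_dim
        self.horizon = horizon
        self.W = self.rng.normal(scale=0.1, size=(horizon * 3, latent_dim))

    def sample_trajectory(self, goal_pos):
        z = self.rng.normal(size=self.latent_dim)          # Gaussian latent
        offsets = (self.W @ z).reshape(self.horizon, 3)    # decoded shape
        ramp = np.linspace(0.0, 1.0, self.horizon)[:, None]
        # Bias waypoints toward the goal so the trajectory ends near it.
        return ramp * goal_pos + offsets

class LowLevelPolicy:
    """Simple proportional controller that tracks the given waypoints."""
    def __init__(self, gain=0.5):
        self.gain = gain

    def act(self, ee_pos, waypoint):
        return self.gain * (waypoint - ee_pos)             # velocity command

# Toy rollout: move the end-effector along the generated trajectory.
high, low = HighLevelPolicy(), LowLevelPolicy()
trajectory = high.sample_trajectory(goal_pos=np.array([0.4, 0.2, 0.3]))
ee_pos = np.zeros(3)
for waypoint in trajectory:
    ee_pos = ee_pos + low.act(ee_pos, waypoint)            # integrate velocity
print("final end-effector position:", ee_pos)
```

In this sketch the high-level policy only proposes where the end-effector should go, while the low-level policy handles how to get there, mirroring the division of labor stated in the abstract.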
This work was supported by Samsung Electronics (IO201208-07855-01) and by MSIT, Korea, under the ITRC program (IITP-2022-2017-0-01637) supervised by IITP. The authors thank Mr. MyungJae Shin for his contribution to the initiation of this research during his master's study under the guidance of Prof. Joongheon Kim. Soyi Jung, Jong-Kook Kim, and Joongheon Kim are the corresponding authors.