Citation Export
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Hong, Myung Rae | - |
dc.contributor.author | Kang, Sanghun | - |
dc.contributor.author | Lee, Jingoo | - |
dc.contributor.author | Seo, Sungchul | - |
dc.contributor.author | Han, Seungyong | - |
dc.contributor.author | Koh, Je Sung | - |
dc.contributor.author | Kang, Daeshik | - |
dc.date.issued | 2023-01-01 | - |
dc.identifier.issn | 2169-3536 | - |
dc.identifier.uri | https://dspace.ajou.ac.kr/dev/handle/2018.oak/33636 | - |
dc.description.abstract | Reinforcement learning does not require an explicit robot model, as it learns on its own from data, but it faces temporal and spatial constraints when transferred to real-world environments. In this research, we trained the balancing Furuta pendulum task, which is difficult to model, in a virtual environment (Unity) and transferred the learned policy to the real world. The challenge of the balancing Furuta pendulum task is to keep the pendulum's end effector in a vertical position. We resolved the temporal and spatial constraints by performing reinforcement learning in the virtual environment. Furthermore, we designed a novel reward function that enables faster and more stable learning than two existing reward functions. We validated each reward function by applying it to the soft actor-critic (SAC) and proximal policy optimization (PPO) algorithms. The experimental results show that the cosine reward function trains faster and more stably. Finally, the SAC model trained with the cosine reward function in the virtual environment serves as an optimized controller. Additionally, we evaluated the robustness of this model by transferring it to the real environment. | - |
dc.description.sponsorship | This work was supported in part by the Ajou University research fund, in part by the National Research Foundation of Korea (NRF) grant funded by the Korea Ministry of Science and ICT (MSIT) (2022R1A2C2093100), and in part by the Korea Environment Industry & Technology Institute (KEITI) through the Digital Infrastructure Building Project for Monitoring, Surveying and Evaluating the Environmental Health Program, funded by the Korea Ministry of Environment (MOE) (2021003330009). | - |
dc.language.iso | eng | - |
dc.publisher | Institute of Electrical and Electronics Engineers Inc. | - |
dc.subject.mesh | Furuta pendulum | - |
dc.subject.mesh | Inverted pendulum | - |
dc.subject.mesh | Inverted pendulum problem | - |
dc.subject.mesh | Pendulum problem | - |
dc.subject.mesh | Reinforcement learnings | - |
dc.subject.mesh | Reward design | - |
dc.subject.mesh | Shape | - |
dc.subject.mesh | Sim2real | - |
dc.subject.mesh | Task analysis | - |
dc.title | Optimizing Reinforcement Learning Control Model in Furuta Pendulum and Transferring it to Real-World | - |
dc.type | Article | - |
dc.citation.endPage | 95200 | - |
dc.citation.startPage | 95195 | - |
dc.citation.title | IEEE Access | - |
dc.citation.volume | 11 | - |
dc.identifier.bibliographicCitation | IEEE Access, Vol.11, pp.95195-95200 | - |
dc.identifier.doi | 10.1109/access.2023.3310405 | - |
dc.identifier.scopusid | 2-s2.0-85169700982 | - |
dc.identifier.url | http://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639 | - |
dc.subject.keyword | Furuta pendulum | - |
dc.subject.keyword | inverted pendulum problem | - |
dc.subject.keyword | reinforcement learning | - |
dc.subject.keyword | reward design | - |
dc.subject.keyword | Sim2Real | - |
dc.description.isoa | true | - |
dc.subject.subarea | Computer Science (all) | - |
dc.subject.subarea | Materials Science (all) | - |
dc.subject.subarea | Engineering (all) | - |
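The abstract compares a cosine-shaped reward against two existing reward functions, but this record does not reproduce the formulas. The following is a minimal, hypothetical sketch of how such a cosine reward might be shaped for the balancing task, assuming a reward of cos(θ) with θ the pendulum angle measured from the upright position; the function names, the 12° band of the binary baseline, and both reward forms are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def cosine_reward(theta: float) -> float:
    """Hypothetical cosine-shaped reward: 1.0 when the pendulum is upright
    (theta = 0) and decreasing smoothly as it tips away from vertical.
    theta is the pendulum angle from the upright position, in radians."""
    return float(np.cos(theta))

def binary_reward(theta: float, band: float = np.deg2rad(12.0)) -> float:
    """Illustrative sparse baseline: +1 only while the pendulum stays
    within a narrow band around upright, 0 otherwise. The 12-degree band
    is an assumed value, not a parameter from the paper."""
    return 1.0 if abs(theta) < band else 0.0

if __name__ == "__main__":
    # Compare the two shapes across a few pendulum angles.
    for deg in (0, 15, 45, 90, 180):
        theta = np.deg2rad(deg)
        print(f"theta={deg:4d} deg  cosine={cosine_reward(theta):+.3f}  "
              f"binary={binary_reward(theta):.1f}")
```

The appeal of a cosine shape is that it is dense and smooth: it grades every state instead of rewarding only a narrow upright band, which is consistent with the faster and more stable training reported in the abstract.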