This study demonstrates that the Harmony Search Algorithm (HSA) is effective for hyperparameter optimization in Deep Reinforcement Learning (DeepRL) in environments with well-designed reward functions. To address the reproducibility issue in DeepRL, the algorithm was modified so that the best parameters in each generation are adopted independently of the harmony memory consideration rate (HMCR) and are not altered by the pitch adjustment rate (PAR). The objective function was set to either the cumulative reward or the terminal reward, depending on the environment. The hyperparameters of the PPO algorithm and of the actor-critic network were optimized in five different environments. The results show that the Harmony Search Algorithm can optimize hyperparameters even in large, complex environments with substantial interactions, provided the reward function is well-designed.
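To make the described modification concrete, the following is a minimal sketch of one possible reading of the modified Harmony Search loop: values taken from the current best harmony are adopted regardless of HMCR and are never pitch-adjusted. The function and parameter names (`bounds`, `evaluate`, `hm_size`, `hmcr`, `par`, `bandwidth`, `n_iters`), as well as this exact interpretation of the modification, are assumptions for illustration and not the authors' implementation; `evaluate` stands for one DeepRL training run that returns the cumulative or terminal reward.

```python
import random

def modified_harmony_search(bounds, evaluate, hm_size=10, hmcr=0.9, par=0.3,
                            bandwidth=0.05, n_iters=100, seed=0):
    """Sketch of a Harmony Search loop with the two modifications described
    above (hypothetical reading): the best harmony's parameters are adopted
    independently of HMCR, and values copied from the best harmony skip
    pitch adjustment. `evaluate` maps a parameter vector to the objective
    (cumulative or terminal reward), which is maximized."""
    rng = random.Random(seed)

    # Initialise harmony memory with random parameter vectors within bounds.
    memory = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(hm_size)]
    scores = [evaluate(h) for h in memory]

    for _ in range(n_iters):
        best = memory[max(range(hm_size), key=lambda i: scores[i])]
        new = []
        for d, (lo, hi) in enumerate(bounds):
            if rng.random() < hmcr:
                # Standard memory consideration: draw the value from a harmony.
                src = rng.choice(memory)
                value = src[d]
                # Modification: pitch-adjust only values not taken from the best harmony.
                if src is not best and rng.random() < par:
                    value = value + rng.uniform(-1.0, 1.0) * bandwidth * (hi - lo)
                    value = min(hi, max(lo, value))
            else:
                # Modification (assumed): adopt the best harmony's value
                # instead of a purely random draw, independent of HMCR.
                value = best[d]
            new.append(value)

        # Replace the worst harmony if the new candidate scores higher.
        new_score = evaluate(new)
        worst = min(range(hm_size), key=lambda i: scores[i])
        if new_score > scores[worst]:
            memory[worst], scores[worst] = new, new_score

    best_idx = max(range(hm_size), key=lambda i: scores[i])
    return memory[best_idx], scores[best_idx]
```

In this sketch, each call to `evaluate` would correspond to training a PPO agent with the candidate hyperparameters and returning its reward, so the memory size and iteration count directly control the optimization budget.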