Dynamic Reinforcement Learning for Optimal Go AI Training: Adaptive Adjustment and Optimization

Zhang, Chunjiong; Shan, Gaoyang; Lim, Junghyun; Roh, Byeong Hee

DC Field	Value	Language
dc.contributor.author	Zhang, Chunjiong	-
dc.contributor.author	Shan, Gaoyang	-
dc.contributor.author	Lim, Junghyun	-
dc.contributor.author	Roh, Byeong Hee	-
dc.date.issued	2024-01-01	-
dc.identifier.uri	https://dspace.ajou.ac.kr/dev/handle/2018.oak/34577	-
dc.description.abstract	Go is a popular strategy game today, but due to its large search space and task complexity, ensuring stable AI implementation is challenging. Specifically, Go AI training requires setting a fixed optimal learning rate and schedule, which demands significant TPU and GPU resources. To facilitate Go-AI learning, this research explores adaptive adjustment and optimization techniques for dynamic reinforcement learning neural networks. First, we introduce a dynamic batch size technique that adjusts data volume at each training phase and incorporates dynamic network structure search, considering the number of network layers and residual blocks. Second, we propose a dynamic network topology that automatically modifies the learning rate based on the training batch size at various training phases. Our approach outperforms the baseline in terms of stability and model convergence speed. In 100 games, the Go-AI model achieved a 100% victory rate below the 7th rank and a 98% win rate at the 9th rank and higher.	-
dc.description.sponsorship	This work was supported in part by the Brain Korea 21 (BK21) FOUR program of the National Research Foundation of Korea funded by the Ministry of Education (NRF5199991514504). (Corresponding author: Byeong-hee Roh, E-mail: bhroh@ajou.ac.kr) C. Zhang, J. Lim and B. Roh are with the Department of AI Convergence Network, Ajou University, Suwon, 16499, Korea. (E-mail: {cjz, wjdguszoqt, bhroh}@ajou.ac.kr) G. Shan is with the Department of Software and Computer Engineering, Ajou University, Suwon, 16499, Korea. (E-mail: shanyang166@ajou.ac.kr)	-
dc.language.iso	eng	-
dc.publisher	Institute of Electrical and Electronics Engineers Inc.	-
dc.subject.mesh	Adaptive	-
dc.subject.mesh	Adaptive adjustment	-
dc.subject.mesh	Adaptive optimization	-
dc.subject.mesh	Batch sizes	-
dc.subject.mesh	Dynamic reinforcements	-
dc.subject.mesh	Go AI	-
dc.subject.mesh	Learning rates	-
dc.subject.mesh	Neural-networks	-
dc.subject.mesh	Training phases	-
dc.subject.mesh	Weight	-
dc.title	Dynamic Reinforcement Learning for Optimal Go AI Training: Adaptive Adjustment and Optimization	-
dc.type	Article	-
dc.citation.title	IEEE Transactions on Consumer Electronics	-
dc.identifier.bibliographicCitation	IEEE Transactions on Consumer Electronics	-
dc.identifier.doi	10.1109/tce.2024.3487141	-
dc.identifier.scopusid	2-s2.0-85208278249	-
dc.identifier.url	https://ieeexplore.ieee.org/servlet/opac?punumber=30	-
dc.subject.keyword	adaptive	-
dc.subject.keyword	Go AI	-
dc.subject.keyword	learning rate	-
dc.subject.keyword	neural networks	-
dc.subject.keyword	weights	-
dc.description.isoa	false	-
dc.subject.subarea	Media Technology	-
dc.subject.subarea	Electrical and Electronic Engineering	-
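
The abstract describes a batch size that changes between training phases and a learning rate that follows the batch size. The record does not give the actual rule, so the sketch below is only an illustration: it assumes a step-indexed phase table and substitutes the widely used linear scaling heuristic (learning rate proportional to batch size). The names Phase, PHASES, batch_size_at and scaled_lr, and all numeric values, are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phase:
    """One training phase: the step it starts at and its batch size (hypothetical)."""
    start_step: int
    batch_size: int


# Assumed schedule, sorted by start_step; the values are illustrative only.
PHASES = [Phase(0, 256), Phase(1000, 512), Phase(5000, 1024)]


def batch_size_at(step, phases=PHASES):
    """Return the batch size of the latest phase that has started by `step`."""
    current = phases[0].batch_size
    for phase in phases:
        if step >= phase.start_step:
            current = phase.batch_size
    return current


def scaled_lr(batch_size, base_lr=0.01, base_batch=256):
    """Linear scaling heuristic (an assumption, not the paper's rule):
    the learning rate grows in proportion to the batch size."""
    return base_lr * batch_size / base_batch


if __name__ == "__main__":
    for step in (0, 1000, 5000):
        bs = batch_size_at(step)
        print(step, bs, scaled_lr(bs))
```

Under these assumptions, doubling the batch size between phases doubles the learning rate; a different coupling (for example square-root scaling) would only require changing scaled_lr.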

Related Researcher

SHAN, GAOYANG: Department of Software and Computer Engineering
