DC Field | Value | Language |
---|---|---|
dc.contributor.author | Zhang, Chunjiong | - |
dc.contributor.author | Shan, Gaoyang | - |
dc.contributor.author | Lim, Junghyun | - |
dc.contributor.author | Roh, Byeong Hee | - |
dc.date.issued | 2024-01-01 | - |
dc.identifier.uri | https://dspace.ajou.ac.kr/dev/handle/2018.oak/34577 | - |
dc.description.abstract | Go is a popular strategy game today, but due to its large search space and task complexity, ensuring stable AI implementation is challenging. Specifically, Go AI training requires setting a fixed optimal learning rate and schedule, which demands significant TPU and GPU resources. To facilitate Go-AI learning, this research explores adaptive adjustment and optimization techniques for dynamic reinforcement learning neural networks. First, we introduce a dynamic batch size technique that adjusts data volume at each training phase and incorporates dynamic network structure search, considering the number of network layers and residual blocks. Second, we propose a dynamic network topology that automatically modifies the learning rate based on the training batch size at various training phases. Our approach outperforms the baseline in terms of stability and model convergence speed. In 100 games, the Go-AI model achieved a 100% victory rate below the 7th rank and a 98% win rate at the 9th rank and higher. | - |
dc.description.sponsorship | This work was supported in part by the Brain Korea 21 (BK21) FOUR program of the National Research Foundation of Korea, funded by the Ministry of Education (NRF5199991514504). Corresponding author: Byeong-hee Roh (e-mail: bhroh@ajou.ac.kr). C. Zhang, J. Lim, and B. Roh are with the Department of AI Convergence Network, Ajou University, Suwon 16499, Korea (e-mail: {cjz, wjdguszoqt, bhroh}@ajou.ac.kr). G. Shan is with the Department of Software and Computer Engineering, Ajou University, Suwon 16499, Korea (e-mail: shanyang166@ajou.ac.kr). | - |
dc.language.iso | eng | - |
dc.publisher | Institute of Electrical and Electronics Engineers Inc. | - |
dc.subject.mesh | Adaptive | - |
dc.subject.mesh | Adaptive adjustment | - |
dc.subject.mesh | Adaptive optimization | - |
dc.subject.mesh | Batch sizes | - |
dc.subject.mesh | Dynamic reinforcements | - |
dc.subject.mesh | Go AI | - |
dc.subject.mesh | Learning rates | - |
dc.subject.mesh | Neural-networks | - |
dc.subject.mesh | Training phases | - |
dc.subject.mesh | Weight | - |
dc.title | Dynamic Reinforcement Learning for Optimal Go AI Training: Adaptive Adjustment and Optimization | - |
dc.type | Article | - |
dc.citation.title | IEEE Transactions on Consumer Electronics | - |
dc.identifier.bibliographicCitation | IEEE Transactions on Consumer Electronics | - |
dc.identifier.doi | 10.1109/tce.2024.3487141 | - |
dc.identifier.scopusid | 2-s2.0-85208278249 | - |
dc.identifier.url | https://ieeexplore.ieee.org/servlet/opac?punumber=30 | - |
dc.subject.keyword | adaptive | - |
dc.subject.keyword | Go AI | - |
dc.subject.keyword | learning rate | - |
dc.subject.keyword | neural networks | - |
dc.subject.keyword | weights | - |
dc.description.isoa | false | - |
dc.subject.subarea | Media Technology | - |
dc.subject.subarea | Electrical and Electronic Engineering | - |