Citation Export
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Yeo, Sangho | - |
dc.contributor.author | Oh, Sangyoon | - |
dc.contributor.author | Lee, Minsu | - |
dc.date.issued | 2021-07-01 | - |
dc.identifier.issn | 1573-1413 | - |
dc.identifier.uri | https://aurora.ajou.ac.kr/handle/2018.oak/31153 | - |
dc.identifier.uri | https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85079434531&origin=inward | - |
dc.description.abstract | The use of demonstrations for deep reinforcement learning (RL) agents usually accelerates their training and guides them in learning complicated policies. Most current deep RL approaches with demonstrations assume that a sufficient amount of high-quality demonstrations is available. In most real-world learning cases, however, the available demonstrations are limited in both amount and quality. In this paper, we present an accelerated deep RL approach with dual replay buffer management and dynamic frame skipping on demonstrations. The dual replay buffer manager maintains a human replay buffer and an actor replay buffer with independent sampling policies. We also propose DFS-ER (Dynamic Frame Skipping-Experience Replay), which learns the action repetition factor of the demonstrations. DFS-ER accelerates deep RL by improving the efficiency of demonstration utilization, thereby yielding faster exploration of the environment. We verified the training acceleration against the conventional approach in three dense-reward environments and one sparse-reward environment. In our evaluation on Atari game environments, the proposed approach reduced training iterations by 21.7%-39.1% in a sparse-reward environment. (A minimal illustrative sketch of these two mechanisms follows the table.) | - |
dc.description.sponsorship | This research was supported by Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (NRF-2018R1D1A1B07043858, 2018R1D1A1B07049923), the supercomputing department at KISTI (Korea Institute of Science and Technology Information) (K-19-L02-C07-S01), and Technology Innovation Program (P0006720) funded by MOTIE, Korea | - |
dc.language.iso | eng | - |
dc.publisher | Springer | - |
dc.subject.mesh | Buffer management | - |
dc.subject.mesh | Conventional approach | - |
dc.subject.mesh | Dynamic frame | - |
dc.subject.mesh | Experience replay | - |
dc.subject.mesh | Imitation learning | - |
dc.subject.mesh | Real-world learning | - |
dc.subject.mesh | Reinforcement learning agent | - |
dc.subject.mesh | Training acceleration | - |
dc.title | Accelerated deep reinforcement learning with efficient demonstration utilization techniques | - |
dc.type | Article | - |
dc.citation.endPage | 1297 | - |
dc.citation.number | 4 | - |
dc.citation.startPage | 1275 | - |
dc.citation.title | World Wide Web | - |
dc.citation.volume | 24 | - |
dc.identifier.bibliographicCitation | World Wide Web, Vol.24 No.4, pp.1275-1297 | - |
dc.identifier.doi | 10.1007/s11280-019-00763-0 | - |
dc.identifier.scopusid | 2-s2.0-85079434531 | - |
dc.identifier.url | https://link.springer.com/journal/11280 | - |
dc.subject.keyword | Deep reinforcement learning | - |
dc.subject.keyword | Dynamic frame skipping | - |
dc.subject.keyword | Experience replay | - |
dc.subject.keyword | Imitation learning | - |
dc.type.other | Article | - |
dc.identifier.pissn | 1386-145X | - |
dc.description.isoa | false | - |
dc.subject.subarea | Software | - |
dc.subject.subarea | Hardware and Architecture | - |
dc.subject.subarea | Computer Networks and Communications | - |
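
The abstract names two mechanisms: a dual replay buffer manager that samples a human (demonstration) buffer and an actor buffer with independent policies, and DFS-ER, which learns the action repetition factor of demonstrations. The Python sketch below is a minimal illustration under stated assumptions, not the paper's implementation: the class and function names, the fixed `demo_ratio` mixing rule, and the run-length compression of repeated demonstration actions are all assumptions made here for clarity, since this record does not specify the actual sampling policies.

```python
import random
from collections import deque

class DualReplayBufferManager:
    """Illustrative sketch: keep a human (demonstration) buffer and an
    actor buffer, and draw each mini-batch from both with independent
    sampling policies. The fixed demo_ratio rule is an assumption."""

    def __init__(self, capacity=100_000, demo_ratio=0.25):
        self.human_buffer = deque(maxlen=capacity)  # demonstration transitions
        self.actor_buffer = deque(maxlen=capacity)  # agent-generated transitions
        self.demo_ratio = demo_ratio                # assumed fixed fraction drawn from demos

    def add_demo(self, transition):
        self.human_buffer.append(transition)

    def add_actor(self, transition):
        self.actor_buffer.append(transition)

    def sample(self, batch_size):
        # Sample each buffer independently, then mix the mini-batch.
        n_demo = min(int(batch_size * self.demo_ratio), len(self.human_buffer))
        n_actor = min(batch_size - n_demo, len(self.actor_buffer))
        batch = (random.sample(list(self.human_buffer), n_demo)
                 + random.sample(list(self.actor_buffer), n_actor))
        random.shuffle(batch)
        return batch


def compress_demo(demo):
    """DFS-ER-style preprocessing (assumed form): collapse runs of the
    same action in a demonstration into single transitions annotated
    with their repetition factor, so the agent can exploit how long
    each demonstrated action was held.

    `demo` is a list of (state, action, reward, next_state) steps.
    Returns (state, action, repeat, total_reward, next_state) tuples.
    """
    compressed, i = [], 0
    while i < len(demo):
        state, action, reward, next_state = demo[i]
        repeat, total_reward = 1, reward
        # Extend the run while the demonstrator keeps the same action.
        while i + repeat < len(demo) and demo[i + repeat][1] == action:
            total_reward += demo[i + repeat][2]
            next_state = demo[i + repeat][3]
            repeat += 1
        compressed.append((state, action, repeat, total_reward, next_state))
        i += repeat
    return compressed
```

A hypothetical training loop would fill the human buffer once with `compress_demo(...)` output, append the agent's own transitions via `add_actor(...)` as it acts, and call `sample(batch_size)` for each update, which is one plausible way the independent sampling of the two buffers could accelerate learning in sparse-reward settings.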