Citation Export
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Yeo, Sangho | - |
dc.contributor.author | Oh, Sangyoon | - |
dc.contributor.author | Lee, Minsu | - |
dc.date.issued | 2021-07-01 | - |
dc.identifier.issn | 1573-1413 | - |
dc.identifier.uri | https://aurora.ajou.ac.kr/handle/2018.oak/31153 | - |
dc.identifier.uri | https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85079434531&origin=inward | - |
dc.description.abstract | The use of demonstrations for deep reinforcement learning (RL) agents usually accelerates their training and guides them in learning complicated policies. Most current deep RL approaches with demonstrations assume that a sufficient amount of high-quality demonstrations is available. In most real-world learning cases, however, the available demonstrations are limited in both amount and quality. In this paper, we present an accelerated deep RL approach with dual replay buffer management and dynamic frame skipping on demonstrations. The dual replay buffer manager maintains a human replay buffer and an actor replay buffer with independent sampling policies. We also propose DFS-ER (Dynamic Frame Skipping-Experience Replay), which learns the action repetition factor of the demonstrations. DFS-ER accelerates deep RL by improving the efficiency of demonstration utilization, thereby yielding faster exploration of the environment. We verified the training acceleration against the conventional approach in three dense-reward environments and one sparse-reward environment. In our evaluation on Atari game environments, the proposed approach reduced training iterations by 21.7%-39.1% in a sparse-reward environment. (A minimal illustrative sketch of these two mechanisms follows the table.) | - |
dc.description.sponsorship | This research was supported by Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (NRF-2018R1D1A1B07043858, 2018R1D1A1B07049923), the supercomputing department at KISTI (Korea Institute of Science and Technology Information) (K-19-L02-C07-S01), and Technology Innovation Program (P0006720) funded by MOTIE, Korea | - |
dc.language.iso | eng | - |
dc.publisher | Springer | - |
dc.subject.mesh | Buffer management | - |
dc.subject.mesh | Conventional approach | - |
dc.subject.mesh | Dynamic frame | - |
dc.subject.mesh | Experience replay | - |
dc.subject.mesh | Imitation learning | - |
dc.subject.mesh | Real-world learning | - |
dc.subject.mesh | Reinforcement learning agent | - |
dc.subject.mesh | Training acceleration | - |
dc.title | Accelerated deep reinforcement learning with efficient demonstration utilization techniques | - |
dc.type | Article | - |
dc.citation.endPage | 1297 | - |
dc.citation.number | 4 | - |
dc.citation.startPage | 1275 | - |
dc.citation.title | World Wide Web | - |
dc.citation.volume | 24 | - |
dc.identifier.bibliographicCitation | World Wide Web, Vol.24 No.4, pp.1275-1297 | - |
dc.identifier.doi | 10.1007/s11280-019-00763-0 | - |
dc.identifier.scopusid | 2-s2.0-85079434531 | - |
dc.identifier.url | https://link.springer.com/journal/11280 | - |
dc.subject.keyword | Deep reinforcement learning | - |
dc.subject.keyword | Dynamic frame skipping | - |
dc.subject.keyword | Experience replay | - |
dc.subject.keyword | Imitation learning | - |
dc.type.other | Article | - |
dc.identifier.pissn | 1386-145X | - |
dc.description.isoa | false | - |
dc.subject.subarea | Software | - |
dc.subject.subarea | Hardware and Architecture | - |
dc.subject.subarea | Computer Networks and Communications | - |
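
The abstract names two mechanisms: a dual replay buffer manager that samples a human (demonstration) buffer and an actor buffer with independent policies, and DFS-ER, which learns the action repetition factor of demonstrations. The Python sketch below is a minimal illustration under stated assumptions, not the paper's implementation: the class and function names, the fixed `demo_ratio` mixing rule, and the run-length compression of repeated demonstration actions are all assumptions made here for clarity, since this record does not specify the actual sampling policies.

```python
import random
from collections import deque

class DualReplayBufferManager:
    """Illustrative sketch: keep a human (demonstration) buffer and an
    actor buffer, and draw each mini-batch from both with independent
    sampling policies. The fixed demo_ratio rule is an assumption."""

    def __init__(self, capacity=100_000, demo_ratio=0.25):
        self.human_buffer = deque(maxlen=capacity)  # demonstration transitions
        self.actor_buffer = deque(maxlen=capacity)  # agent-generated transitions
        self.demo_ratio = demo_ratio                # assumed fixed fraction drawn from demos

    def add_demo(self, transition):
        self.human_buffer.append(transition)

    def add_actor(self, transition):
        self.actor_buffer.append(transition)

    def sample(self, batch_size):
        # Sample each buffer independently, then mix the mini-batch.
        n_demo = min(int(batch_size * self.demo_ratio), len(self.human_buffer))
        n_actor = min(batch_size - n_demo, len(self.actor_buffer))
        batch = (random.sample(list(self.human_buffer), n_demo)
                 + random.sample(list(self.actor_buffer), n_actor))
        random.shuffle(batch)
        return batch


def compress_demo(demo):
    """DFS-ER-style preprocessing (assumed form): collapse runs of the
    same action in a demonstration into single transitions annotated
    with their repetition factor, so the agent can exploit how long
    each demonstrated action was held.

    `demo` is a list of (state, action, reward, next_state) steps.
    Returns (state, action, repeat, total_reward, next_state) tuples.
    """
    compressed, i = [], 0
    while i < len(demo):
        state, action, reward, next_state = demo[i]
        repeat, total_reward = 1, reward
        # Extend the run while the demonstrator keeps the same action.
        while i + repeat < len(demo) and demo[i + repeat][1] == action:
            total_reward += demo[i + repeat][2]
            next_state = demo[i + repeat][3]
            repeat += 1
        compressed.append((state, action, repeat, total_reward, next_state))
        i += repeat
    return compressed
```

A hypothetical training loop would fill the human buffer once with `compress_demo(...)` output, append the agent's own transitions via `add_actor(...)` as it acts, and call `sample(batch_size)` for each update, which is one plausible way the independent sampling of the two buffers could accelerate learning in sparse-reward settings.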