Citation Export
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Yeo, Sangho | - |
dc.contributor.author | Oh, Sangyoon | - |
dc.contributor.author | Lee, Minsu | - |
dc.date.issued | 2019-04-01 | - |
dc.identifier.uri | https://aurora.ajou.ac.kr/handle/2018.oak/36420 | - |
dc.identifier.uri | https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85064621678&origin=inward | - |
dc.description.abstract | Human demonstration data plays an important role in the early stage of deep reinforcement learning, both to accelerate the training process and to guide a reinforcement learning agent toward learning a complicated policy. However, most current reinforcement learning approaches that use human demonstration data and rewards assume a sufficient amount of high-quality human demonstration data, which is rarely true in real-world learning cases, where expert demonstration data is limited. To overcome this limitation, we propose a novel deep reinforcement learning approach with dual replay buffer management and online frame skipping for human demonstration data sampling. The dual replay buffer consists of a human replay memory, an actor replay memory, and a replay manager, and it manages the two replay buffers with independent sampling policies. We also propose online frame skipping to fully utilize the available human data. During training, frame skipping is performed dynamically on the human replay buffer, where all of the human data is stored. Two online frame-skipping methods, FS-ER (Frame Skipping-Experience Replay) and DFS-ER (Dynamic Frame Skipping-Experience Replay), are used to sample data from the human replay buffer. We conducted empirical experiments on four popular Atari games, and the results show that the two proposed online frame-skipping methods with dual replay memory outperform existing baselines. Specifically, DFS-ER shows the fastest score increase during reinforcement learning in three of the four experiments. FS-ER shows the best performance in the remaining environment, which is hard to train because of sparse rewards. | - |
dc.description.sponsorship | This research was jointly supported by the National Research Foundation of Korea (NRF) funded by the MSIT (NRF-2018R1D1A1B07043858, NRF-2018R1D1A1B07049923, and NRF-2015R1C1A1A01054305). | - |
dc.language.iso | eng | - |
dc.publisher | Institute of Electrical and Electronics Engineers Inc. | - |
dc.subject.mesh | Buffer management | - |
dc.subject.mesh | Empirical experiments | - |
dc.subject.mesh | Human demonstrations | - |
dc.subject.mesh | Imitation learning | - |
dc.subject.mesh | Real-world learning | - |
dc.subject.mesh | Reinforcement learning agent | - |
dc.subject.mesh | Reinforcement learning approach | - |
dc.subject.mesh | Training process | - |
dc.title | Accelerating Deep Reinforcement Learning Using Human Demonstration Data Based on Dual Replay Buffer Management and Online Frame Skipping | - |
dc.type | Conference | - |
dc.citation.conferenceDate | 2019.2.27. ~ 2019.3.2. | - |
dc.citation.conferenceName | 2019 IEEE International Conference on Big Data and Smart Computing, BigComp 2019 | - |
dc.citation.edition | 2019 IEEE International Conference on Big Data and Smart Computing, BigComp 2019 - Proceedings | - |
dc.citation.title | 2019 IEEE International Conference on Big Data and Smart Computing, BigComp 2019 - Proceedings | - |
dc.identifier.bibliographicCitation | 2019 IEEE International Conference on Big Data and Smart Computing, BigComp 2019 - Proceedings | - |
dc.identifier.doi | 10.1109/bigcomp.2019.8679366 | - |
dc.identifier.scopusid | 2-s2.0-85064621678 | - |
dc.identifier.url | http://ieeexplore.ieee.org/xpl/mostRecentIssue.jsp?punumber=8671661 | - |
dc.subject.keyword | Deep Learning | - |
dc.subject.keyword | Imitation Learning | - |
dc.subject.keyword | Reinforcement Learning | - |
dc.type.other | Conference Paper | - |
dc.description.isoa | false | - |
dc.subject.subarea | Information Systems and Management | - |
dc.subject.subarea | Artificial Intelligence | - |
dc.subject.subarea | Computer Networks and Communications | - |
dc.subject.subarea | Information Systems | - |
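For illustration only, the dual replay buffer described in the abstract — a fixed human demonstration memory and a growing actor memory, sampled under independent policies, with frame skipping applied to the human buffer — might be sketched as below. The class name, the `human_ratio` and `skip` parameters, and the stride-based skipping rule are assumptions for this sketch, not the paper's implementation.

```python
import random
from collections import deque


class DualReplayBuffer:
    """Illustrative dual replay buffer: a fixed human demonstration memory and
    an actor (agent) memory, sampled with independent policies. The frame-skip
    rule here is an assumed simplification, not the paper's FS-ER/DFS-ER."""

    def __init__(self, human_data, capacity=10_000, human_ratio=0.25, skip=4):
        self.human = list(human_data)        # limited, fixed demonstration data
        self.actor = deque(maxlen=capacity)  # grows during training
        self.human_ratio = human_ratio       # fraction of each batch from human data
        self.skip = skip                     # frame-skip stride for human sampling

    def add(self, transition):
        """Store a transition collected by the learning agent."""
        self.actor.append(transition)

    def sample(self, batch_size):
        """Draw one mixed batch: frame-skipped human data plus actor data."""
        n_human = min(int(batch_size * self.human_ratio), len(self.human))
        # Assumed frame skipping: draw human transitions only at every
        # `skip`-th frame, stretching the limited demonstrations further.
        strided = self.human[:: self.skip]
        human_batch = random.sample(strided, min(n_human, len(strided)))
        n_actor = min(batch_size - len(human_batch), len(self.actor))
        actor_batch = random.sample(list(self.actor), n_actor)
        return human_batch + actor_batch
```

With `human_ratio=0.25` and a batch of 8, a sample mixes 2 frame-skipped human transitions with 6 actor transitions, each buffer being sampled under its own policy as the abstract describes.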
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.