Ajou University repository

Accelerated deep reinforcement learning with efficient demonstration utilization techniques

Publication Year
2021-07-01
Journal
World Wide Web
Publisher
Springer
Citation
World Wide Web, Vol.24 No.4, pp.1275-1297
Keyword
Deep reinforcement learning; Dynamic frame skipping; Experience replay; Imitation learning
Mesh Keyword
Buffer management; Conventional approach; Dynamic frame; Experience replay; Imitation learning; Real-world learning; Reinforcement learning agent; Training acceleration
All Science Classification Codes (ASJC)
Software; Hardware and Architecture; Computer Networks and Communications
Abstract
Using demonstrations for deep reinforcement learning (RL) agents usually accelerates their training and guides the agents toward learning complicated policies. Most current deep RL approaches with demonstrations assume that a sufficient number of high-quality demonstrations is available. However, in most real-world learning cases, the available demonstrations are limited in both quantity and quality. In this paper, we present an accelerated deep RL approach with dual replay buffer management and dynamic frame skipping on demonstrations. The dual replay buffer manager manages a human replay buffer and an actor replay buffer with independent sampling policies. We also propose DFS-ER (Dynamic Frame Skipping-Experience Replay), a dynamic frame skipping technique for demonstrations that learns the action repetition factor of the demonstrations. By implementing DFS-ER, we can accelerate deep RL by improving the efficiency of demonstration utilization, thereby yielding faster exploration of the environment. We verified the training acceleration in three dense reward environments and one sparse reward environment against the conventional approach. In our evaluation using the Atari game environments, the proposed approach showed a 21.7%-39.1% reduction in training iterations in a sparse reward environment.
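The two ideas named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the sampling policies or how the action repetition factor is learned, so the fixed mixing ratio `human_ratio`, the uniform sampling, and the run-merging helper `compress_repeats` (which only counts repeated actions rather than learning a factor) are all hypothetical stand-ins chosen for illustration.

```python
import random
from collections import deque


class DualReplayBuffer:
    """Two separate buffers, each sampled on its own (assumed) rule."""

    def __init__(self, human_capacity, actor_capacity, human_ratio=0.25, seed=0):
        self.human = deque(maxlen=human_capacity)  # demonstration transitions
        self.actor = deque(maxlen=actor_capacity)  # the agent's own transitions
        self.human_ratio = human_ratio             # assumption: fixed share per batch
        self.rng = random.Random(seed)

    def add_human(self, transition):
        self.human.append(transition)

    def add_actor(self, transition):
        self.actor.append(transition)

    def sample(self, batch_size):
        # Assumption: uniform sampling inside each buffer, fixed split between them.
        n_human = min(len(self.human), round(batch_size * self.human_ratio))
        n_actor = min(len(self.actor), batch_size - n_human)
        batch = self.rng.sample(list(self.human), n_human)
        batch += self.rng.sample(list(self.actor), n_actor)
        return batch


def compress_repeats(demo, max_repeat=4):
    """Merge consecutive identical actions in a demonstration.

    Input items are (state, action, reward, next_state); output items are
    (state, action, repeat, summed_reward, next_state). Summing rewards
    without discounting is a simplification made for this sketch.
    """
    merged = []
    for state, action, reward, next_state in demo:
        if merged and merged[-1][1] == action and merged[-1][2] < max_repeat:
            s, a, k, r, _ = merged[-1]
            merged[-1] = (s, a, k + 1, r + reward, next_state)
        else:
            merged.append((state, action, 1, reward, next_state))
    return merged
```

In this sketch, a demonstration would first pass through `compress_repeats` and then be stored with `add_human`, while the agent's own transitions go to `add_actor`; each training batch then mixes the two sources through `sample`.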
ISSN
1573-1413
Language
eng
URI
https://aurora.ajou.ac.kr/handle/2018.oak/31153
https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85079434531&origin=inward
DOI
https://doi.org/10.1007/s11280-019-00763-0
Journal URL
https://link.springer.com/journal/11280
Type
Article
Funding
This research was supported by Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (NRF-2018R1D1A1B07043858, 2018R1D1A1B07049923), the supercomputing department at KISTI (Korea Institute of Science and Technology Information) (K-19-L02-C07-S01), and Technology Innovation Program (P0006720) funded by MOTIE, Korea.

Related Researcher

Oh, Sangyoon (오상윤)
Department of Software and Computer Engineering
