| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | Sangyoon Oh | - |
| dc.contributor.author | 정현석 | - |
| dc.date.issued | 2024-02 | - |
| dc.identifier.other | 33345 | - |
| dc.identifier.uri | https://aurora.ajou.ac.kr/handle/2018.oak/39352 | - |
| dc.description | Master's thesis -- Department of Artificial Intelligence, February 2024 | - |
| dc.description.abstract | Job scheduling in High-Performance Computing (HPC) systems is a crucial task that determines the allocation of computational resources. Traditional heuristic algorithms often fail to fully capture the complexity of job scheduling. Reinforcement learning (RL) offers promising advancements; however, the performance of on-policy RL algorithms can be significantly influenced by the job data, leading to variability in performance. To enhance performance stability, we propose a novel dynamic data selection method: we predict the reward value using a tree-based machine learning model and select training data based on this prediction. This data selection process refines the input to the RL algorithm, improving performance stability. Furthermore, we introduce a self-attention-based on-policy network for job scheduling in HPC systems, which makes more effective use of the selected data when formulating policies. We validate the proposed method through experiments on real-world job log data from HPC systems, comparing its performance with that of heuristic scheduling algorithms. The results confirm that our approach enhances performance stability across real-world workloads and improves the overall performance of on-policy RL algorithms. (Illustrative sketches of the data selection step and the self-attention network appear after this table.) | - |
| dc.description.tableofcontents | Ⅰ INTRODUCTION <br>Ⅱ RELATED WORKS <br> 1. HPC Job Scheduling <br> 2. Reinforcement Learning-based Job Scheduling <br> 3. Data Selection for Reinforcement Learning <br>Ⅲ BACKGROUND <br> 1. Overview of Reinforcement Learning <br> 2. Off-Policy and On-Policy Reinforcement Learning <br> 3. Proximal Policy Optimization <br> 4. Self-Attention Mechanism <br>Ⅳ DYNAMIC DATA SELECTION WITH DEEP REINFORCEMENT LEARNING AGENT <br> 1. Dynamic Data Selection <br> 2. The Complexity of the DS <br> 3. Self-Attention-based Actor-Critic Network <br> 4. Data Selection and Self-Attention Actor-Critic Network Algorithm <br>Ⅴ EXPERIMENTS <br> A. Experiment Setup <br> 1. HPC Job Data <br> 2. Compared Algorithms <br> 3. DS-DRL Evaluation <br> 4. Evaluation Metrics <br> B. Experimental Results and Analysis <br> 1. Evaluation of Dynamic Data Selection in Reward Prediction <br> 2. Impact of Data Selection Method on System Performance <br> 3. Comparative Analysis of Scheduling Algorithms on Average Bounded Slowdown <br> 4. Comparative Analysis of Scheduling Algorithms on Waiting Time <br> 5. Comparative Evaluation with Other Real-World Datasets <br>Ⅵ CONCLUSION <br>REFERENCES | - |
| dc.language.iso | eng | - |
| dc.publisher | The Graduate School, Ajou University | - |
| dc.rights | Ajou University theses are protected by copyright. | - |
| dc.title | On-policy Deep Reinforcement Learning for HPC Job Scheduling | - |
| dc.type | Thesis | - |
| dc.contributor.affiliation | Ajou University Graduate School | - |
| dc.contributor.alternativeName | Hyunseok Jung | - |
| dc.contributor.department | Department of Artificial Intelligence, Graduate School | - |
| dc.date.awarded | 2024-02 | - |
| dc.description.degree | Master | - |
| dc.identifier.url | https://dcoll.ajou.ac.kr/dcollection/common/orgView/000000033345 | - |
| dc.subject.keyword | deep reinforcement learning | - |
| dc.subject.keyword | high-performance computing | - |
| dc.subject.keyword | job scheduling | - |
| dc.subject.keyword | self-attention | - |
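
The abstract describes predicting reward with a tree-based model and selecting training data from that prediction. The following is a minimal sketch of that idea, not the thesis's actual code: the feature set, the choice of `GradientBoostingRegressor`, the top-fraction selection rule, and all names (`sequence_features`, `DynamicDataSelector`, `keep_fraction`) are illustrative assumptions.

```python
# Hypothetical sketch: a tree-based regressor predicts the reward an
# on-policy agent would obtain on a candidate job sequence, and only the
# most promising sequences are kept for training. Features and the
# selection rule are assumptions for illustration.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

def sequence_features(job_seq):
    """Summarize a job sequence (runtimes, requested cores, arrival gaps)."""
    runtimes = np.array([j["runtime"] for j in job_seq], dtype=float)
    cores = np.array([j["cores"] for j in job_seq], dtype=float)
    arrivals = np.array([j["arrival"] for j in job_seq], dtype=float)
    gaps = np.diff(arrivals, prepend=arrivals[0])
    return np.array([runtimes.mean(), runtimes.std(), cores.mean(),
                     cores.std(), gaps.mean(), len(job_seq)])

class DynamicDataSelector:
    def __init__(self, keep_fraction=0.5):
        self.model = GradientBoostingRegressor()
        self.keep_fraction = keep_fraction

    def fit(self, sequences, observed_rewards):
        # Learn to map sequence summaries to the rewards the agent achieved.
        X = np.stack([sequence_features(s) for s in sequences])
        self.model.fit(X, observed_rewards)

    def select(self, sequences):
        # Keep the fraction of sequences with the highest predicted reward.
        X = np.stack([sequence_features(s) for s in sequences])
        pred = self.model.predict(X)
        k = max(1, int(self.keep_fraction * len(sequences)))
        keep = np.argsort(pred)[-k:]
        return [sequences[i] for i in keep]
```

The selected sequences would then be fed to the on-policy agent in place of the raw job log, which is how the abstract says the input to the RL algorithm is refined.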
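
The abstract also mentions a self-attention-based on-policy (actor-critic) network. Below is a minimal PyTorch sketch of one plausible shape for such a network, under stated assumptions: each queued job is embedded, a self-attention layer lets jobs attend to one another, the actor scores each job as the next one to dispatch, and the critic pools the sequence into a state value for PPO. Layer sizes, the feature dimension, and the pooling choice are illustrative, not taken from the thesis.

```python
# Minimal sketch of a self-attention actor-critic for job scheduling.
# All dimensions and names are assumptions for illustration.
import torch
import torch.nn as nn

class AttentionActorCritic(nn.Module):
    def __init__(self, job_feat_dim=6, embed_dim=64, n_heads=4):
        super().__init__()
        self.embed = nn.Linear(job_feat_dim, embed_dim)
        self.attn = nn.MultiheadAttention(embed_dim, n_heads, batch_first=True)
        self.actor = nn.Linear(embed_dim, 1)   # per-job score -> policy logits
        self.critic = nn.Linear(embed_dim, 1)  # pooled state value

    def forward(self, jobs):
        # jobs: (batch, n_jobs, job_feat_dim) features of the pending queue
        h = torch.relu(self.embed(jobs))
        h, _ = self.attn(h, h, h)               # jobs attend to each other
        logits = self.actor(h).squeeze(-1)      # (batch, n_jobs)
        value = self.critic(h.mean(dim=1)).squeeze(-1)  # (batch,)
        return torch.distributions.Categorical(logits=logits), value

# Usage: sample which queued job to dispatch and get the critic's baseline.
net = AttentionActorCritic()
states = torch.randn(2, 8, 6)   # 2 queue snapshots, 8 jobs, 6 features each
dist, value = net(states)
action = dist.sample()          # index of the job to schedule next
```

Treating the queue as a set and scoring each job lets the policy handle a variable number of pending jobs, which is presumably why a self-attention encoder is paired with a per-job actor head here.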