| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | Sangyoon Oh | - |
| dc.contributor.author | 정현석 | - |
| dc.date.issued | 2024-02 | - |
| dc.identifier.other | 33345 | - |
| dc.identifier.uri | https://aurora.ajou.ac.kr/handle/2018.oak/39352 | - |
| dc.description | Master's thesis -- Department of Artificial Intelligence, February 2024 | - |
| dc.description.abstract | Job scheduling in High-Performance Computing (HPC) systems is a crucial task that determines the allocation of computational resources. Traditional heuristic algorithms often fail to fully capture the complexity of job scheduling. Reinforcement learning (RL) offers promising advancements; however, the performance of on-policy RL algorithms can be significantly influenced by the job data, leading to variability in performance. To enhance performance stability, we propose a novel dynamic data selection method: we predict the reward value using a tree-based machine learning model and select training data based on this prediction. This data selection process refines the input to the RL algorithm, improving performance stability. Furthermore, we introduce a self-attention-based on-policy network for job scheduling in HPC systems, which makes more effective use of the selected data when formulating policies. We validate the proposed method through experiments on real-world job log data from HPC systems, comparing its performance with that of heuristic scheduling algorithms. The results confirm that our approach enhances performance stability across real-world workloads and improves the overall performance of on-policy RL algorithms. (Illustrative sketches of the data selection step and the self-attention network appear after this table.) | - |
| dc.description.tableofcontents | Ⅰ INTRODUCTION <br>Ⅱ RELATED WORKS <br> 1. HPC Job Scheduling <br> 2. Reinforcement Learning-based Job Scheduling <br> 3. Data Selection for Reinforcement Learning <br>Ⅲ BACKGROUND <br> 1. Overview of Reinforcement Learning <br> 2. Off-Policy and On-Policy Reinforcement Learning <br> 3. Proximal Policy Optimization <br> 4. Self-Attention Mechanism <br>Ⅳ DYNAMIC DATA SELECTION WITH DEEP REINFORCEMENT LEARNING AGENT <br> 1. Dynamic Data Selection <br> 2. The Complexity of the DS <br> 3. Self-Attention-based Actor-Critic Network <br> 4. Data Selection and Self-Attention Actor-Critic Network Algorithm <br>Ⅴ EXPERIMENTS <br> A. Experiment Setup <br> 1. HPC Job Data <br> 2. Compared Algorithms <br> 3. DS-DRL Evaluation <br> 4. Evaluation Metrics <br> B. Experimental Results and Analysis <br> 1. Evaluation of Dynamic Data Selection in Reward Prediction <br> 2. Impact of Data Selection Method on System Performance <br> 3. Comparative Analysis of Scheduling Algorithms on Average Bounded Slowdown <br> 4. Comparative Analysis of Scheduling Algorithms on Waiting Time <br> 5. Comparative Evaluation with Other Real-World Datasets <br>Ⅵ CONCLUSION <br>REFERENCES | - |
| dc.language.iso | eng | - |
| dc.publisher | The Graduate School, Ajou University | - |
| dc.rights | Ajou University theses are protected by copyright. | - |
| dc.title | On-policy Deep Reinforcement Learning for HPC Job Scheduling | - |
| dc.type | Thesis | - |
| dc.contributor.affiliation | Ajou University Graduate School | - |
| dc.contributor.alternativeName | Hyunseok Jung | - |
| dc.contributor.department | Department of Artificial Intelligence, Graduate School | - |
| dc.date.awarded | 2024-02 | - |
| dc.description.degree | Master | - |
| dc.identifier.url | https://dcoll.ajou.ac.kr/dcollection/common/orgView/000000033345 | - |
| dc.subject.keyword | deep reinforcement learning | - |
| dc.subject.keyword | high-performance computing | - |
| dc.subject.keyword | job scheduling | - |
| dc.subject.keyword | self-attention | - |
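
The abstract describes predicting reward with a tree-based model and selecting training data from that prediction. The following is a minimal sketch of that idea, not the thesis's actual code: the feature set, the choice of `GradientBoostingRegressor`, the top-fraction selection rule, and all names (`sequence_features`, `DynamicDataSelector`, `keep_fraction`) are illustrative assumptions.

```python
# Hypothetical sketch: a tree-based regressor predicts the reward an
# on-policy agent would obtain on a candidate job sequence, and only the
# most promising sequences are kept for training. Features and the
# selection rule are assumptions for illustration.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

def sequence_features(job_seq):
    """Summarize a job sequence (runtimes, requested cores, arrival gaps)."""
    runtimes = np.array([j["runtime"] for j in job_seq], dtype=float)
    cores = np.array([j["cores"] for j in job_seq], dtype=float)
    arrivals = np.array([j["arrival"] for j in job_seq], dtype=float)
    gaps = np.diff(arrivals, prepend=arrivals[0])
    return np.array([runtimes.mean(), runtimes.std(), cores.mean(),
                     cores.std(), gaps.mean(), len(job_seq)])

class DynamicDataSelector:
    def __init__(self, keep_fraction=0.5):
        self.model = GradientBoostingRegressor()
        self.keep_fraction = keep_fraction

    def fit(self, sequences, observed_rewards):
        # Learn to map sequence summaries to the rewards the agent achieved.
        X = np.stack([sequence_features(s) for s in sequences])
        self.model.fit(X, observed_rewards)

    def select(self, sequences):
        # Keep the fraction of sequences with the highest predicted reward.
        X = np.stack([sequence_features(s) for s in sequences])
        pred = self.model.predict(X)
        k = max(1, int(self.keep_fraction * len(sequences)))
        keep = np.argsort(pred)[-k:]
        return [sequences[i] for i in keep]
```

The selected sequences would then be fed to the on-policy agent in place of the raw job log, which is how the abstract says the input to the RL algorithm is refined.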
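
The abstract also mentions a self-attention-based on-policy (actor-critic) network. Below is a minimal PyTorch sketch of one plausible shape for such a network, under stated assumptions: each queued job is embedded, a self-attention layer lets jobs attend to one another, the actor scores each job as the next one to dispatch, and the critic pools the sequence into a state value for PPO. Layer sizes, the feature dimension, and the pooling choice are illustrative, not taken from the thesis.

```python
# Minimal sketch of a self-attention actor-critic for job scheduling.
# All dimensions and names are assumptions for illustration.
import torch
import torch.nn as nn

class AttentionActorCritic(nn.Module):
    def __init__(self, job_feat_dim=6, embed_dim=64, n_heads=4):
        super().__init__()
        self.embed = nn.Linear(job_feat_dim, embed_dim)
        self.attn = nn.MultiheadAttention(embed_dim, n_heads, batch_first=True)
        self.actor = nn.Linear(embed_dim, 1)   # per-job score -> policy logits
        self.critic = nn.Linear(embed_dim, 1)  # pooled state value

    def forward(self, jobs):
        # jobs: (batch, n_jobs, job_feat_dim) features of the pending queue
        h = torch.relu(self.embed(jobs))
        h, _ = self.attn(h, h, h)               # jobs attend to each other
        logits = self.actor(h).squeeze(-1)      # (batch, n_jobs)
        value = self.critic(h.mean(dim=1)).squeeze(-1)  # (batch,)
        return torch.distributions.Categorical(logits=logits), value

# Usage: sample which queued job to dispatch and get the critic's baseline.
net = AttentionActorCritic()
states = torch.randn(2, 8, 6)   # 2 queue snapshots, 8 jobs, 6 features each
dist, value = net(states)
action = dist.sample()          # index of the job to schedule next
```

Treating the queue as a set and scoring each job lets the policy handle a variable number of pending jobs, which is presumably why a self-attention encoder is paired with a per-job actor head here.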