Ajou University repository

On-policy Deep Reinforcement Learning for HPC Job Scheduling
  • 정현석
Advisor
Sangyoon Oh
Affiliation
Graduate School, Ajou University
Department
Department of Artificial Intelligence, Graduate School
Publication Year
2024-02
Publisher
The Graduate School, Ajou University
Keyword
deep reinforcement learning; high-performance computing; job scheduling; self-attention
Description
Thesis (Master's)--Department of Artificial Intelligence, 2024. 2
Abstract
Job scheduling in High-Performance Computing (HPC) systems is a crucial task that determines how computational resources are allocated. Traditional heuristic algorithms often fail to fully capture the complexity of job scheduling, and reinforcement learning (RL) offers a promising alternative. However, the performance of on-policy RL algorithms can be significantly influenced by the job data, leading to variability in performance. To enhance performance stability, we propose a novel dynamic data selection method: we predict the reward value using a tree-based machine learning model and select the data based on this prediction. This data selection process refines the input to the RL algorithm, improving performance stability. Furthermore, we introduce a self-attention-based on-policy network for job scheduling in HPC systems, which utilizes the selected data more effectively when formulating policies. We validate the proposed method through experiments on real-world job log data from HPC systems, comparing its performance with that of heuristic scheduling algorithms. The results confirm the effectiveness of our approach in enhancing performance stability across real-world workloads and improving the overall performance of on-policy RL algorithms.
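The data selection idea in the abstract (predict a reward with a tree-based model, then select data based on that prediction) can be illustrated with a minimal sketch. The abstract does not specify the features, the model, or the selection rule, so everything here is an assumption made for illustration: a depth-1 regression tree stands in for the tree-based model, the two per-batch features and the reward values are made up, and the threshold rule in `select_batches` is hypothetical.

```python
# Illustrative sketch only: the features, the depth-1 tree, and the selection
# rule are assumptions, not the method described in the thesis.
from statistics import mean


def fit_stump(features, rewards):
    """Fit a depth-1 regression tree (one feature, one threshold) by squared error."""
    best = None
    n_features = len(features[0])
    for j in range(n_features):
        values = sorted({row[j] for row in features})
        # Candidate thresholds lie midway between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2
            left = [r for row, r in zip(features, rewards) if row[j] <= threshold]
            right = [r for row, r in zip(features, rewards) if row[j] > threshold]
            left_mean, right_mean = mean(left), mean(right)
            error = sum((r - left_mean) ** 2 for r in left) + sum(
                (r - right_mean) ** 2 for r in right
            )
            if best is None or error < best[0]:
                best = (error, j, threshold, left_mean, right_mean)
    if best is None:
        raise ValueError("need at least two distinct values in some feature")
    _, j, threshold, left_mean, right_mean = best
    return j, threshold, left_mean, right_mean


def predict(stump, row):
    """Return the mean reward of the leaf that the row falls into."""
    j, threshold, left_mean, right_mean = stump
    return left_mean if row[j] <= threshold else right_mean


def select_batches(stump, candidates, min_reward):
    """Keep candidate batches whose predicted reward is at least min_reward."""
    return [row for row in candidates if predict(stump, row) >= min_reward]


# Hypothetical per-batch features: (mean requested nodes, mean runtime in hours).
history = [(2.0, 1.0), (4.0, 1.5), (64.0, 10.0), (128.0, 12.0)]
observed_rewards = [-1.0, -1.2, -9.0, -11.0]  # made-up rewards, e.g. negative wait

stump = fit_stump(history, observed_rewards)
candidates = [(3.0, 1.2), (100.0, 11.0), (8.0, 2.0)]
selected = select_batches(stump, candidates, min_reward=-5.0)
print(selected)
```

In this toy setup the selected batches would be the ones passed on to the on-policy learner. A real pipeline would likely use a stronger tree ensemble and a selection rule tuned to the workload.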
Language
eng
URI
https://aurora.ajou.ac.kr/handle/2018.oak/39352
Journal URL
https://dcoll.ajou.ac.kr/dcollection/common/orgView/000000033345