This study demonstrates that the Harmony Search Algorithm (HSA) is effective for hyperparameter optimization in Deep Reinforcement Learning (DeepRL) in environments with well-designed reward functions. To address the reproducibility issue in DeepRL, the algorithm was modified so that the best parameters in each generation are adopted independently of the harmony memory consideration rate (HMCR) and are not altered by the pitch adjustment rate (PAR). The objective function was set to either the cumulative reward or the terminal reward, depending on the environment. The hyperparameters of the PPO algorithm and of the actor-critic network were optimized in five different environments. The results show that the Harmony Search Algorithm can optimize hyperparameters even in large, complex environments with substantial interactions, provided the reward function is well-designed.
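To make the described modification concrete, the following is a minimal sketch of one possible reading of the modified Harmony Search loop: values taken from the current best harmony are adopted regardless of HMCR and are never pitch-adjusted. The function and parameter names (`bounds`, `evaluate`, `hm_size`, `hmcr`, `par`, `bandwidth`, `n_iters`), as well as this exact interpretation of the modification, are assumptions for illustration and not the authors' implementation; `evaluate` stands for one DeepRL training run that returns the cumulative or terminal reward.

```python
import random

def modified_harmony_search(bounds, evaluate, hm_size=10, hmcr=0.9, par=0.3,
                            bandwidth=0.05, n_iters=100, seed=0):
    """Sketch of a Harmony Search loop with the two modifications described
    above (hypothetical reading): the best harmony's parameters are adopted
    independently of HMCR, and values copied from the best harmony skip
    pitch adjustment. `evaluate` maps a parameter vector to the objective
    (cumulative or terminal reward), which is maximized."""
    rng = random.Random(seed)

    # Initialise harmony memory with random parameter vectors within bounds.
    memory = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(hm_size)]
    scores = [evaluate(h) for h in memory]

    for _ in range(n_iters):
        best = memory[max(range(hm_size), key=lambda i: scores[i])]
        new = []
        for d, (lo, hi) in enumerate(bounds):
            if rng.random() < hmcr:
                # Standard memory consideration: draw the value from a harmony.
                src = rng.choice(memory)
                value = src[d]
                # Modification: pitch-adjust only values not taken from the best harmony.
                if src is not best and rng.random() < par:
                    value = value + rng.uniform(-1.0, 1.0) * bandwidth * (hi - lo)
                    value = min(hi, max(lo, value))
            else:
                # Modification (assumed): adopt the best harmony's value
                # instead of a purely random draw, independent of HMCR.
                value = best[d]
            new.append(value)

        # Replace the worst harmony if the new candidate scores higher.
        new_score = evaluate(new)
        worst = min(range(hm_size), key=lambda i: scores[i])
        if new_score > scores[worst]:
            memory[worst], scores[worst] = new, new_score

    best_idx = max(range(hm_size), key=lambda i: scores[i])
    return memory[best_idx], scores[best_idx]
```

In this sketch, each call to `evaluate` would correspond to training a PPO agent with the candidate hyperparameters and returning its reward, so the memory size and iteration count directly control the optimization budget.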