Object detection is a critical component of autonomous driving systems, requiring robust performance across diverse lighting conditions, including nighttime scenarios where RGB cameras underperform due to low visibility. Thermal imaging overcomes this limitation by capturing infrared radiation, enabling reliable object detection in low-light conditions. However, the development of deep learning models for thermal object detection is hindered by the scarcity of large-scale annotated thermal datasets. To address this challenge, we propose a novel approach that improves data efficiency for thermal object detection through domain separation. Our method introduces a Domain Invariant Encoder (DIE) and a Domain Specific Encoder (DSE). The DIE maps thermal images into a feature space shared with RGB images by minimizing the H-divergence between their feature distributions via adversarial learning; this alignment enables knowledge transfer from pre-trained RGB models to the thermal domain. Concurrently, the DSE captures characteristics unique to the thermal modality by learning domain-specific features. We integrate the invariant and specific features through a cross-attention mechanism, allowing the model to exploit both shared and domain-unique information. Evaluated on the FLIR ADAS dataset, our approach achieves state-of-the-art performance with limited thermal data, yielding significant gains in mean Average Precision (mAP) over baseline models. By addressing the scarcity of annotated thermal data, this work advances thermal imaging applications in autonomous driving.
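To make the described pipeline concrete, the following PyTorch-style sketch shows one plausible arrangement of the components named above: a DIE and a DSE feeding a cross-attention fusion block, with a domain discriminator trained through a gradient-reversal layer, a standard trick for adversarially minimizing H-divergence. All module names, layer sizes, and the assumption that thermal frames are replicated to three channels are illustrative; this is not the authors' implementation.

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Gradient reversal: identity in the forward pass, negated (scaled)
    gradient in the backward pass, so the encoder is trained to fool the
    domain discriminator (adversarial alignment)."""
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None

def grad_reverse(x, lambd=1.0):
    return GradReverse.apply(x, lambd)

class ConvEncoder(nn.Module):
    """Small conv backbone standing in for both the Domain Invariant
    Encoder (DIE) and the Domain Specific Encoder (DSE)."""
    def __init__(self, in_ch=3, dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, dim, 3, stride=2, padding=1), nn.ReLU(),
        )

    def forward(self, x):
        return self.net(x)  # (B, dim, H/4, W/4)

class DomainDiscriminator(nn.Module):
    """Predicts RGB vs. thermal from DIE features; gradients flow back
    through the reversal layer, pushing the DIE toward domain-invariant
    features (i.e., reducing H-divergence between the two domains)."""
    def __init__(self, dim=128):
        super().__init__()
        self.head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(dim, 64), nn.ReLU(), nn.Linear(64, 1),
        )

    def forward(self, feat, lambd=1.0):
        return self.head(grad_reverse(feat, lambd))  # domain logit

class CrossAttentionFusion(nn.Module):
    """Fuses the two feature streams: invariant (DIE) features act as
    queries and attend to the domain-specific (DSE) features."""
    def __init__(self, dim=128, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, inv, spec):
        B, C, H, W = inv.shape
        q = inv.flatten(2).transpose(1, 2)    # (B, HW, C) queries
        kv = spec.flatten(2).transpose(1, 2)  # (B, HW, C) keys/values
        fused, _ = self.attn(q, kv, kv)
        fused = self.norm(fused + q)          # residual + layer norm
        return fused.transpose(1, 2).reshape(B, C, H, W)

# Illustrative usage: thermal images assumed replicated to 3 channels.
die, dse = ConvEncoder(3, 128), ConvEncoder(3, 128)
disc, fuse = DomainDiscriminator(128), CrossAttentionFusion(128)

thermal = torch.randn(2, 3, 64, 64)
rgb = torch.randn(2, 3, 64, 64)

inv, spec = die(thermal), dse(thermal)
fused = fuse(inv, spec)  # would feed a detection head downstream

# Adversarial alignment: discriminator sees DIE features of both domains.
logits = torch.cat([disc(die(rgb)), disc(inv)])
labels = torch.cat([torch.zeros(2, 1), torch.ones(2, 1)])  # 0=RGB, 1=thermal
domain_loss = nn.BCEWithLogitsLoss()(logits, labels)
```

In this sketch the detection loss on the fused features and the domain loss on the DIE features would be optimized jointly; the gradient-reversal coefficient trades off alignment strength against detection accuracy.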
This paper presents results of a study on the Convergence and Open Sharing System Project, supported by the Ministry of Education and the National Research Foundation of Korea.