Ajou University repository

Semantic Segmentation Using Pixel-Wise Adaptive Label Smoothing via Self-Knowledge Distillation for Limited Labeling Data
Citations (SCOPUS)
5

Publication Year
2022-04-01
Publisher
MDPI
Citation
Sensors, Vol.22
Keyword
limited training data; regularization; self-knowledge distillation; semantic segmentation
Mesh Keyword
Ground truth; Labelings; Limited training data; Overfitting; Performance; Regularisation; Regularization methods; Self-knowledge distillation; Semantic segmentation; Training data; Biological Phenomena; Humans; Image Processing, Computer-Assisted; Neural Networks, Computer; Semantics
All Science Classification Codes (ASJC)
Analytical Chemistry; Information Systems; Atomic and Molecular Physics, and Optics; Biochemistry; Instrumentation; Electrical and Electronic Engineering
Abstract
To achieve high performance, most deep convolutional neural networks (DCNNs) require a significant amount of training data with ground-truth labels. However, creating ground-truth labels for semantic segmentation requires more time, human effort, and cost than for tasks such as classification and object detection, because a label is required for every pixel in an image. Hence, in practice it is often necessary to train DCNNs for semantic segmentation using only a limited amount of training data. Training DCNNs with limited data is problematic, however, because overfitting to the training data easily degrades network accuracy. Here, we propose a new regularization method called pixel-wise adaptive label smoothing (PALS) via self-knowledge distillation to stably train semantic segmentation networks in the practical situation where only a limited amount of training data is available. To mitigate the problem caused by limited training data, our method fully exploits the internal statistics of pixels within an input image. Specifically, the proposed method generates a pixel-wise aggregated probability distribution using a similarity matrix that encodes the affinities between all pairs of pixels. To further increase accuracy, we combine one-hot encoded ground-truth distributions with these aggregated distributions to obtain our final soft labels. We demonstrate the effectiveness of our method on the Cityscapes and Pascal VOC2012 datasets using limited amounts of training data (10%, 30%, 50%, and 100%). Based on various quantitative and qualitative comparisons, our method yields more accurate results than previous methods.
Specifically, on the Cityscapes test set, our method achieved mIoU improvements of 0.076%, 1.848%, 1.137%, and 1.063% for 10%, 30%, 50%, and 100% training data, respectively, compared with a baseline trained with cross-entropy loss on one-hot encoded ground-truth labels.
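The soft-label construction described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name `pals_soft_labels`, the cosine-similarity affinity, and the hyperparameters `alpha` and `tau` are assumptions for the sketch; the paper's exact affinity and mixing formulation may differ.

```python
import numpy as np

def pals_soft_labels(features, probs, labels, num_classes, alpha=0.5, tau=0.1):
    """Sketch of pixel-wise adaptive label smoothing (PALS).

    features: (N, D) per-pixel feature vectors (N = H*W flattened pixels)
    probs:    (N, C) per-pixel predicted class probabilities
    labels:   (N,)   ground-truth class indices
    alpha, tau: illustrative hyperparameters (assumed, not from the paper)
    Returns (N, C) soft-label distributions.
    """
    # Similarity matrix encoding affinities between all pairs of pixels
    # (cosine similarity of features, turned into a row-stochastic matrix).
    f = features / (np.linalg.norm(features, axis=1, keepdims=True) + 1e-8)
    logits = f @ f.T / tau
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    sim = np.exp(logits)
    sim /= sim.sum(axis=1, keepdims=True)         # row-wise softmax

    # Pixel-wise aggregated probability distribution: each pixel's
    # distribution is a similarity-weighted mix of all pixels' predictions.
    aggregated = sim @ probs                      # (N, C)

    # Combine one-hot ground-truth distributions with the aggregated
    # distributions to obtain the final soft labels.
    one_hot = np.eye(num_classes)[labels]         # (N, C)
    return alpha * one_hot + (1.0 - alpha) * aggregated
```

Each returned row is a valid probability distribution (rows of `sim`, `probs`, and `one_hot` all sum to 1), so the result can be used directly as the target of a soft-label cross-entropy loss.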
ISSN
1424-8220
Language
eng
URI
https://dspace.ajou.ac.kr/dev/handle/2018.oak/32616
DOI
https://doi.org/10.3390/s22072623
Fulltext

Type
Article
Funding
This work was supported in part by the BK21 FOUR program of the National Research Foundation of Korea funded by the Ministry of Education (NRF5199991014091), and in part by the Ministry of Science and ICT (MSIT), South Korea, under the Information Technology Research Center (ITRC) Support Program supervised by the Institute for Information and Communications Technology Promotion (IITP) under Grant IITP-2021-2018-0-01424.

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.

Related Researcher

Heo, Yong Seok (허용석)
Department of Electrical and Computer Engineering

File Download

  • There are no files associated with this item.