Ajou University repository

Semantic Segmentation Using Pixel-Wise Adaptive Label Smoothing via Self-Knowledge Distillation for Limited Labeling Data
Citations (SCOPUS)
5

Publication Year
2022-04-01
Publisher
MDPI
Citation
Sensors, Vol.22
Keyword
limited training data; regularization; self-knowledge distillation; semantic segmentation
Mesh Keyword
Ground truth; Labelings; Limited training data; Overfitting; Performance; Regularisation; Regularization methods; Self-knowledge distillation; Semantic segmentation; Training data; Biological Phenomena; Humans; Image Processing, Computer-Assisted; Neural Networks, Computer; Semantics
All Science Classification Codes (ASJC)
Analytical Chemistry; Information Systems; Atomic and Molecular Physics, and Optics; Biochemistry; Instrumentation; Electrical and Electronic Engineering
Abstract
To achieve high performance, most deep convolutional neural networks (DCNNs) require a significant amount of training data with ground-truth labels. However, creating ground-truth labels for semantic segmentation requires more time, human effort, and cost than for tasks such as classification and object detection, because a label is required for every pixel in an image. Hence, in practice it is often necessary to train DCNNs for semantic segmentation using only a limited amount of training data. Training DCNNs with limited data is problematic, however, because overfitting to the training data easily degrades network accuracy. Here, we propose a new regularization method called pixel-wise adaptive label smoothing (PALS) via self-knowledge distillation to stably train semantic segmentation networks in the practical situation where only a limited amount of training data is available. To mitigate the problem caused by limited training data, our method fully exploits the internal statistics of pixels within an input image. Specifically, the proposed method generates a pixel-wise aggregated probability distribution using a similarity matrix that encodes the affinities between all pairs of pixels. To further increase accuracy, we combine one-hot encoded ground-truth distributions with these aggregated distributions to obtain our final soft labels. We demonstrate the effectiveness of our method on the Cityscapes and Pascal VOC2012 datasets using limited amounts of training data (10%, 30%, 50%, and 100%). Based on various quantitative and qualitative comparisons, our method yields more accurate results than previous methods.
Specifically, on the Cityscapes test set, our method achieved mIoU improvements of 0.076%, 1.848%, 1.137%, and 1.063% for 10%, 30%, 50%, and 100% training data, respectively, compared with a baseline trained with cross-entropy loss on one-hot encoded ground-truth labels.
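The soft-label construction described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name `pals_soft_labels`, the cosine-similarity affinity, and the hyperparameters `alpha` and `tau` are assumptions for the sketch; the paper's exact affinity and mixing formulation may differ.

```python
import numpy as np

def pals_soft_labels(features, probs, labels, num_classes, alpha=0.5, tau=0.1):
    """Sketch of pixel-wise adaptive label smoothing (PALS).

    features: (N, D) per-pixel feature vectors (N = H*W flattened pixels)
    probs:    (N, C) per-pixel predicted class probabilities
    labels:   (N,)   ground-truth class indices
    alpha, tau: illustrative hyperparameters (assumed, not from the paper)
    Returns (N, C) soft-label distributions.
    """
    # Similarity matrix encoding affinities between all pairs of pixels
    # (cosine similarity of features, turned into a row-stochastic matrix).
    f = features / (np.linalg.norm(features, axis=1, keepdims=True) + 1e-8)
    logits = f @ f.T / tau
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    sim = np.exp(logits)
    sim /= sim.sum(axis=1, keepdims=True)         # row-wise softmax

    # Pixel-wise aggregated probability distribution: each pixel's
    # distribution is a similarity-weighted mix of all pixels' predictions.
    aggregated = sim @ probs                      # (N, C)

    # Combine one-hot ground-truth distributions with the aggregated
    # distributions to obtain the final soft labels.
    one_hot = np.eye(num_classes)[labels]         # (N, C)
    return alpha * one_hot + (1.0 - alpha) * aggregated
```

Each returned row is a valid probability distribution (rows of `sim`, `probs`, and `one_hot` all sum to 1), so the result can be used directly as the target of a soft-label cross-entropy loss.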
ISSN
1424-8220
Language
eng
URI
https://dspace.ajou.ac.kr/dev/handle/2018.oak/32616
DOI
https://doi.org/10.3390/s22072623
Fulltext

Type
Article
Funding
This work was supported in part by the BK21 FOUR program of the National Research Foundation of Korea funded by the Ministry of Education (NRF5199991014091), and in part by the Ministry of Science and ICT (MSIT), South Korea, under the Information Technology Research Center (ITRC) Support Program supervised by the Institute for Information and Communications Technology Promotion (IITP) under Grant IITP-2021-2018-0-01424.

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.

Related Researcher

Heo, Yong Seok (허용석)
Department of Electrical and Computer Engineering

File Download

  • There are no files associated with this item.