Ajou University repository

Attentional decoder networks for chest X-ray image recognition on high-resolution features
Citations

SCOPUS: 2


Publication Year
2024-06-01
Publisher
Elsevier Ireland Ltd
Citation
Computer Methods and Programs in Biomedicine, Vol.251
Keyword
Attention; Fourier transform; Medical image recognition; Translation invariant; Upsampling
Mesh Keyword
Attention; Chest X-ray image; Encoder-decoder; Feature map; Harmonic magnitudes; High resolution; Lower resolution; Medical image recognition; Translation invariants; Upsampling; Algorithms; Humans; Image Processing, Computer-Assisted; Neural Networks, Computer; Radiography, Thoracic
All Science Classification Codes (ASJC)
Software; Computer Science Applications; Health Informatics
Abstract
Background and objective: This paper introduces an encoder–decoder-based attentional decoder network to recognize small-size lesions in chest X-ray images. In encoder-only networks, small-size lesions disappear during the down-sampling steps or become indistinguishable in the low-resolution feature maps. To address these issues, the proposed network processes images in an encoder–decoder architecture similar to the U-Net family and classifies lesions by globally pooling high-resolution feature maps. However, two obstacles prevent the U-Net family from being extended to classification: (1) the up-sampling procedure consumes considerable resources, and (2) high-resolution feature maps require an effective pooling approach.
Methods: The proposed network therefore employs a lightweight attentional decoder and a harmonic magnitude transform. The attentional decoder up-samples the given features using the low-resolution features as the key and value and the high-resolution features as the query. Because the multi-scale features interact, the up-sampled features embody global context at high resolution while maintaining pathological locality. In addition, the harmonic magnitude transform is devised for pooling high-resolution feature maps in the frequency domain. We borrow the shift theorem of the Fourier transform to preserve the translation-invariant property and further reduce the parameters of the pooling layer with an efficient embedding strategy.
Results: The proposed network achieves state-of-the-art classification performance on three public chest X-ray datasets: NIH, CheXpert, and MIMIC-CXR.
Conclusions: The proposed efficient encoder–decoder network recognizes small-size lesions well in chest X-ray images by efficiently up-sampling feature maps through an attentional decoder and processing high-resolution feature maps with the harmonic magnitude transform. We open-source our implementation at https://github.com/Lab-LVM/ADNet.
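The two ideas named in the abstract can be illustrated with a minimal, standard-library Python sketch. It is not the ADNet implementation (see the repository above for that) and makes several simplifying assumptions: features are short 1-D lists rather than 2-D feature maps, the attention has no learned projections or multiple heads, and the function names `cross_attention`, `dft_magnitudes`, and `circular_shift` are hypothetical. The first function shows attention with high-resolution features as queries and low-resolution features as keys and values, so the output keeps the query resolution. The second shows the consequence of the Fourier shift theorem that the abstract relies on: a circular shift changes only the phase of each frequency component, so the magnitudes stay the same.

```python
import cmath
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention without learned projections (a simplification).

    queries: high-resolution features, one vector per position.
    keys, values: low-resolution features, one vector per position.
    Returns one output vector per query, so the result has the query resolution.
    """
    scale = math.sqrt(len(keys[0]))
    outputs = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        outputs.append([
            sum(w * v[d] for w, v in zip(weights, values))
            for d in range(len(values[0]))
        ])
    return outputs


def dft_magnitudes(signal):
    """Magnitudes |X[k]| of the discrete Fourier transform of a real 1-D signal."""
    n = len(signal)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(signal)))
        for k in range(n)
    ]


def circular_shift(signal, offset):
    """Shift a list to the right by `offset` positions, wrapping around."""
    offset %= len(signal)
    return signal[-offset:] + signal[:-offset]


if __name__ == "__main__":
    # Four high-resolution positions attend over two low-resolution positions.
    high_res = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
    low_res = [[1.0, 0.0], [0.0, 1.0]]
    upsampled = cross_attention(high_res, low_res, low_res)
    print("positions after attention:", len(upsampled))

    # The same pattern at two different locations has identical DFT magnitudes.
    pattern = [0.0, 1.0, 3.0, 0.0, 0.0, 2.0, 0.0, 0.0]
    moved = circular_shift(pattern, 3)
    gap = max(abs(a - b)
              for a, b in zip(dft_magnitudes(pattern), dft_magnitudes(moved)))
    print("largest magnitude difference:", gap)
```

The invariance demonstrated here holds exactly only for circular shifts. How the published harmonic magnitude transform embeds and pools the magnitudes is not described in the abstract, so that step is not sketched here.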
Language
eng
URI
https://dspace.ajou.ac.kr/dev/handle/2018.oak/34185
DOI
https://doi.org/10.1016/j.cmpb.2024.108198

Type
Article
Funding
This work was supported in part by an Institute of Information and Communications Technology Planning and Evaluation (IITP) grant funded by the Korea Government (MSIT) (Artificial Intelligence Innovation Hub) under Grant 2021-0-02068, under the Artificial Intelligence Convergence Innovation Human Resources Development (RS-2023-00255968), by a grant of the Korea Health Industry Development Institute (KHIDI), funded by the Ministry of Health & Welfare, Republic of Korea (HI22C0471), and by the Korea Health Technology R&D Project (KHIDI) funded by the MOHW under Grant RS-2023-00266038.

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.

Related Researcher

Ryu, Jongbin (유종빈)
Department of Software and Computer Engineering
