Attentional decoder networks for chest X-ray image recognition on high-resolution features

Kang, Hankyul; Kim, Namkug; Ryu, Jongbin

Publication Date: 2024-06-01

Publisher: Elsevier Ireland Ltd

Citation: Computer Methods and Programs in Biomedicine, Vol.251

Keyword: Attention; Fourier transform; Medical image recognition; Translation invariant; Upsampling

Mesh Keyword: Attention; Chest X-ray image; Encoder-decoder; Feature map; Harmonic magnitudes; High resolution; Lower resolution; Medical image recognition; Translation invariants; Upsampling; Algorithms; Humans; Image Processing, Computer-Assisted; Neural Networks, Computer; Radiography, Thoracic

All Science Classification Codes (ASJC): Software; Computer Science Applications; Health Informatics

Abstract: Background and objective: This paper introduces an encoder–decoder-based attentional decoder network to recognize small lesions in chest X-ray images. In an encoder-only network, small lesions disappear during the down-sampling steps or become indistinguishable in the low-resolution feature maps. To address these issues, the proposed network processes images in an encoder–decoder architecture similar to the U-Net family and classifies lesions by globally pooling high-resolution feature maps. However, two obstacles prevent U-Net-style networks from being extended to classification: (1) the up-sampling procedure consumes considerable resources, and (2) an effective pooling approach for the high-resolution feature maps is needed. Methods: The proposed network employs a lightweight attentional decoder and a harmonic magnitude transform. The attentional decoder up-samples the given features using the low-resolution features as the key and value and the high-resolution features as the query. Because the multi-scale features interact, the up-sampled features embody global context at a high resolution while maintaining pathological locality. In addition, the harmonic magnitude transform pools high-resolution feature maps in the frequency domain. We borrow the shift theorem of the Fourier transform to preserve translation invariance and further reduce the parameters of the pooling layer with an efficient embedding strategy. Results: The proposed network achieves state-of-the-art classification performance on three public chest X-ray datasets: NIH, CheXpert, and MIMIC-CXR. Conclusions: The proposed efficient encoder–decoder network recognizes small lesions well in chest X-ray images by efficiently up-sampling feature maps through an attentional decoder and processing high-resolution feature maps with the harmonic magnitude transform.
We open-source our implementation at https://github.com/Lab-LVM/ADNet.
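
Illustrative sketch (not the authors' implementation; see the repository above for that): the two ideas named in the abstract can be shown in a few lines of NumPy. The function names `cross_attention_upsample` and `fourier_magnitude` are hypothetical, the attention has no learned projections, and the feature sizes are arbitrary assumptions. The first function uses high-resolution features as queries and low-resolution features as keys and values; the second takes the magnitude of a 2-D discrete Fourier transform, which by the shift theorem is unchanged under circular translation of the feature map.

```python
import numpy as np

def cross_attention_upsample(high_res, low_res):
    """Scaled dot-product attention with high-res queries.

    high_res: (N_hi, d) array used as queries.
    low_res:  (N_lo, d) array used as both keys and values.
    Returns the attended features (N_hi, d) and the weights (N_hi, N_lo).
    Assumption: no learned projections, single head.
    """
    d = high_res.shape[-1]
    scores = high_res @ low_res.T / np.sqrt(d)
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ low_res, weights

def fourier_magnitude(feature_map):
    """Magnitude of the 2-D DFT over the spatial axes of an (H, W, C) map."""
    return np.abs(np.fft.fft2(feature_map, axes=(0, 1)))

rng = np.random.default_rng(0)
high = rng.standard_normal((16 * 16, 8))  # hypothetical 16x16 map, 8 channels
low = rng.standard_normal((4 * 4, 8))     # hypothetical 4x4 map, 8 channels
upsampled, weights = cross_attention_upsample(high, low)

# A circular shift changes only the phase of the DFT, not its magnitude.
fmap = upsampled.reshape(16, 16, 8)
shifted = np.roll(fmap, shift=(3, 5), axis=(0, 1))
same = np.allclose(fourier_magnitude(fmap), fourier_magnitude(shifted))
```

Note that the invariance shown here holds exactly only for circular shifts; how the paper embeds and pools the magnitudes is not specified in this record.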

Language: eng

URI: https://dspace.ajou.ac.kr/dev/handle/2018.oak/34185

DOI: https://doi.org/10.1016/j.cmpb.2024.108198

Type: Article

Funding: This work was supported in part by an Institute of Information and Communications Technology Planning and Evaluation (IITP) grant funded by the Korea Government (MSIT) (Artificial Intelligence Innovation Hub) under Grant 2021-0-02068, under the Artificial Intelligence Convergence Innovation Human Resources Development (RS-2023-00255968), a grant of the Korea Health Industry Development Institute (KHIDI), funded by the Ministry of Health & Welfare, Republic of Korea (HI22C0471), and the Korea Health Technology R&D Project (KHIDI) funded by the MOHW under Grant RS-2023-00266038.

Related Researcher

Ryu, Jongbin (유종빈): Department of Software and Computer Engineering
