Ajou University repository

Blind Face Restoration Using Swin Transformer with Global Semantic Token Regularization
  • 신창종
Citations (SCOPUS)
0

Advisor
허용석
Affiliation
Graduate School, Ajou University
Department
Department of Artificial Intelligence, Graduate School
Publication Year
2023-02
Publisher
The Graduate School, Ajou University
Keyword
Blind face restoration
Description
Master's thesis -- Department of Artificial Intelligence, Graduate School, Ajou University, February 2023
Alternative Abstract
In this thesis, we propose a framework for blind face restoration, which recovers a high-quality face image from unknown degradations. Previous methods have shown that a Vector Quantization (VQ) codebook can serve as a powerful prior for blind face restoration. However, it is still challenging to predict code vectors from low-quality images. To address this, we propose a multi-scale transformer consisting of multi-scale cross-attention (MSCA) blocks. The multi-scale transformer compensates for the information lost in high-level features by globally fusing low-level and high-level features with different spatial resolutions.

There is also a trade-off between the pixel-wise fidelity and the visual quality of the results. To improve fidelity, we employ shifted-window cross-attention modules at multiple scales. However, the shifted-window method cannot compute inter-window attention, and therefore cannot model the rich global context of a face. To solve this problem, we propose a shifted-window token cross-attention module (SW-TCAFM) with a global class token that models the global facial context. The global class token aggregates information across all windows and passes it to the next stage. In addition, we propose a semantic token regularization loss that encourages each global class token to represent a specific facial component by exploiting a face parsing map prior.

Our framework achieves superior performance in both quality and fidelity compared to state-of-the-art methods. In our experiments, the PSNR and FID of our framework are 3.21% and 2.92% better, respectively, than those of the state-of-the-art method.
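The global-class-token idea in the abstract can be pictured with a short sketch. The following is a minimal PyTorch-style illustration, not the thesis code: the module and parameter names (GlobalTokenWindowAttention, window_size, num_heads) are assumptions, and the actual SW-TCAFM very likely differs in how it fuses features, shifts windows, and propagates the token. The sketch only shows the core mechanism described above: a shared class token is prepended to every window before cross-attention, and the per-window copies are aggregated afterward to model the global facial context.

import torch
import torch.nn as nn

class GlobalTokenWindowAttention(nn.Module):
    # Hypothetical sketch of windowed cross-attention with one global class token.
    def __init__(self, dim, window_size=8, num_heads=4):
        super().__init__()
        self.window_size = window_size
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        # One learnable global class token shared across all windows.
        self.cls_token = nn.Parameter(torch.zeros(1, 1, dim))

    def forward(self, q_feat, kv_feat):
        # q_feat, kv_feat: (B, H, W, C) query / key-value feature maps,
        # with H and W assumed divisible by the window size.
        B, H, W, C = q_feat.shape
        ws = self.window_size

        def window_partition(x):
            # Split a (B, H, W, C) map into non-overlapping ws x ws windows.
            x = x.view(B, H // ws, ws, W // ws, ws, C)
            return x.permute(0, 1, 3, 2, 4, 5).reshape(-1, ws * ws, C)

        q_win, kv_win = window_partition(q_feat), window_partition(kv_feat)
        # Prepend the same global token to every window so that each window
        # attends to (and updates) its own copy of the token.
        cls = self.cls_token.expand(q_win.shape[0], -1, -1)
        q_win = torch.cat([cls, q_win], dim=1)
        kv_win = torch.cat([cls, kv_win], dim=1)
        out, _ = self.attn(q_win, kv_win, kv_win)
        # Aggregate the per-window class tokens (mean here) into a single
        # global context vector that can be passed to the next stage.
        global_ctx = out[:, 0].view(B, -1, C).mean(dim=1)
        # Reassemble the remaining window tokens back into a feature map.
        out = out[:, 1:].view(B, H // ws, W // ws, ws, ws, C)
        out = out.permute(0, 1, 3, 2, 4, 5).reshape(B, H, W, C)
        return out, global_ctx

Aggregating the per-window class tokens is what lets the module exchange information between windows that window-local attention never connects; the semantic token regularization loss mentioned in the abstract would then tie each such token to a specific face component using a parsing map, which is not shown in this sketch.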
Language
eng
URI
https://dspace.ajou.ac.kr/handle/2018.oak/24490
Fulltext

Type
Thesis


File Download

  • There are no files associated with this item.