Ajou University repository

Text-free diffusion inpainting using reference images for enhanced visual fidelity
Citations (SCOPUS): 0

dc.contributor.author: Kim, Beomjo
dc.contributor.author: Sohn, Kyung Ah
dc.date.issued: 2024-10-01
dc.identifier.issn: 0167-8655
dc.identifier.uri: https://dspace.ajou.ac.kr/dev/handle/2018.oak/34536
dc.description.abstract: This paper presents a novel approach to subject-driven image generation that addresses the limitations of traditional text-to-image diffusion models. Our method generates images using reference images without relying on language-based prompts. We introduce a visual detail preserving module that captures intricate details and textures, addressing overfitting issues associated with limited training samples. The model's performance is further enhanced through a modified classifier-free guidance technique and feature concatenation, enabling the natural positioning and harmonization of subjects within diverse scenes. Quantitative assessments using CLIP, DINO, and Quality scores (QS), along with a user study, demonstrate the superior quality of our generated images. Our work highlights the potential of pre-trained models and visual patch embeddings in subject-driven editing, balancing diversity and fidelity in image generation tasks. Our implementation is available at https://github.com/8eomio/Subject-Inpainting.
dc.description.sponsorship: This research was supported by the MSIT (Ministry of Science and ICT), Korea, under the ITRC (Information Technology Research Center) support program (IITP-2024-2021-0-02051), the Artificial Intelligence Convergence Innovation Human Resources Development (IITP-2024-RS2023-00255968) grant, and Grant RS-2021-II212068 (Artificial Intelligence Innovation Hub), supervised by the Institute for Information & Communications Technology Planning & Evaluation (IITP), and also by the National Research Foundation of Korea (NRF) grant (No. NRF2022R1A2C1007434).
dc.language.iso: eng
dc.publisher: Elsevier B.V.
dc.subject.mesh: Diffusion model
dc.subject.mesh: Free diffusion
dc.subject.mesh: Image diffusion
dc.subject.mesh: Image generations
dc.subject.mesh: Image Inpainting
dc.subject.mesh: Image manipulation
dc.subject.mesh: Inpainting
dc.subject.mesh: Reference image
dc.subject.mesh: Subject-driven generation
dc.subject.mesh: Visual fidelity
dc.title: Text-free diffusion inpainting using reference images for enhanced visual fidelity
dc.type: Article
dc.citation.endPage: 228
dc.citation.startPage: 221
dc.citation.title: Pattern Recognition Letters
dc.citation.volume: 186
dc.identifier.bibliographicCitation: Pattern Recognition Letters, Vol.186, pp.221-228
dc.identifier.doi: 10.1016/j.patrec.2024.10.009
dc.identifier.scopusid: 2-s2.0-85207069794
dc.identifier.url: https://www.sciencedirect.com/science/journal/01678655
dc.subject.keyword: Diffusion models
dc.subject.keyword: Image generation
dc.subject.keyword: Image inpainting
dc.subject.keyword: Image manipulation
dc.subject.keyword: Subject-driven generation
dc.description.isoa: true
dc.subject.subarea: Software
dc.subject.subarea: Signal Processing
dc.subject.subarea: Computer Vision and Pattern Recognition
dc.subject.subarea: Artificial Intelligence
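The abstract mentions a modified classifier-free guidance technique. The record does not describe the authors' modification, but as a hedged illustration, standard classifier-free guidance extrapolates from an unconditional noise prediction toward a conditional one at each denoising step; the function and array values below are purely illustrative, not the paper's implementation:

```python
import numpy as np

def classifier_free_guidance(eps_uncond, eps_cond, guidance_scale):
    """Standard classifier-free guidance: push the noise estimate
    away from the unconditional prediction toward the conditional one
    by a factor of guidance_scale (sketch only; the paper's modified
    variant is not detailed in this record)."""
    return eps_uncond + guidance_scale * (eps_cond - eps_uncond)

# Toy example with dummy noise predictions (illustrative only).
eps_u = np.zeros((1, 4, 4))   # unconditional prediction
eps_c = np.ones((1, 4, 4))    # reference-conditioned prediction
guided = classifier_free_guidance(eps_u, eps_c, guidance_scale=3.0)
print(guided[0, 0, 0])  # 3.0
```

With guidance_scale > 1, the conditioning signal (here, reference-image features rather than a text prompt) is amplified, trading sample diversity for fidelity to the condition.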

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.

Related Researcher

Sohn, Kyung-Ah (손경아)
Department of Software and Computer Engineering

File Download

  • There are no files associated with this item.