Citation Export
DC Field | Value | Language |
--- | --- | --- |
dc.contributor.author | Kim, Beomjo | - |
dc.contributor.author | Sohn, Kyung Ah | - |
dc.date.issued | 2024-10-01 | - |
dc.identifier.issn | 0167-8655 | - |
dc.identifier.uri | https://dspace.ajou.ac.kr/dev/handle/2018.oak/34536 | - |
dc.description.abstract | This paper presents a novel approach to subject-driven image generation that addresses the limitations of traditional text-to-image diffusion models. Our method generates images using reference images without relying on language-based prompts. We introduce a visual detail preserving module that captures intricate details and textures, addressing overfitting issues associated with limited training samples. The model's performance is further enhanced through a modified classifier-free guidance technique and feature concatenation, enabling the natural positioning and harmonization of subjects within diverse scenes. Quantitative assessments using CLIP, DINO, and Quality scores (QS), along with a user study, demonstrate the superior quality of our generated images. Our work highlights the potential of pre-trained models and visual patch embeddings in subject-driven editing, balancing diversity and fidelity in image generation tasks. Our implementation is available at https://github.com/8eomio/Subject-Inpainting. | - |
dc.description.sponsorship | This research was supported by the MSIT (Ministry of Science and ICT), Korea, under the ITRC (Information Technology Research Center) support program (IITP-2024-2021-0-02051), the Artificial Intelligence Convergence Innovation Human Resources Development (IITP-2024-RS2023-00255968) grant, and Grant RS-2021-II212068 (Artificial Intelligence Innovation Hub), supervised by the Institute for Information & Communications Technology Planning & Evaluation (IITP), and also by the National Research Foundation of Korea (NRF) grant (No. NRF2022R1A2C1007434). | - |
dc.language.iso | eng | - |
dc.publisher | Elsevier B.V. | - |
dc.subject.mesh | Diffusion model | - |
dc.subject.mesh | Free diffusion | - |
dc.subject.mesh | Image diffusion | - |
dc.subject.mesh | Image generations | - |
dc.subject.mesh | Image Inpainting | - |
dc.subject.mesh | Image manipulation | - |
dc.subject.mesh | Inpainting | - |
dc.subject.mesh | Reference image | - |
dc.subject.mesh | Subject-driven generation | - |
dc.subject.mesh | Visual fidelity | - |
dc.title | Text-free diffusion inpainting using reference images for enhanced visual fidelity | - |
dc.type | Article | - |
dc.citation.endPage | 228 | - |
dc.citation.startPage | 221 | - |
dc.citation.title | Pattern Recognition Letters | - |
dc.citation.volume | 186 | - |
dc.identifier.bibliographicCitation | Pattern Recognition Letters, Vol.186, pp.221-228 | - |
dc.identifier.doi | 10.1016/j.patrec.2024.10.009 | - |
dc.identifier.scopusid | 2-s2.0-85207069794 | - |
dc.identifier.url | https://www.sciencedirect.com/science/journal/01678655 | - |
dc.subject.keyword | Diffusion models | - |
dc.subject.keyword | Image generation | - |
dc.subject.keyword | Image inpainting | - |
dc.subject.keyword | Image manipulation | - |
dc.subject.keyword | Subject-driven generation | - |
dc.description.isoa | true | - |
dc.subject.subarea | Software | - |
dc.subject.subarea | Signal Processing | - |
dc.subject.subarea | Computer Vision and Pattern Recognition | - |
dc.subject.subarea | Artificial Intelligence | - |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.