Citation Export
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Bu, Xiao Qing | - |
dc.contributor.author | Sun, Yu Kuan | - |
dc.contributor.author | Wang, Jian Ming | - |
dc.contributor.author | Liu, Kun Liang | - |
dc.contributor.author | Liang, Jia Yu | - |
dc.contributor.author | Jin, Guang Hao | - |
dc.contributor.author | Chung, Tae Sun | - |
dc.date.issued | 2021-09-17 | - |
dc.identifier.uri | https://dspace.ajou.ac.kr/dev/handle/2018.oak/31611 | - |
dc.description.abstract | With the aid of one manually annotated frame, One-Shot Video Object Segmentation (OSVOS) uses a CNN architecture to tackle the problem of semi-supervised video object segmentation (VOS). However, annotating a pixel-level segmentation mask is expensive and time-consuming. To alleviate this problem, we explore a language-interactive way of initializing semi-supervised VOS and run semi-supervised methods in a weakly supervised mode. Our contributions are twofold: (i) we propose a variant of OSVOS initialized with referring expressions (REVOS), which locates the target object by maximizing the matching score between each candidate and the referring expression; (ii) since the segmentation performance of semi-supervised VOS methods varies dramatically depending on which frame is selected for annotation, we present a strategy for selecting the best annotation frame using an image similarity measurement. We are also the first to propose a multiple-frame annotation selection strategy for initializing semi-supervised VOS with more than one annotated frame. Finally, we evaluate our method on the DAVIS-2016 dataset, and experimental results show that REVOS achieves performance similar to OSVOS (79.94% vs. 80.1% average IoU). Although the current REVOS implementation is specific to the one-shot video object segmentation method, it can be applied more widely to other semi-supervised VOS methods. | - |
dc.description.sponsorship | This work was supported by The Tianjin Science and Technology Program under grant 19PTZWHZ00020 and the National Natural Science Foundation of China under grant 61902281. | - |
dc.description.sponsorship | This study is funded by The Tianjin Science and Technology Program(19PTZWHZ00020) and National Natural Science Foundation of China (Grant No. 61902281). | - |
dc.language.iso | eng | - |
dc.publisher | Elsevier B.V. | - |
dc.subject.mesh | Frame selection | - |
dc.subject.mesh | Image similarity | - |
dc.subject.mesh | Interactive way | - |
dc.subject.mesh | Referring expressions | - |
dc.subject.mesh | Segmentation masks | - |
dc.subject.mesh | Segmentation performance | - |
dc.subject.mesh | Semi-supervised method | - |
dc.subject.mesh | Video-object segmentation | - |
dc.title | Weakly supervised video object segmentation initialized with referring expression | - |
dc.type | Article | - |
dc.citation.endPage | 765 | - |
dc.citation.startPage | 754 | - |
dc.citation.title | Neurocomputing | - |
dc.citation.volume | 453 | - |
dc.identifier.bibliographicCitation | Neurocomputing, Vol.453, pp.754-765 | - |
dc.identifier.doi | 10.1016/j.neucom.2020.06.129 | - |
dc.identifier.scopusid | 2-s2.0-85092720316 | - |
dc.identifier.url | www.elsevier.com/locate/neucom | - |
dc.subject.keyword | Natural Language Processing | - |
dc.subject.keyword | Referring Expression | - |
dc.subject.keyword | Video Object Segmentation | - |
dc.description.isoa | false | - |
dc.subject.subarea | Computer Science Applications | - |
dc.subject.subarea | Cognitive Neuroscience | - |
dc.subject.subarea | Artificial Intelligence | - |
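
The abstract describes two algorithmic ideas: choosing the target object by maximizing a matching score against the referring expression, and choosing which frame(s) to annotate via image similarity. The Python snippet below is a minimal illustrative sketch of those ideas under assumed details (a colour-histogram descriptor, cosine similarity, and a generic `matcher` callable are stand-ins introduced here); it is not the authors' REVOS implementation, whose matching model and similarity measurement may differ.

```python
# Illustrative sketch only -- not the authors' REVOS code. It shows two ideas
# from the abstract: (i) picking the candidate object that best matches a
# referring expression, and (ii) picking annotation frames by image similarity.
# The colour-histogram descriptor and the `matcher` callable are assumptions.
import numpy as np


def select_target(candidates, expression, matcher):
    """Return the candidate (e.g. a mask or box) with the highest matching
    score against the referring expression, per contribution (i)."""
    scores = [matcher(c, expression) for c in candidates]
    return candidates[int(np.argmax(scores))]


def frame_descriptor(frame: np.ndarray, bins: int = 16) -> np.ndarray:
    """Simple per-channel colour histogram, L1-normalised (a stand-in feature)."""
    hists = [np.histogram(frame[..., c], bins=bins, range=(0, 255))[0]
             for c in range(frame.shape[-1])]
    feat = np.concatenate(hists).astype(np.float64)
    return feat / (feat.sum() + 1e-8)


def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))


def select_annotation_frames(frames, k: int = 1):
    """Return indices of the k frames most similar, on average, to the rest of
    the video -- the intuition behind annotating a 'representative' frame,
    per contribution (ii)."""
    feats = [frame_descriptor(f) for f in frames]
    scores = [np.mean([cosine(fi, fj) for j, fj in enumerate(feats) if j != i])
              for i, fi in enumerate(feats)]
    return list(np.argsort(scores)[::-1][:k])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    video = [rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)
             for _ in range(10)]
    print(select_annotation_frames(video, k=2))
```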