Ajou University repository

Weakly supervised video object segmentation initialized with referring expression
  • Bu, Xiao Qing
  • Sun, Yu Kuan
  • Wang, Jian Ming
  • Liu, Kun Liang
  • Liang, Jia Yu
  • Jin, Guang Hao
  • Chung, Tae Sun
Citations (SCOPUS): 3

DC Field | Value | Language
dc.contributor.author | Bu, Xiao Qing | -
dc.contributor.author | Sun, Yu Kuan | -
dc.contributor.author | Wang, Jian Ming | -
dc.contributor.author | Liu, Kun Liang | -
dc.contributor.author | Liang, Jia Yu | -
dc.contributor.author | Jin, Guang Hao | -
dc.contributor.author | Chung, Tae Sun | -
dc.date.issued | 2021-09-17 | -
dc.identifier.uri | https://dspace.ajou.ac.kr/dev/handle/2018.oak/31611 | -
dc.description.abstract | With the aid of one manually annotated frame, One-Shot Video Object Segmentation (OSVOS) uses a CNN architecture to tackle the problem of semi-supervised video object segmentation (VOS). However, annotating a pixel-level segmentation mask is expensive and time-consuming. To alleviate this problem, we explore a language-interactive way of initializing semi-supervised VOS, which lets semi-supervised methods run in a weakly supervised mode. Our contributions are twofold: (i) we propose a variant of OSVOS initialized with referring expressions (REVOS), which locates a target object by maximizing the matching score between all the candidates and the referring expression; (ii) because the segmentation performance of semi-supervised VOS methods varies dramatically depending on which frame is selected for annotation, we present a strategy for selecting the best annotation frame using image similarity measurement. We are also the first to propose a multiple-frame annotation selection strategy for initializing semi-supervised VOS with more than one annotated frame. Finally, we evaluate our method on the DAVIS-2016 dataset, and experimental results show that REVOS achieves performance similar to OSVOS (79.94% versus 80.1%, measured by average IoU). Although the current REVOS implementation is specific to one-shot video object segmentation, it can be applied more widely to other semi-supervised VOS methods. | -
dc.description.sponsorship | This work was supported by The Tianjin Science and Technology Program under grant 19PTZWHZ00020 and the National Natural Science Foundation of China under grant 61902281. | -
dc.description.sponsorship | This study is funded by The Tianjin Science and Technology Program (19PTZWHZ00020) and the National Natural Science Foundation of China (Grant No. 61902281). | -
dc.language.iso | eng | -
dc.publisher | Elsevier B.V. | -
dc.subject.mesh | Frame selection | -
dc.subject.mesh | Image similarity | -
dc.subject.mesh | Interactive way | -
dc.subject.mesh | Referring expressions | -
dc.subject.mesh | Segmentation masks | -
dc.subject.mesh | Segmentation performance | -
dc.subject.mesh | Semi-supervised method | -
dc.subject.mesh | Video-object segmentation | -
dc.title | Weakly supervised video object segmentation initialized with referring expression | -
dc.type | Article | -
dc.citation.endPage | 765 | -
dc.citation.startPage | 754 | -
dc.citation.title | Neurocomputing | -
dc.citation.volume | 453 | -
dc.identifier.bibliographicCitation | Neurocomputing, Vol.453, pp.754-765 | -
dc.identifier.doi | 10.1016/j.neucom.2020.06.129 | -
dc.identifier.scopusid | 2-s2.0-85092720316 | -
dc.identifier.url | www.elsevier.com/locate/neucom | -
dc.subject.keyword | Natural Language Processing | -
dc.subject.keyword | Referring Expression | -
dc.subject.keyword | Video Object Segmentation | -
dc.description.isoa | false | -
dc.subject.subarea | Computer Science Applications | -
dc.subject.subarea | Cognitive Neuroscience | -
dc.subject.subarea | Artificial Intelligence | -
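
The abstract mentions selecting the best annotation frame by image similarity measurement but does not say which measure or selection rule is used. The following is only an illustrative sketch under stated assumptions, not the authors' method: it assumes frames are flattened lists of 8-bit grayscale pixels, uses cosine similarity between intensity histograms as the similarity measure, and picks the frame with the highest mean similarity to all other frames. The names `histogram`, `cosine_similarity`, and `select_annotation_frame` are hypothetical.

```python
import math
from typing import List, Sequence


def histogram(frame: Sequence[int], bins: int = 8, max_value: int = 255) -> List[float]:
    """Normalized intensity histogram of a flattened grayscale frame."""
    if not frame:
        raise ValueError("frame must contain at least one pixel")
    counts = [0] * bins
    for pixel in frame:
        # Map the pixel value to a bin index, clamping to the last bin.
        index = min(pixel * bins // (max_value + 1), bins - 1)
        counts[index] += 1
    return [count / len(frame) for count in counts]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def select_annotation_frame(frames: Sequence[Sequence[int]]) -> int:
    """Index of the frame most similar, on average, to every other frame.

    ASSUMPTION: "most representative frame" is our stand-in selection rule;
    the source abstract does not specify the actual criterion.
    """
    if not frames:
        raise ValueError("frames must not be empty")
    descriptors = [histogram(frame) for frame in frames]
    best_index, best_score = 0, -1.0
    for i, descriptor in enumerate(descriptors):
        others = [
            cosine_similarity(descriptor, other)
            for j, other in enumerate(descriptors)
            if j != i
        ]
        score = sum(others) / len(others) if others else 0.0
        if score > best_score:
            best_index, best_score = i, score
    return best_index


# Toy video: an all-dark frame, a half-dark/half-bright frame, an all-bright frame.
toy_frames = [[0] * 16, [0] * 8 + [255] * 8, [255] * 16]
selected = select_annotation_frame(toy_frames)
```

In this toy input the middle frame shares content with both of the others, so it is the one this rule selects. A real pipeline would likely swap the histogram descriptor for a stronger similarity measure.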

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.

Related Researcher

Chung, Tae-Sun (정태선)
Department of Software and Computer Engineering

File Download

  • There are no files associated with this item.