A Coarse-to-Fine Framework for Automatic Video Unscreen
IEEE Transactions on Multimedia (TMM) 2022


Person Replacement


Video unscreen, a technique for extracting the foreground from given videos, plays an important role in today's video production pipelines. Existing systems for this purpose, which mainly rely on video segmentation or video matting, either suffer from quality deficiencies or require tedious manual annotation. In this work, we aim to develop a fully automatic video unscreen framework that achieves high-quality foreground extraction in a controlled environment without human intervention.

Inspired by the alpha compositing equation, our framework adopts a coarse-to-fine strategy: a background estimate obtained from an initial mask prediction in turn helps refine the mask. We conducted experiments on two datasets: 1) Adobe's Synthetic-Composite dataset, and 2) DramaStudio, our newly collected large-scale green-screen video matting dataset, which represents controlled environments. The results show that the proposed framework outperforms existing algorithms and commercial software, both quantitatively and qualitatively. We also demonstrate its utility for person replacement in videos, which can further support a variety of video editing applications.
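For readers unfamiliar with the compositing equation that motivates this design, the sketch below illustrates it with NumPy. It is not the framework's implementation, which the text above does not spell out. It only shows the standard relation I = alpha * F + (1 - alpha) * B, why knowing the background B constrains alpha, and how an extracted foreground can be placed over a new background for person replacement. The function names (`composite`, `alpha_from_known_layers`), the toy colours, and the assumption that the foreground colour F is known are all hypothetical choices made for illustration.

```python
import numpy as np


def composite(alpha, fg, bg):
    """Standard alpha compositing: I = alpha * F + (1 - alpha) * B."""
    a = alpha[..., None]  # broadcast the (H, W) matte over the colour channels
    return a * fg + (1.0 - a) * bg


def alpha_from_known_layers(image, fg, bg, eps=1e-6):
    """Per-pixel least-squares alpha when both F and B are assumed known.

    Illustrative only: it shows why a background estimate constrains the
    matte. It is not the refinement step used by the framework.
    """
    d = fg - bg
    num = np.sum((image - bg) * d, axis=-1)
    den = np.sum(d * d, axis=-1)
    return np.clip(num / (den + eps), 0.0, 1.0)


# Toy data (all values are made up for the example).
rng = np.random.default_rng(0)
h, w = 4, 4
fg = np.full((h, w, 3), [0.8, 0.6, 0.5])  # flat "skin-like" foreground colour
bg = np.full((h, w, 3), [0.0, 1.0, 0.0])  # green-screen background
alpha = rng.uniform(0.0, 1.0, (h, w))     # ground-truth matte

image = composite(alpha, fg, bg)          # observed frame
alpha_rec = alpha_from_known_layers(image, fg, bg)

# Person replacement: reuse the recovered matte over a new background.
new_bg = np.full((h, w, 3), [0.2, 0.2, 0.9])
replaced = composite(alpha_rec, fg, new_bg)
```

In this toy setting the recovered matte matches the ground truth up to the small `eps` regulariser. In real footage F is unknown and B is only an estimate, which is why a learned refinement stage is needed.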



The pipeline of the proposed automatic video unscreen system. The coarse prediction has semantic information but the boundary is not perfect. The prediction with background information provides fine-grained boundary information but is noisy. Integrating them produces a better result. The detailed comparison is shown at the bottom.



The website template was borrowed from Michaël Gharbi.