We study the problem of video-to-video synthesis, whose goal is to learn a mapping function from an input source video (e.g., a sequence of semantic segmentation masks) to an output photorealistic video that precisely depicts the content of the source video. While its image counterpart, the image-to-image synthesis problem, is a popular topic, the video-to-video synthesis problem is less explored in the literature.
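
As a rough formalization (the notation below is mine, not copied from the paper), given a length-$T$ source sequence $x_1^T = \{x_1, \dots, x_T\}$, the goal is a mapping $F$ whose output video is distributed like real videos conditioned on the same source:

```latex
\tilde{y}_1^{T} = F\!\left(x_1^{T}\right),
\qquad
p\!\left(\tilde{y}_1^{T} \mid x_1^{T}\right) \approx p\!\left(y_1^{T} \mid x_1^{T}\right)
```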

Without understanding temporal dynamics, directly applying existing image synthesis approaches to an input video often results in temporally incoherent videos of low visual quality. In this paper, we propose a novel video-to-video synthesis approach under the generative adversarial learning framework.
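
To make the temporal dependence concrete, here is a minimal sketch of frame-by-frame conditional generation in the spirit of the paper's Markov-style formulation; all class names, architectures, and shapes are illustrative stand-ins, not the actual vid2vid code.

```python
# Minimal sketch of sequential, temporally conditioned generation in PyTorch,
# assuming a Markov-style generator y_t = G(x_t, x_{t-1}, y_{t-1}).
# Everything here is a toy stand-in, not the vid2vid implementation.
import torch
import torch.nn as nn

class FrameGenerator(nn.Module):
    """Toy generator: current source frame plus the previous source and
    output frames in, next output frame out."""
    def __init__(self, ch=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3 * ch, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, ch, 3, padding=1), nn.Tanh(),
        )

    def forward(self, x_t, x_prev, y_prev):
        # Conditioning on y_prev is what gives the output temporal coherence.
        return self.net(torch.cat([x_t, x_prev, y_prev], dim=1))

def synthesize(gen, source):
    """source: (T, C, H, W) sequence of source frames (e.g. label maps)."""
    x_prev = torch.zeros_like(source[0])  # blank frame before the video starts
    y_prev = torch.zeros_like(source[0])
    outputs = []
    for x_t in source:
        y_t = gen(x_t[None], x_prev[None], y_prev[None])[0]
        outputs.append(y_t)
        x_prev, y_prev = x_t, y_t
    return torch.stack(outputs)  # (T, C, H, W) synthesized video
```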

Through carefully designed generator and discriminator architectures, coupled with a spatio-temporal adversarial objective, we achieve high-resolution, photorealistic, temporally coherent video results on a diverse set of input formats including segmentation masks, sketches, and poses. Experiments on multiple benchmarks show the advantage of our method compared to strong baselines.
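
One way to read "spatio-temporal adversarial objective" is as two discriminators: one judging individual frames and one judging short clips. The sketch below illustrates that idea; the names are hypothetical, and the actual vid2vid losses are multi-scale and also include flow-warping terms not shown here.

```python
# Illustrative spatio-temporal adversarial objective, assuming two
# discriminators: D_img scores single frames (spatial realism) and
# D_vid scores short clips (temporal realism). Hypothetical names only.
import torch
import torch.nn.functional as F

def gan_loss(scores, is_real):
    target = torch.ones_like(scores) if is_real else torch.zeros_like(scores)
    return F.binary_cross_entropy_with_logits(scores, target)

def generator_adversarial_loss(D_img, D_vid, fake_clip):
    """fake_clip: (B, T, C, H, W) generated video clip."""
    b, t, c, h, w = fake_clip.shape
    # Spatial term: every generated frame should fool the image discriminator.
    img_scores = D_img(fake_clip.reshape(b * t, c, h, w))
    # Temporal term: the clip as a whole should fool the video discriminator
    # (here the frames are simply stacked along the channel axis).
    vid_scores = D_vid(fake_clip.reshape(b, t * c, h, w))
    return gan_loss(img_scores, True) + gan_loss(vid_scores, True)
```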

In particular, our model is capable of synthesizing 2K-resolution videos of street scenes up to 30 seconds long, which significantly advances the state of the art in video synthesis. Finally, we apply our approach to future video prediction, outperforming several state-of-the-art competing systems.
Video: [AI-Based Video-to-Video Synthesis](https://www.youtube.com/watch?v=GRQuRcpf5Gc), broadcast 9 Sep 2018.
# Software
The paper "Video-to-Video Synthesis" and its source code is available here:
- https://tcwang0509.github.io/vid2vid/ - https://github.com/NVIDIA/vid2vid
# Károly Zsolnai-Fehér's links
- Two Minute Papers on Facebook
- Two Minute Papers on Twitter
- TU Wien: https://cg.tuwien.ac.at/~zsolnai/