VideoPoet // new LLM for video generation by Google

2 min read · Dec 26, 2023

Generative video will be a very hot market in the coming years. Don't fuck it up.

Here are a few key points about the research:

- The researchers propose VideoPoet, a large language model for video generation. It employs a transformer architecture that can process multimodal inputs like images, videos, text, and audio.

- VideoPoet is trained in two stages: pretraining and task-specific adaptation. Pretraining uses a mixture of multimodal generative objectives such as text-to-video, video prediction, and video inpainting. The pretrained model then serves as a foundation for adapting to various video generation tasks.

- Experiments demonstrate VideoPoet’s capabilities in zero-shot video generation, especially in producing realistic motions driven by text prompts. It also shows promise in coherent long video generation and converting images to videos.

- Compared to the diffusion models commonly used in video generation, VideoPoet, as a language model, can more easily combine diverse training objectives within a single architecture. This makes it flexible to adapt to new tasks without major architectural changes.

- Evaluations show VideoPoet achieves state-of-the-art results in text-to-video generation benchmarks. Human evaluations also indicate it generates more interesting and realistic motions compared to other recent models.

- Key advantages highlighted are the ability to leverage existing optimizations for language models, combine multiple tasks flexibly, and demonstrate zero-shot generalization capabilities. VideoPoet illustrates the potential of large language models for high-fidelity video generation.
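
To make the "many objectives, one architecture" point concrete, here is a minimal sketch of how different tasks could be packed into a single token-sequence format for one decoder-only model. Everything in it is an assumption for illustration: the special tokens, the task names used as tags, the mixture weights, and the helpers `build_example` and `sample_task` are hypothetical and are not taken from the VideoPoet paper or code.

```python
import random
from dataclasses import dataclass

# Hypothetical special tokens; the names are illustrative only.
BOS, EOS = "<bos>", "<eos>"

# Hypothetical mixture weights over pretraining tasks (made-up numbers).
TASK_WEIGHTS = {
    "text_to_video": 0.5,
    "video_prediction": 0.3,
    "video_inpainting": 0.2,
}


@dataclass
class Example:
    tokens: list[str]     # full sequence fed to the model
    loss_mask: list[int]  # 1 where the model is trained to predict the token


def build_example(task: str, condition: list[str], target: list[str]) -> Example:
    """Pack one task instance as: <bos> <task tag> condition target <eos>."""
    prefix = [BOS, f"<task:{task}>"] + condition
    output = target + [EOS]
    # Only the output part contributes to the next-token loss.
    mask = [0] * len(prefix) + [1] * len(output)
    return Example(tokens=prefix + output, loss_mask=mask)


def sample_task(rng: random.Random) -> str:
    """Pick the task for the next training example according to the mixture."""
    tasks = list(TASK_WEIGHTS)
    weights = [TASK_WEIGHTS[t] for t in tasks]
    return rng.choices(tasks, weights=weights, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)
    ex = build_example(sample_task(rng), ["txt_12", "txt_7"], ["vid_3", "vid_9"])
    print(ex.tokens)
    print(ex.loss_mask)
```

In this sketch, adding a new task only means adding a new tag and deciding what goes into the condition and the target. The model itself does not change, which is the flexibility the bullets above describe.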

https://sites.research.google/videopoet/

https://twitter.com/CodeByPoonam/status/1739556881511890958

Competition in generative video

Pika 1.0 vs Runway: here's the real difference! #pika #pikalabs #runwayml #futureofai #aivideo

This difference shows us two different paths the future of AI video will go. More videos and predictions to come.

Written by sbagency
