Alibaba’s new open-source AI model turns photos into film-quality clips

Alibaba Group Holding’s AI and cloud computing unit on Wednesday released Wan2.2-S2V, its latest open-source artificial intelligence model, which generates expressive, film-quality character videos from a static image and an audio clip.


The new model forms part of Alibaba Cloud’s Wan2.2 family, which the company last month touted as the AI industry’s first family of open-source large video-generation models built on the Mixture-of-Experts (MoE) architecture. Hangzhou-based Alibaba owns the Post.

Powered by advanced audio-driven animation technology, the Wan2.2-S2V model “delivers lifelike character performances, ranging from natural dialogue to musical performances, and seamlessly handles multiple characters within a scene”, Alibaba Cloud said on Wednesday.

Wan2.2-S2V could be used by professional content creators to “capture precise visual representations tailored to specific storytelling and design requirements”, the company said. It attributed that capability to the model’s training on a large-scale audiovisual data set tailored to film and television production scenarios.

The latest Wan2.2 variant reflects how Chinese AI companies are continuing to narrow the gap with their US peers through the open-source approach, which makes the source code of AI models available for third-party developers to use, modify and distribute.

Alibaba Cloud’s Wan2.2 family was designed to meet the diverse needs of professional AI-generated content creators. Photo: Handout

Wan2.2-S2V can now be downloaded from online developer platforms Hugging Face and GitHub, as well as from Alibaba Cloud’s ModelScope open-source community.
