Write to get the Video, yes that’s what Meta AI has given us!

Vishnu Nandakumar
3 min readOct 4, 2022

Wow! Meta AI just dropped a nuclear bomb on the AI world. If someone had told us a decade or more ago that we could make a video out of text, we would have simply said, "Yes, in our dreams." Fast forward a few years, and it's now a reality. Jump into Make-A-Video to see the unimaginable develop into reality before your eyes. Below you can see one of the examples from the website.

GANs and their variations created a huge impact and changed the way people saw AI and its growth, leading many researchers to think beyond existing boundaries and chase the potential that was out there. Another incredible creation was diffusion models, which made the internet go crazy, spawning threads and accounts across social media platforms. DALL-E is essentially a text-to-image generation model: the original version used a GPT-3-style transformer over text and image tokens, while DALL-E 2 generates images with a diffusion model conditioned on CLIP embeddings. Stable Diffusion is another model that produces images from text prompts, using a CLIP text encoder to interpret the prompt and an underlying diffusion model to generate the image. Many further models have since come into existence, like Midjourney.
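To make the diffusion idea above a little more concrete, here is a toy numpy sketch (not Meta's or Stability's actual code; all function names are mine) of two building blocks these models share: the closed-form forward noising process from the DDPM formulation, and classifier-free guidance, the trick used to steer generation toward the text prompt.

```python
import numpy as np

def make_noise_schedule(timesteps=1000, beta_start=1e-4, beta_end=0.02):
    """Linear beta schedule as in the original DDPM paper."""
    betas = np.linspace(beta_start, beta_end, timesteps)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)  # cumulative product of alphas
    return betas, alpha_bars

def q_sample(x0, t, alpha_bars, noise):
    """Forward process: noise a clean image x0 to timestep t in one shot,
    using the closed form q(x_t | x_0)."""
    a = alpha_bars[t]
    return np.sqrt(a) * x0 + np.sqrt(1.0 - a) * noise

def classifier_free_guidance(eps_cond, eps_uncond, scale=7.5):
    """Blend the text-conditioned and unconditioned noise predictions;
    scale > 1 pushes the sample toward the prompt."""
    return eps_uncond + scale * (eps_cond - eps_uncond)
```

A trained denoising network would predict `eps_cond`/`eps_uncond` from the noisy image plus the CLIP (or other) text embedding; sampling then repeatedly undoes `q_sample` step by step.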

Outputs from different diffusion-based models

Unlike earlier diffusion models, Make-A-Video works in a different manner. It doesn't need a paired text-video dataset but instead builds on pre-trained text-to-image models. The authors extend a diffusion-based model with temporal layers, which accelerates training by transferring knowledge from a pre-trained T2I network to the new T2V one. Also, since the temporal information is included at the model-initialization stage, training time decreases further. Attention modules in the extended spatial-temporal layers help in learning the temporal dynamics of videos. Finally, the integration of super-resolution models and frame-interpolation models increases the visual quality of the generated videos. This is only a brief overview, and my understanding could be off, so it would be really great if you could go through the published research paper.
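The "extend a T2I model with temporal layers" idea can be sketched as factorized spatial-then-temporal attention: each frame first attends over its own pixels (the part a pretrained image model already knows), and then each pixel location attends across frames (the new part trained for video). This is a toy numpy sketch under my own reading of the paper, with identity Q/K/V projections to keep it short; a real layer learns those projections.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x):
    """Single-head self-attention over a (tokens, dim) array.
    Identity projections stand in for learned Q/K/V weights."""
    d = x.shape[-1]
    scores = softmax(x @ x.T / np.sqrt(d))
    return scores @ x

def factorized_spatiotemporal_attention(video):
    """video: (T, H, W, C) array of T frames.
    1) spatial attention within each frame (reusable from a T2I model),
    2) temporal attention across frames at each pixel location (new)."""
    T, H, W, C = video.shape
    # 1) each frame attends over its H*W spatial tokens
    spatial = video.reshape(T, H * W, C)
    spatial = np.stack([self_attention(frame) for frame in spatial])
    out = spatial.reshape(T, H, W, C)
    # 2) each pixel location attends over the T frames
    temporal = out.transpose(1, 2, 0, 3).reshape(H * W, T, C)
    temporal = np.stack([self_attention(pixel) for pixel in temporal])
    return temporal.reshape(H, W, T, C).transpose(2, 0, 1, 3)
```

Because the two attention passes are factorized, the spatial half can be initialized straight from the pretrained image model, which is one reason the training speedup described above is possible.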

Young couple walking in rain — prompt used in Meta AI example

Above is one of the examples provided by Meta AI on their Make-A-Video website. You can also sign up for access by filling out the form; they will probably put you on a waiting list. :X

That's all for now, folks. To me, this is huge news from Meta AI, and I hope you feel the same.


Vishnu Nandakumar

Machine Learning Engineer, Cloud Computing (AWS), Arsenal Fan. Have a look at my page: https://bit.ly/m/vishnunandakumar