Demis Hassabis, the Co-Founder and CEO of Google DeepMind, shared the following post on X: “One of the features I’ve most wanted in AI Studio for a long time! Just paste a YouTube link into the command line and ask Gemini 2.0 questions about it – its multimodal understanding is kind of mind blowing.”
This is one more capability in the impressive suite of Gemini features that Google launched last week.
What does this capability do?
Google Gemini can now talk to YouTube videos
You can paste a YouTube link into Gemini and ask it any question about the video.
But what’s new here? You could already do that using NotebookLM.
The secret lies in Demis’s mention of ‘multimodal understanding’: Gemini analyzes the video itself (frames and audio), not just its text transcript.
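For developers, the same capability is exposed through the Gemini API. Below is a minimal sketch using the google-genai Python SDK, which accepts a YouTube URL as a file_data part; the API key, model name, and video URL here are placeholders, not values from the demo.

```python
# pip install google-genai
from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_API_KEY")  # or set GOOGLE_API_KEY in the environment

# Hypothetical link -- substitute any public YouTube video.
video_url = "https://www.youtube.com/watch?v=VIDEO_ID"

response = client.models.generate_content(
    model="gemini-2.0-flash",  # assumed; any Gemini 2.0 model with video support
    contents=types.Content(
        parts=[
            # The video itself is passed as input -- no transcript needed.
            types.Part(file_data=types.FileData(file_uri=video_url)),
            types.Part(text="What is the speaker wearing in this video?"),
        ]
    ),
)
print(response.text)
```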
What does this mean in practice?
In the attached demo, we pasted the link to Roger Federer’s famous commencement address at Dartmouth. To test its multimodal capabilities, we asked the video: ‘What is Roger Federer wearing in this video?’
It correctly responded: “In this video, Roger Federer is wearing an academic robe, with a white and black collar over a black gown. The gown has a zipper up the front. The gown and collar are typical for graduation ceremonies.”
Then we asked it to summarize the video, which it did. This capability is still experimental (and therefore slow for now) and is currently accessible via Google AI Studio.
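To mirror that follow-up pattern programmatically, a chat session keeps the video in context across turns. This is a sketch under the same assumptions as above (it also assumes the SDK’s chat interface accepts the same parts):

```python
from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_API_KEY")  # or set GOOGLE_API_KEY in the environment
video_url = "https://www.youtube.com/watch?v=VIDEO_ID"  # hypothetical link

# A chat session keeps the video in context for follow-up questions.
chat = client.chats.create(model="gemini-2.0-flash")  # assumed model name

first = chat.send_message([
    types.Part(file_data=types.FileData(file_uri=video_url)),
    types.Part(text="What is the speaker wearing in this video?"),
])
print(first.text)

# The follow-up refers back to the video already in the chat history.
print(chat.send_message("Now summarize the video.").text)
```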
Worth giving it a try. What particular use case of yours will this capability solve?