Google's generative AI can now analyze hours of video
Gemini, Google’s family of generative AI models, can now analyze longer documents, codebases, videos and audio recordings than before.
During a keynote at the Google I/O 2024 developer conference Tuesday, Google announced the private preview of a new version of Gemini 1.5 Pro, the company’s current flagship model, that can take in up to 2 million tokens. That’s double the previous maximum amount.
At 2 million tokens, the new version of Gemini 1.5 Pro supports the largest input of any commercially available model. The next-largest, Anthropic’s Claude 3, tops out at 1 million tokens, and only for select customers; most users get 200,000.
In the AI field, “tokens” refer to subdivided bits of raw data, like the syllables “fan,” “tas” and “tic” in the word “fantastic.” Two million tokens is equivalent to around 1.4 million words, two hours of video or 22 hours of audio.
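To make that arithmetic concrete, here’s a minimal sketch of counting tokens with the Gemini API’s Python SDK (google-generativeai); the API key placeholder and model identifier are illustrative, not from Google’s announcement:

```python
# Count how many tokens a prompt consumes, via the
# google-generativeai SDK (pip install google-generativeai).
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # placeholder key
model = genai.GenerativeModel("gemini-1.5-pro-latest")  # illustrative model name

# count_tokens reports the prompt's token footprint, handy for
# estimating how close an input sits to the 1M- or 2M-token ceiling.
result = model.count_tokens("The quick brown fox jumps over the lazy dog.")
print(result.total_tokens)  # roughly one token per word or word-piece
```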
Beyond being able to analyze large files, models that can take in more tokens can sometimes achieve improved performance.
Unlike models with small context windows (the maximum number of tokens a model can take in at once), models such as the 2-million-token Gemini 1.5 Pro won’t easily “forget” the content of recent conversations and veer off topic. Large-context models can also, in theory at least, better grasp the flow of the data they take in and generate contextually richer responses.
Developers interested in trying Gemini 1.5 Pro with a 2-million-token context can add their names to the waitlist in Google AI Studio, Google’s generative AI dev tool. (Gemini 1.5 Pro with 1-million-token context launches in general availability across Google's developer services and surfaces in the next month.)
Beyond the larger context window, Google says that Gemini 1.5 Pro has been “enhanced” over the last few months through algorithmic improvements. It’s better at code generation, logical reasoning and planning, multi-turn conversation, and audio and image understanding, Google says. And in the Gemini API and AI Studio, 1.5 Pro can now reason across audio in addition to images and video — and be “steered” through a capability called system instructions.
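For the curious, system instructions in the Python SDK look roughly like the sketch below; the persona and model name are placeholders of our choosing:

```python
# "Steering" Gemini 1.5 Pro with a system instruction, which shapes
# every response in a session without repeating guidance per prompt.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # placeholder key

model = genai.GenerativeModel(
    "gemini-1.5-pro-latest",  # illustrative model name
    system_instruction="You are a terse code reviewer. Answer in bullet points.",
)

response = model.generate_content(
    "Review this function: def add(a, b): return a - b"
)
print(response.text)  # the bug (subtraction instead of addition) should be flagged
```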
Gemini 1.5 Flash, a faster model
For less demanding applications, Google is launching in public preview Gemini 1.5 Flash, a “distilled” version of Gemini 1.5 Pro: a small, efficient model built for “narrow,” “high-frequency” generative AI workloads. Flash, which also has up to a 2-million-token context window, is multimodal like Gemini 1.5 Pro, meaning it can analyze audio, video and images as well as text (but it generates only text).
“Gemini Pro is for much more general or complex, often multi-step reasoning tasks,” Josh Woodward, VP of Google Labs, one of Google’s experimental AI divisions, said during a briefing with reporters. “[But] as a developer, you really want to use [Flash] if you care a lot about the speed of the model output.”
Woodward added that Flash is particularly well-suited for tasks such as summarization, chat apps, image and video captioning and data extraction from long documents and tables.
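In practice, swapping Pro for Flash is a one-line change in the SDK. Here’s a hedged sketch of the summarization use case Woodward describes (the file name is hypothetical):

```python
# Using the lighter Flash model for a high-frequency task such as
# summarizing a long document. Only the model identifier changes
# versus the Pro examples above.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # placeholder key
flash = genai.GenerativeModel("gemini-1.5-flash-latest")  # illustrative name

with open("earnings_call_transcript.txt") as f:  # hypothetical document
    transcript = f.read()

response = flash.generate_content(
    "Summarize the following transcript in three bullet points:\n\n" + transcript
)
print(response.text)
```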
Flash appears to be Google’s answer to small, low-cost models served via APIs, such as Anthropic’s Claude 3 Haiku. It, along with Gemini 1.5 Pro, is now available in over 200 countries and territories, including the European Economic Area, the U.K. and Switzerland. (The 2-million-token context version is gated behind the waitlist, however.)
In another update aimed at cost-conscious devs, all Gemini models, not just Flash, will soon be able to take advantage of a feature called context caching. It lets devs store large amounts of information (say, a knowledge base or a database of research papers) in a cache that Gemini models can access quickly and, on a per-use basis, relatively cheaply.
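Google hasn’t published final details, but based on the Python SDK’s caching module, usage could look something like the sketch below; the pinned model version, TTL and corpus are all assumptions:

```python
# Context caching sketch: upload bulky, reusable context once, then
# point a model at the cache so follow-up prompts reuse those tokens
# at a lower per-request cost.
import datetime
import google.generativeai as genai
from google.generativeai import caching

genai.configure(api_key="YOUR_API_KEY")  # placeholder key

with open("research_papers.txt") as f:  # hypothetical corpus
    papers = f.read()

cache = caching.CachedContent.create(
    model="models/gemini-1.5-flash-001",  # caching targets pinned model versions
    display_name="papers-cache",
    contents=[papers],
    ttl=datetime.timedelta(hours=1),  # how long the cache persists
)

model = genai.GenerativeModel.from_cached_content(cached_content=cache)
response = model.generate_content("Which of these papers covers long-context evaluation?")
print(response.text)
```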
The complementary Batch API, available in public preview today in Vertex AI, Google’s enterprise-focused generative AI development platform, lets devs send multiple prompts to Gemini models in a single request, offering a more cost-effective way to handle workloads such as classification and sentiment analysis, data extraction and description generation.
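As a rough sketch of how batching fits together in Vertex AI (the project, bucket and file names are placeholders, and the exact preview surface may differ):

```python
# Batch sketch: submit many prompts as one JSONL dataset in Cloud
# Storage, then poll until the job finishes. Paths are placeholders.
import time
import vertexai
from vertexai.batch_prediction import BatchPredictionJob

vertexai.init(project="my-project", location="us-central1")  # placeholder project

job = BatchPredictionJob.submit(
    source_model="gemini-1.5-flash-001",           # illustrative model version
    input_dataset="gs://my-bucket/prompts.jsonl",  # one prompt per line
    output_uri_prefix="gs://my-bucket/output/",    # where responses land
)

while not job.has_ended:
    time.sleep(30)
    job.refresh()

print(job.state, job.output_location)
```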
Another new feature, controlled generation, arrives in preview in Vertex later this month and could lead to further cost savings, Woodward suggests, by letting users define Gemini model outputs according to specific formats or schemas (e.g., JSON or XML).
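Controlled generation, as it surfaces in the Vertex AI Python SDK, can be sketched like this; the schema below is our own toy example, not something from Google’s announcement:

```python
# Constraining output to JSON that matches a schema, via Vertex AI's
# GenerationConfig. The schema here is a toy example.
import vertexai
from vertexai.generative_models import GenerationConfig, GenerativeModel

vertexai.init(project="my-project", location="us-central1")  # placeholder project

model = GenerativeModel("gemini-1.5-pro-001")  # illustrative model version
response = model.generate_content(
    "Extract the product name and sentiment from: 'The AcmePhone is fantastic.'",
    generation_config=GenerationConfig(
        response_mime_type="application/json",
        response_schema={
            "type": "object",
            "properties": {
                "product": {"type": "string"},
                "sentiment": {"type": "string"},
            },
            "required": ["product", "sentiment"],
        },
    ),
)
print(response.text)  # e.g. {"product": "AcmePhone", "sentiment": "positive"}
```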
“You’ll be able to send all of your files to the model once and not have to resend them over and over again,” Woodward said. “This should make the long context [in particular] way more useful — and also more affordable.”