Technophile News

‘Holy shit’: Gemini 3 is winning the AI race — for now

By News Room | 24 November 2025 | 9 Mins Read

When an AI model release immediately spawns memes and treatises declaring the rest of the industry cooked, you know you’ve got something worth dissecting.

Google’s Gemini 3 was released Tuesday to widespread fanfare. The company called the model a “new era of intelligence,” integrating it into Google Search on day one for the first time. It’s blown past OpenAI and other competitors’ products on a range of benchmarks and is topping the charts on LMArena, a crowdsourced AI evaluation platform that’s essentially the Billboard Hot 100 of AI model ranking. Within 24 hours of its launch, more than one million users tried Gemini 3 in Google AI Studio and the Gemini API, per Google. “From a day one adoption standpoint, [it’s] the best we’ve seen from any of our model releases,” Google DeepMind’s Logan Kilpatrick, who is product lead for Google’s AI Studio and the Gemini API, told The Verge.

Even OpenAI CEO Sam Altman and xAI CEO Elon Musk publicly congratulated the Gemini team on a job well done. And Salesforce CEO Marc Benioff wrote that after using ChatGPT every day for three years, spending two hours on Gemini 3 changed everything: “Holy shit … I’m not going back. The leap is insane — reasoning, speed, images, video… everything is sharper and faster. It feels like the world just changed, again.”

“This is more than a leaderboard shuffle,” said Wei-Lin Chiang, cofounder and CTO of LMArena. Chiang told The Verge that Gemini 3 Pro holds a “clear lead” in occupational categories including coding, math, and creative writing, and its agentic coding abilities “in many cases now surpass top coding models like Claude 4.5 and GPT-5.1.” It also got the top spot on visual comprehension and was the first model to surpass a ~1500 score on the platform’s text leaderboard.

The new model’s performance, Chiang said, “illustrates that the AI arms race is being shaped by models that can reason more abstractly, generalize more consistently, and deliver dependable results across an increasingly diverse set of real-world evaluations.”

Alex Conway, principal software engineer at DataRobot, told The Verge that one of Gemini 3’s most notable advancements was on a specific reasoning benchmark called ARC-AGI-2. Gemini scored almost twice as high as OpenAI’s GPT-5 Pro while running at one-tenth of the cost per task, he said, which is “really challenging the notion that these models are plateauing.” And on the SimpleQA benchmark — which involves simple questions and answers on a broad range of topics, and requires a lot of niche knowledge — Gemini 3 Pro scored more than twice as high as OpenAI’s GPT-5.1, Conway flagged. “Use case-wise, it’ll be great for a lot more niche topics and diving deep into state-of-the-art research and scientific fields,” he said.

But leaderboards aren’t everything. It’s possible — and in the high-pressure AI world, tempting — to train a model for narrow benchmarks rather than general-purpose success. So to really know how well a system is doing, you have to rely on real-world testing, anecdotal experience, and complex use cases in the wild.

The Verge spoke with professionals across disciplines who use AI every day for work. The consensus: Gemini 3 looks impressive, and it does a great job on a wide breadth of tasks — but when it comes to edge cases and niche aspects of certain industries, many professionals won’t be replacing their current models with it anytime soon.

The majority of people The Verge spoke with plan to continue to use Anthropic’s Claude for their coding needs, despite Gemini 3’s advancements in that space. Some also said that Gemini 3 isn’t optimal on the user interaction front. Tim Dettmers, assistant professor at Carnegie Mellon University and a research scientist at Ai2, said that though it’s a “great model,” it’s a bit raw when it comes to UX, meaning “it doesn’t follow instructions precisely.”

Tulsee Doshi, Google DeepMind’s senior director of product management for Gemini and Gen Media, told The Verge that the company prioritized bringing Gemini 3 to a variety of Google products in a “very real way.” When asked about the instruction-following concerns, she said it’s been helpful to see “where folks are hitting some of the sticking points.”

She also said that since the Pro model is the first release in the Gemini 3 suite, later models will help “round out that concern.”

Joel Hron, CTO of Thomson Reuters, said that the company has developed its own internal benchmarks to rank both its internal models and public ones on the areas most relevant to its work, such as comparing two documents of up to several hundred pages in length, interpreting a long document, understanding legal contracts, and reasoning in the legal and tax spaces. He said that so far, Gemini 3 has performed strongly across all of them and is “a significant jump up from where Gemini 2.5 was.” It also outperforms several of Anthropic’s and OpenAI’s models right now in some of those areas.

Louis Blankemeier, cofounder and CEO of Cognita, a radiology AI startup, said that in terms of “pure numbers” Gemini 3 is “super exciting.” But, he said, “we still need some time to figure out what the real-world utility of this model is.” For more general domains, Blankemeier said, Gemini 3 is a star, but when he played around with it for radiology, it struggled with correctly identifying subtle rib fractures on chest X-rays, as well as uncommon or rare conditions. He calls radiology akin to self-driving cars in many ways, with a lot of edge cases — so a newer, more powerful model may still not be as effective as an older one that’s been refined and trained on custom data over time. “The real world is just so much more difficult,” he said.

Similarly, Matt Hoffman, head of AI at Longeye, a company providing AI tools for law enforcement investigations, sees promise in the Gemini 3 Pro-powered Nano Banana Pro image generator. Image generators allow Longeye to create convincing synthetic datasets for testing, letting it keep real, sensitive investigation data secure. But although the benchmarks are impressive, they may not map to the company’s actual use cases. “I’m not confident Longeye could swap out a model we’re using in production for Gemini 3 and see immediate improvements,” he said.

Other companies also say they’re excited about Gemini — but not necessarily using it to replace everything else. Built, a construction lending startup, currently uses a mix of foundational models from Google, Anthropic, OpenAI, and others to analyze construction draw requests — a package of documents often sent to a construction lender, like invoices and proof of work done, requesting that funds be paid. This requires multimodal analysis of text and images, plus a large context window for the main agent delegating tasks to the others, VP of engineering Thomas Schlegel told The Verge. That’s part of what Google promises with Gemini 3, so the company is currently exploring switching it out for 2.5.

“In the past we’ve found Gemini to be the best at all-purpose tasks, and 3 looks to be a big step forward along those same lines,” Schlegel said. “It’s everything we love about Gemini on steroids.” But he doesn’t yet think it will replace all the other models, including Claude for coding tasks and OpenAI products for business reasoning.

For Tanmai Gopal, cofounder and CEO of AI agent platform PromptQL, the stir Gemini 3 has caused is valid, but “it’s definitely not the end of anything” for Google’s competitors. AI models are becoming better and cheaper, and since they’re on such quick release cycles, “one is always ahead of the pack for a period of time.” (For instance, the day after Gemini 3 came out, OpenAI released GPT-5.1-Codex-Max, an update to a week-old model, ostensibly to challenge Gemini 3 on a few coding benchmarks.)

Gopal said PromptQL is still working on internal evaluations to decide how, if at all, the team’s model choices will change, but “initial results aren’t necessarily showing something drastically better” than their current lineup. He said his current preference is Claude for code generation, ChatGPT for web search, and GPT-5 Pro for “deep brainstorming,” but he may incorporate Gemini 3 as a default model, since it’s “probably best-in-class for consumer tasks across creative, text, [and] image.”

And like virtually every model, Gemini 3 has had moments of what I’ll dub “robotic hand syndrome” — when an AI system does something complex with flying colors but gets gobsmacked by the simplest query, akin to the robotic hands of yesteryear having trouble gripping a soda can. Famed researcher Andrej Karpathy, who was a founding member of OpenAI and former director of AI at Tesla, wrote on X after testing Gemini 3 that he “had a positive early impression yesterday across personality, writing, vibe coding, humor, etc., very solid daily driver potential, clearly a tier 1 LLM,” but he noted that the model refused to believe him when he said it was 2025 and later said it had forgotten to turn on Google Search. (He ascertained that in early testing, he may have been given a model with a stale system prompt.)

In The Verge’s own experience testing Gemini 3, we found it “delivers reasonably well — with caveats.” It likely won’t stay on top forever, but it’s an unmistakable step up for the company.

“You’re sort of in this leapfrog game from model to model, month to month, when a new one drops,” Hron said. “But what stuck to me about Google’s release is it makes substantial improvements across many dimensions of models — so it’s not like it just got better at coding or it just got better at reasoning … It really, across the board, got a good bit better.”

Author: Hayden Field
