Grok 4 - 10 New Things to Know

July 10, 2025

AI Explained

Why It Matters

If Grok 4’s benchmark gains translate to real-world use, the model could shift enterprise and developer choices and intensify competition among model providers. However, cost, hallucination risk and uneven multimodal performance mean buyers must assess actual utility, not just headline scores.

Summary

xAI’s Grok 4 debuts as a top-performing large language model, outperforming rival models on several academic, coding and fluid-intelligence benchmarks and scoring particularly well on the semi-private ARC-AGI 2 test. Elon Musk and xAI tout “postgraduate/PhD-level” performance, but the presenter cautions that this claim is benchmark-dependent and that the model is prone to hallucinations and sometimes slow or weaker on visual tasks. The Grok 4 Heavy variant uses parallel agents that reason together like a “study group” to boost results, and a premium SuperGrok Heavy tier is priced at $300/month, with features such as video generation planned. The benchmark results are also criticized for selective comparisons and exaggerated scales, so practical superiority and value versus cheaper alternatives such as Gemini Pro remain uncertain.

Original Description

Grok 4 is here, but did you know these 10 things about the new model? From benchmark caveats to soloing science, $300 a month secrets to Grok 5 promises, here's 10 new things to know in just under 12 minutes.

AI Insiders ($9!): https://www.patreon.com/AIExplained

Chapters:

00:00 - Introduction

00:22 - Benchmark Results

02:11 - Benchmark Caveats

02:59 - ARC-AGI 2

03:35 - SimpleBench

04:49 - ‘Humanity’s Last Exam’

07:20 - SuperGrok Heavy Price

07:58 - API Price

08:12 - Grok 5, Gemini 3.0 Beta, GPT-5

09:12 - System Prompt Change + $1B a month, pollution

10:20 - Not soloing science, helping you solo code

Livestream: https://www.youtube.com/watch?v=1tQ_KrlHgfg&t=1s

Price: https://grok.com/#subscribe

https://x.com/ArtificialAnlys/status/1943166841150644622

Gemini DeepThink: https://blog.google/technology/google-deepmind/google-gemini-updates-io-2025/#deep-think

https://simple-bench.com/

ARC-AGI 2: https://x.com/arcprize/status/1943168950763950555

Humanity’s Last Exam: https://agi.safe.ai/

SmartGPT: https://www.youtube.com/watch?v=hVade_8H8mE

New Power Plant, 1m GPUs: https://www.tomshardware.com/tech-industry/artificial-intelligence/elon-musk-xai-power-plant-overseas-to-power-1-million-gpus

Gemini 3.0 beta: https://web.archive.org/web/20250709174548/https://github.com/google-gemini/gemini-cli/blob/b0cce952860b9ff51a0f731fbb8a7649ead23530/packages/cli/src/ui/utils/errorParsing.test.ts

Pollution: https://www.theguardian.com/technology/2025/apr/24/elon-musk-xai-memphis

https://www.youtube.com/watch?v=C8rU4dv2w8Q

https://www.youtube.com/watch?v=3VJT2JeDCyw

System Prompt: https://github.com/xai-org/grok-prompts/blob/535aa67a6221ce4928761335a38dea8e678d8501/ask_grok_system_prompt.j2

Burn Rate: https://www.bloomberg.com/news/articles/2025-06-17/musk-s-xai-burning-through-1-billion-a-month-as-costs-pile-up

Ron Johnson: https://x.com/jdcmedlock/status/1939814516503847259

Non-hype Newsletter: https://signaltonoise.beehiiv.com/

Podcast: https://aiexplainedopodcast.buzzsprout.com/
