Grok 4 - 10 New Things to Know

AI Explained • July 10, 2025

Why It Matters

If Grok 4’s benchmark gains translate to real-world use, they could shift enterprise and developer choices and intensify competition among model providers. But cost, hallucination risk and uneven multimodal performance mean buyers must assess actual utility, not just headline scores.

Summary

xAI’s Grok 4 debuts as a top-performing large language model, outperforming rival models on several academic, coding and fluid-intelligence benchmarks and scoring particularly well on the semi-private ARC-AGI 2 test. Elon Musk and xAI tout “postgraduate/PhD-level” performance, but the presenter cautions that this claim is benchmark-dependent and that the model remains prone to hallucinations and is sometimes slow or weaker on visual tasks. The Grok 4 Heavy variant uses parallel agents in a “study group” style of reasoning to boost results, and a premium SuperGrok Heavy tier is priced at $300/month, with features such as video generation planned. The reported benchmark results are also criticized for selective comparisons and exaggerated scales, so practical superiority and value versus cheaper alternatives such as Gemini Pro remain uncertain.

Original Description

Grok 4 is here, but did you know these 10 things about the new model? From benchmark caveats to soloing science, $300 a month secrets to Grok 5 promises, here's 10 new things to know in just under 12 minutes.
AI Insiders ($9!): https://www.patreon.com/AIExplained
Chapters:
00:00 - Introduction
00:22 - Benchmark Results
02:11 - Benchmark Caveats
02:59 - ARC-AGI 2
03:35 - SimpleBench
04:49 - ‘Humanity’s Last Exam’
07:20 - SuperGrok Heavy Price
07:58 - API Price
08:12 - Grok 5, Gemini 3.0 Beta, GPT-5
09:12 - System Prompt Change + $1B a month, pollution
10:20 - Not soloing science, helping you solo code
Livestream: https://www.youtube.com/watch?v=1tQ_KrlHgfg&t=1s
Price: https://grok.com/#subscribe
https://x.com/ArtificialAnlys/status/1943166841150644622
Gemini DeepThink: https://blog.google/technology/google-deepmind/google-gemini-updates-io-2025/#deep-think
https://simple-bench.com/
ARC-AGI 2: https://x.com/arcprize/status/1943168950763950555
Humanity’s Last Exam: https://agi.safe.ai/
SmartGPT: https://www.youtube.com/watch?v=hVade_8H8mE
New Power Plant, 1m GPUs: https://www.tomshardware.com/tech-industry/artificial-intelligence/elon-musk-xai-power-plant-overseas-to-power-1-million-gpus
Gemini 3.0 beta: https://web.archive.org/web/20250709174548/https://github.com/google-gemini/gemini-cli/blob/b0cce952860b9ff51a0f731fbb8a7649ead23530/packages/cli/src/ui/utils/errorParsing.test.ts
Pollution: https://www.theguardian.com/technology/2025/apr/24/elon-musk-xai-memphis
https://www.youtube.com/watch?v=C8rU4dv2w8Q
https://www.youtube.com/watch?v=3VJT2JeDCyw
System Prompt: https://github.com/xai-org/grok-prompts/blob/535aa67a6221ce4928761335a38dea8e678d8501/ask_grok_system_prompt.j2
Burn Rate: https://www.bloomberg.com/news/articles/2025-06-17/musk-s-xai-burning-through-1-billion-a-month-as-costs-pile-up
Ron Johnson: https://x.com/jdcmedlock/status/1939814516503847259
Non-hype Newsletter: https://signaltonoise.beehiiv.com/
Podcast: https://aiexplainedopodcast.buzzsprout.com/