The "thinking" bar for GPT 5.2 Pro has got a lot better, you can see it looking at the pages etc of what goes in, interrupt it etc However it remains on the web version only (along with the ability to flip to high reasoning), makes the app version not v useful :( https://t.co/AWArRKBJEU
The Last Economy is number 1 on Amazon's Best Sellers for AI with pretty good ratings ✍️ No publisher or big campaign, also available for free on the website, and the paperback just dropped 📖 Thank you all for the support &...
3b (active) parameters is all you need
Massive achievement by @NousResearch. To make this clear: this is actually a 3b active parameter model that runs on any new MacBook Air/Mini and would place #2 on the Putnam, which is harder than the IMO. Now think of all the tasks that...
EU countries should get together to build an advisory AI to help run the EU, with all decisions being transparent and participatory. This should be a fully open source, open data stack with all inputs also available so you can run...
68% on SWE-Bench Verified at just 24b, laptop class! 72.2% on 124b 👏 Great job @MistralAI team, looking forward to trying it out https://t.co/I9VlzwMnwB
3 year anniversary of ChatGPT. Where will we be 3 years from now?
So many amazing new video models coming; we are heading toward video pixel generation being “solved” next year
Claude Opus 4.5 is the first Claude I think is reasonably usable for decent math work (the Claude interface is great for iterating, minus the timeouts & mobile slowdowns). The big thing here though that I've noted using Opus 4.5 is...
@zephyr_z9 TPUs are much more stable on 8-bit training (AQT etc) than NVIDIA chips at massive scale. The previous gen was a bit sensitive to topology, but that looks like less of an issue for Ironwood
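(For context on the 8-bit point above, here is a minimal sketch of the core idea behind int8 quantized training: quantize inputs for the matmul, accumulate in int32, rescale back to float. This is an illustrative toy in plain numpy, not the actual AQT library API; the symmetric per-tensor scheme and all names are assumptions.)

```python
# Illustrative sketch of int8 "fake-quant" matmul, NOT the actual AQT API.
import numpy as np

def quantize_int8(x: np.ndarray):
    """Symmetric per-tensor int8 quantization: returns int8 values and the scale."""
    scale = np.max(np.abs(x)) / 127.0 + 1e-12
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def int8_matmul(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Matmul on int8-quantized inputs, accumulated in int32, then rescaled to float."""
    qa, sa = quantize_int8(a)
    qb, sb = quantize_int8(b)
    acc = qa.astype(np.int32) @ qb.astype(np.int32)  # int32 accumulation
    return acc.astype(np.float32) * (sa * sb)

# Quick check against the float32 reference.
a = np.random.randn(64, 256).astype(np.float32)
b = np.random.randn(256, 128).astype(np.float32)
err = np.abs(int8_matmul(a, b) - a @ b).mean()
print(f"mean abs error vs fp32 matmul: {err:.4f}")
```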
@_The_Prophet__ TPUs had low availability for ages and also relatively low memory on the v6e, especially versus the Hoppers, which worked pretty much out of the box similar to A100s. Grace Blackwell is the next thing that needs reworking so there is...
@_The_Prophet__ TPUs have been more stable for training than CUDA equivalents for a couple of years now, especially at large batch sizes. XLA is pretty good now! For inference it makes even less of a difference (We previously trained sota models on thousands...
Lots of improvements! Claude still crawls and burns battery on the mobile iOS app when doing long replies, which is very frustrating
Humanity as the biological bootloader of AGI
This looks crazy good, will run evals now on our sota II-Agent. The cost and speed of grok fast 4.1 are way better than those of comparable agentic AI models, even top notch ones, from the numbers here - 10-20x better in some...
The music team at @StabilityAI is amazing; it was a privilege to build it up and see how they came together. It’ll be super interesting to see how music evolves with generative AI, from technology to form to expression itself

Buried the lede a bit, but our fully open source II Agent framework is now state of the art on Terminal Bench 2 using just Gemini 3! Congrats to the team for amazing work & more coming in the pipeline. The best agents will...
The most interesting thing testing Gemini 3 Pro has been how *efficient* it is, from tokens to tool calls. The intelligence per token of models is increasing rapidly even as prices fall, it's quite something
Just call it the Gabecube. Hardware looks to be around AMD Ryzen AI Max+ 395 (Strix Halo) level. Would be cool if the RAM were upgradeable; you can run 128 GB for LLMs (see the @FrameworkPuter Desktop)
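(Rough back-of-envelope math behind the "128 GB for LLMs" point; the headroom fraction and byte-per-parameter figures below are illustrative assumptions, not measured numbers.)

```python
# How big a model's weights fit in 128 GB of unified memory at different precisions.
ram_gb = 128
usable_gb = ram_gb * 0.8  # assumed headroom for OS, KV cache, activations

bytes_per_param = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}
for fmt, b in bytes_per_param.items():
    max_params_b = usable_gb * 1e9 / b / 1e9  # billions of parameters
    print(f"{fmt}: ~{max_params_b:.0f}B parameters fit in ~{usable_gb:.0f} GB")

# At int4 roughly a 200B-parameter model's weights fit; MoE models with few
# active parameters also leave plenty of room for long-context KV cache.
```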
Will continuous learning for AI models be solved within 2 years?
No current AI systems have morals explicitly encoded into them at pretraining time. At the very least they should have Asimov's laws of robotics, eh
Proud of the @ii_posts team, who have made a fully open stack for our agentic future that is truly state of the art, from datasets to models to agents. We have done 0 press on this, preferring to build & soon...
Can you imagine being a "frontier" lab that's raised like a billion dollars and now you can't release your latest model because it can't beat @Kimi_Moonshot? 🗻 Sota can be a bitch if that's your target
Necessity is the mother of invention. Also - training optimally on small numbers of chips with a focus on data means the Chinese models take 10-100x less compute to run as well & have that cost advantage: $150/m for GPT 4.5 vs $0.5/m for DeepSeek v3...