
Breaking: Autonomous Agents Are a Shitshow
A new study of 847 autonomous agent deployments across healthcare, finance, customer service, and code generation found that 91% are vulnerable to subtle tool‑chaining attacks, while 89.4% exhibit goal drift after roughly 30 steps. Memory‑augmented agents are especially at risk, with 94% susceptible to poisoning. The research also documents a real‑world breach in which 770,000 live agents were compromised via a single database exploit, underscoring that agentic systems are far more vulnerable than stateless LLMs.

The Growing AI Backlash
The author argues that a broad backlash against generative AI is intensifying, citing movements like QuitGPT and media coverage such as Fortune’s recent piece. He lists a range of societal harms—erosion of education, disinformation, deep‑fake porn, bias, and energy‑intensive data...

Have LLMs Improved Patient Outcomes?
The post argues that large language models (LLMs) have yet to demonstrate measurable improvements in patient health outcomes. It references Eric Topol’s review and a Nature Medicine editorial, both noting a paucity of clinical evidence despite hype. While LLMs can...

“A Model that Produces Code Which Compiles and Passes the Tests It Was Given Is Not the Same as a...
OpenAI President Greg Brockman recently claimed that AI now writes about 80% of the company’s code, a statement that sparked widespread attention. A counterpoint highlighted in The Next Web emphasizes that a model that merely compiles and passes given tests...

Three Thoughts on the Musk-OpenAI Lawsuit
Elon Musk has filed a lawsuit accusing OpenAI of breaching its original nonprofit promise by transitioning to a for‑profit model. The author expresses distrust of both parties, noting Musk could benefit financially if OpenAI loses, while also acknowledging Musk’s substantive...

ChatGPT Doesn’t Know Its Whisk From Its Elbow
OpenAI’s new multimodal ChatGPT can generate captions for images, but it still falters on functional understanding. A recent example shows the model labeling a kitchen whisk as an elbow in a human‑anatomy diagram. The mistake highlights that the system recognizes...

ChatGPT’s “Powerful New Image Engine”
OpenAI’s latest multimodal model can generate striking bike illustrations, but it still mislabels components and lacks functional understanding. In a test, the system confused rear brakes with seat stays and placed a derailleur inside the wheel hub. A custom request...

Please Don’t Trust Your Chatbot for Medical Advice
Recent peer‑reviewed studies across BMJ, JAMA Network Open, and Nature Medicine reveal that popular AI chatbots—including ChatGPT, Gemini, and Meta AI—frequently generate inaccurate, hallucinated, or overconfident medical advice. The BMJ audit found nearly half of responses were highly problematic, while...

Claude Mythos, Evaluated
The UK AI Security Institute evaluated the unreleased Claude Mythos Preview and found it to be the first model to complete an end‑to‑end cyber‑range assessment. Whereas the models of 2023 could handle only beginner‑level tasks, Mythos can autonomously compromise...

The Biggest Advance in AI Since the LLM
Anthropic’s Claude Code is being billed as the most significant AI breakthrough since large language models, because it fuses a neural language model with a 3,167‑line deterministic kernel called print.ts. The kernel implements 486 IF‑THEN branches and 12 levels of nesting to...

Three Reasons to Think that the Claude Mythos Announcement From Anthropic Was Overblown
Anthropic’s Claude Mythos announcement generated headlines, but three analysts argue the hype is overstated. First, the demo ran without browser sandboxing, making it a limited proof of concept rather than a real‑world threat. Second, inexpensive open‑weight models replicated the same...

The Back Story Behind the First “$1.8 Billion” Dollar “AI Company”
The New York Times reported that Medvi, an AI‑driven startup, claimed a $1.8 billion valuation after just two months of solo effort and a $20k bootstrap. The story quickly went viral as a showcase of AI’s ability to compress years of building into...

The Two Wildest Stories Today in Tech
Microsoft’s AI chief Mustafa Suleyman announced a semantic shift, redefining “superintelligence” from a futuristic, human‑surpassing concept to practical AI models that generate product value for millions of enterprises. The same day, OpenAI disclosed a $250 million acquisition of the eighteen‑month‑old podcast...

On Employment, Don’t Panic – Yet.
A recent Fortune column notes that AI’s impact on productivity and return on investment remains modest, despite widespread corporate spending. The author advises employers to stop hunting for human replacements and instead leverage AI to amplify the capabilities of their...

In the Iran War, It Looks Like AI Helped with Operations, Not Strategy
A diplomat’s off‑the‑record remarks suggest the United States relied on artificial intelligence for tactical tasks during the Iran‑related conflict, but the technology fell short on strategic planning. The US misread Iran’s resilience, overestimated regime‑change prospects, and failed to anticipate Tehran’s...
