Amazon's Anti-Benchmark AI Bet
In this episode, the host interviews Amazon’s AI chief Rohit Prasad, who argues that the AI community should stop obsessing over benchmark leaderboards and focus on real‑world utility, noting that current evals are noisy and incomparable. He explains Amazon’s contrarian approach, emphasizing consistent training data and held‑out evaluations as the true measure of progress, and hints at upcoming announcements at AWS re:Invent that showcase practical advancements. The conversation also touches on industry reactions, including OpenAI’s “code red” and Anthropic’s new Claude model, underscoring a shift toward performance that matters beyond scores.
"The World Is Not Slowing Down" - AWS CEO Says AI Agents Will Be Bigger than the Internet, so Act...
At AWS re:Invent 2025, CEO Matt Garman proclaimed AI agents will become more transformative than the Internet or cloud computing. He highlighted an accelerating wave of AI innovation, noting that enterprises are already seeing agents reshape customer experiences and operational...
No, It's Not Just You – ChatGPT Is Having some Issues, but OpenAI Is Working on It
On December 2, 2025, OpenAI confirmed a widespread ChatGPT outage caused by elevated error rates, affecting both web and mobile interfaces. Downdetector recorded more than 3,400 reports as users encountered endless loading circles and unresponsive answers. By 3:12 PM ET, OpenAI applied a...
Google Is Bringing AI-Powered Notification Summaries to More Android Devices
Google is extending its AI‑powered notification summaries, introduced in Android 16, to Android devices beyond Pixel, starting with Samsung and other OEMs. The feature condenses lengthy chat messages and group conversations into brief snippets, but remains limited to messaging apps,...

The Next Android 16 Update Has Landed – Here Are the 7 Biggest Features
Google has rolled out a second Android 16 update, adding AI‑driven notification summaries, revamped notification organization, and expanded customization options. The package also introduces built‑in parental controls, a beta Call Reason feature, scam‑checking via Circle to Search, and a suite of...
Custom Policy Enforcement with Reasoning: Faster, Safer AI Applications
NVIDIA unveiled Nemotron Content Safety Reasoning, a model that blends dynamic policy reasoning with production‑grade latency. The system lets developers load natural‑language policies at inference time, enabling nuanced content moderation across e‑commerce, telecom, and healthcare use cases. It achieves low...

ChatGPT Referrals to Retailers’ Apps Increased 28% Year-over-Year, Says Report
ChatGPT referrals to retailer mobile apps jumped 28% year‑over‑year during the Black Friday weekend, according to Apptopia. Amazon captured a majority share, rising to 54% of AI‑driven referrals, while Walmart’s share climbed to 14.9%. Despite the growth, AI‑originated sessions remain...
I Get Why some People Are Suddenly Freaking Out About AI Agents in Windows 11 – I'm Worried, Too, but...
Microsoft’s latest Windows 11 update embeds AI agents directly into the OS, offering conversational assistance and task automation. The company cautions users to understand the security implications before enabling these agents, highlighting potential data exposure and privilege escalation. Early adopters...

Mistral Closes in on Big AI Rivals with New Open-Weight Frontier and Small Models
French AI startup Mistral unveiled the Mistral 3 family, comprising a large frontier model (Large 3) with 41 billion active parameters and multimodal, multilingual abilities, plus nine smaller, fully customizable models (Ministral 3). The company positions the open‑weight lineup as a cost‑effective alternative to...

It’s Their Job to Keep AI From Destroying Everything
Anthropic has built a nine‑person societal impacts team to surface "inconvenient truths" about its AI systems. The group developed Clio, an internal analytics platform that tracks real‑time user interactions with Claude, uncovering misuse such as explicit pornographic content and coordinated...

The New Paradigm: A Concentration of Data in AI Demands Greater Vigilance
The article warns that AI’s need for massive, consolidated datasets turns former data silos into dense treasure troves, dramatically expanding the attack surface. More than three‑quarters of enterprises have already suffered AI‑related breaches, exposing them to regulatory penalties and reputational...

AWS Launches Frontier Agents
AWS unveiled Frontier Agents at re:Invent 2025, a new class of autonomous AI agents that can tackle ambiguous, multi‑step problems without constant supervision. The agents are designed to scale, operate for extended periods, and improve through developer interaction. AWS introduced...

DeepSeek Unveils New AI Models Rivalling GPT-5 and Gemini 3 Pro

AI’s Wrong Answers Are Bad. Its Wrong Reasoning Is Worse
Recent studies reveal that large language models (LLMs) often reach correct answers for the wrong reasons, exposing critical reasoning flaws. Researchers introduced the KaBLE benchmark, showing that while newer models exceed 90% accuracy on factual verification, they dip to 62%...
New Training Method Boosts AI Multimodal Reasoning with Smaller, Smarter Datasets
Researchers at MiroMind AI and partner universities introduced OpenMMReasoner, a two‑stage training framework that first fine‑tunes a base vision‑language model on a curated, high‑quality dataset and then applies reinforcement learning to sharpen multimodal reasoning. The approach achieves state‑of‑the‑art performance on...

DRAM It! Raspberry Pi Raises Prices
Raspberry Pi announced price hikes across its single‑board computer lineup, with the 16 GB Raspberry Pi 5 climbing from $120 to $145 and other models seeing $5‑$25 increases. The company also introduced a new 1 GB Raspberry Pi 5 variant priced at $45. CEO...

Syntax Hacking: Researchers Discover Sentence Structure Can Bypass AI Safety Rules
Researchers from MIT, Northeastern and Meta showed that large language models often prioritize sentence structure over meaning, a phenomenon they call “syntax hacking.” By feeding syntactically correct but semantically nonsensical prompts, models like OLMo‑2‑13B‑Instruct still produced domain‑specific answers, revealing a...
AWS Goes Beyond Prompt-Level Safety with Automated Reasoning in AgentCore
At re:Invent, AWS announced major upgrades to its Bedrock AgentCore platform, adding policy enforcement, episodic memory, and evaluation tools powered by automated reasoning. The new policy layer sits between agents and external tools, allowing enterprises to enforce guardrails after an...
With AI Browsers Creating Fresh Security and Privacy Concerns, Norton Neo Is the First to Enter with a Safety-First Approach
Norton has released Neo, the first AI‑native browser built around privacy and security, positioning it against AI‑first competitors like Perplexity and OpenAI. Neo eliminates the need for user prompts by proactively delivering summaries, reminders and context‑aware suggestions directly within the...
With Nova Forge, AWS Gives Companies a Path to Build Foundation-Class Models without GPUs
AWS unveiled Nova Forge, a new service that lets enterprises fine‑tune its Nova 2 foundation models with proprietary data without needing costly GPU clusters. The offering creates custom “Novellas” that retain core reasoning abilities while gaining domain‑specific knowledge, and these...
Arcee Aims to Reboot U.S. Open Source AI with New Trinity Models Released Under Apache 2.0
Arcee AI unveiled Trinity Mini (26B parameters) and Trinity Nano (6B parameters) as the first U.S.-trained open‑weight Mixture‑of‑Experts models released under an Apache 2.0 license. The models are available for free download on Hugging Face and can be accessed via a...
AWS Wants to Make Your AI Agents More Intelligent and More Human
At AWS re:Invent 2025, Amazon announced a suite of upgrades to Amazon Connect aimed at making AI agents sound and act more like humans. The enhancements leverage Nova Sonic speech models to deliver natural, multi‑language conversations with appropriate tone and pacing....
One of Google’s Biggest AI Advantages Is What It Already Knows About You
Google Search VP Robby Stein highlighted that the company’s biggest AI advantage lies in its deep knowledge of individual users, drawing on data from Gmail, Calendar, Drive, and other services. By feeding this personal information into Gemini‑powered products like Gemini...

AWS Wants to Take the Strain Out of Modernizing All Your Old Code - and Ending Tech Debt Quicker than...
AWS announced new agentic AI features for its AWS Transform service at re:Invent 2025, promising to automate legacy code modernization across full‑stack Windows environments. The upgrade claims to cut manual effort by up to 70%, accelerate transformation speeds up to...
Apple AI Chief Steps Down Following Siri Setbacks
Apple’s senior AI executive John Giannandrea announced his departure, ending a seven‑year tenure that began with a mandate to revamp Siri. Amar Subramanya, a former Google and Microsoft AI leader, will assume the role of vice president of AI, reporting...
OpenAI Desperate to Avoid Explaining Why It Deleted Pirated Book Datasets
OpenAI deleted two internal datasets, “Books 1” and “Books 2,” that were scraped from the pirate library LibGen before ChatGPT’s 2022 launch. A federal judge has now ordered the company to produce all internal communications about the deletions, rejecting OpenAI’s claim that...

Generative AI Startup Runway Releases Gen-4.5 Video Model
Runway, the AI‑focused startup, unveiled Gen‑4.5, a new video generation model that creates high‑definition short clips from textual prompts. Built on Nvidia GPUs, the model emphasizes visual accuracy, stylistic control, and consistent character rendering, positioning it for Instagram reels and...

Google's AI-Powered Antigravity IDE Already Has some Worrying Security Issues - Here's What Was Found
Google’s Antigravity IDE, an AI‑first development environment, has been found to allow its coding agent to execute terminal commands automatically under default settings. Researchers at PromptArmor demonstrated prompt‑injection attacks that let untrusted input trigger unintended code execution and file access....

Nvidia Announces New Open AI Models and Tools for Autonomous Driving Research
Nvidia unveiled Alpamayo‑R1, an open‑source vision‑language‑action model designed for autonomous‑driving research, at the NeurIPS conference. The model extends Nvidia’s Cosmos‑Reason architecture, adding reasoning capabilities that simulate common‑sense decision making for vehicles. Nvidia also released the Cosmos Cookbook, a suite of...

AWS re:Invent 2025: How to Watch and Follow Along Live
AWS re:Invent 2025 kicks off in Las Vegas on December 2, offering a series of free live streams that cover five AI‑focused keynotes and multiple partner showcases. The agenda highlights Amazon’s push into agentic AI, new foundation models, and enhanced security measures...

“I Promise You, You Will Have Work to Do” - Nvidia CEO Jensen Huang Urges Everyone to Use AI as...
Nvidia CEO Jensen Huang told employees to automate every feasible task with AI, branding any effort to curb usage as "insane." His directive came after the company posted another record quarter, yet the stock fell amid investor doubts about sustained...
Construction Workers Are Cashing in on the AI Boom
The rapid AI boom is driving a massive data‑center construction surge, prompting tech giants to pour billions into new facilities. Workers moving into these projects are seeing 25‑30% wage increases, with some supervisors earning over $100,000 and specialists topping $200,000....
DeepSeek Just Dropped Two Insanely Powerful AI Models that Rival GPT-5 and They're Totally Free
DeepSeek, a Chinese AI startup, unveiled two 685‑billion‑parameter models—DeepSeek‑V3.2 and the high‑performance DeepSeek‑V3.2‑Speciale—under an MIT open‑source license. The models employ a novel Sparse Attention architecture that halves inference costs for long‑context tasks, supporting 128,000‑token windows at roughly $0.70 per million...

Responsible AI Center to Combine Research With Industry Know-How
A coalition of universities and industry launched the Center on Responsible Artificial Intelligence and Governance (CRAIG), funded by the U.S. National Science Foundation, to develop methods, tools, and standards for ethical AI deployment. Faculty from Ohio State, Northeastern, Baylor and...
MIT Offshoot Liquid AI Releases Blueprint for Enterprise-Grade Small-Model Training
Liquid AI, an MIT spin‑off, has published a 51‑page technical report that details the architecture, training curriculum, and post‑training pipeline behind its LFM2 family of small, on‑device foundation models. The report reveals a hardware‑in‑the‑loop architecture search that favors gated short...

Your AI-Generated Image of a Cat Riding a Banana Exists because of Children Clawing Through the Dirt for Toxic Elements....
The article highlights the hidden environmental and social toll of large language models, tracing their supply chain from child‑labour‑driven rare‑earth mining in the Congo to energy‑hungry data centres built in water‑scarce regions. It notes that training data curation often exposes...
AI Models Block 87% of Single Attacks, but Just 8% when Attackers Persist
Cisco’s AI Threat Research team discovered that open‑weight large language models block 87% of single‑turn malicious prompts but see attack success soar to 92% when adversaries persist across multiple turns. The study evaluated eight popular models and found multi‑turn success...
The State of AI: Welcome to the Economic Singularity
Generative AI adoption remains highly uneven, with 95% of projects failing to generate returns while coding assistants promise transformative gains. Experts cite a classic productivity paradox, suggesting AI’s economic impact will follow a J‑curve that requires new data infrastructure and...

Runway Says Its New Text-to-Video AI Generator Has ‘Unprecedented’ Accuracy
Runway unveiled its Gen-4.5 text-to-video model, touting unprecedented physical accuracy and visual precision. The new system delivers more realistic motion, weight, and fluid dynamics while better adhering to detailed prompts. Gen-4.5 is being rolled out gradually to all users, maintaining...

AWS re:Invent 2025 - All the News and Updates as It Happens
AWS re:Invent 2025 kicked off in Las Vegas with live coverage highlighting three marquee keynotes. CEO Matt Garman opened the event outlining Amazon’s multi‑year cloud strategy, followed by AI and Data VP Dr. Swami Sivasubramanian unveiling the next generation of...

Why the Most Impactful AI Strategies Still Start and End with People
AI agents are emerging as powerful productivity tools, but PwC argues they are amplifiers rather than replacements for human talent. Executives view AI agents as a near‑term competitive advantage, with 73% expecting significant gains, yet success hinges on aligning agents...
Why IBM CEO Arvind Krishna Is Still Hiring Humans in the AI Era
IBM CEO Arvind Krishna acknowledges that early Watson efforts were technologically sound but mis‑timed, prompting a shift toward modular, enterprise‑focused generative AI via the new Watsonx platform. He argues that AI costs will fall dramatically over the next five years...
The Race to AGI-Pill the Pope
A coalition of AI researchers and clergy, dubbed the “AI Avengers,” is urging the Vatican to treat artificial general intelligence (AGI) risks as a serious policy issue. Pope Leo XIV, a math‑trained, tech‑savvy pontiff, has already signaled interest in AI, convening...
OpenAGI Emerges From Stealth with an AI Agent that It Claims Crushes OpenAI and Anthropic
OpenAGI, a stealth startup founded by MIT researcher Zengyi Qin, unveiled Lux, an AI foundation model that autonomously controls computers. Lux achieved an 83.6% success rate on the Online‑Mind2Web benchmark, outpacing OpenAI’s Operator (61.3%) and Anthropic’s Claude Computer Use (56.3%)....
The Download: Spotting Crimes in Prisoners’ Phone Calls, and Nominate an Innovator Under 35
A U.S. telecom firm, Securus Technologies, has deployed an AI model trained on years of inmates' phone, video, and text communications to flag planned criminal activity. The pilot uses a dataset of seven years of Texas prison calls and is...

The Next Frontier in AI Isn’t Just More Data
The AI community is moving beyond larger models and bigger datasets toward reinforcement‑learning (RL) environments that let agents learn by interacting with simulated worlds. Recent investments of billions by Silicon Valley firms are creating these digital classrooms where models can...

Brace Yourself, ChatGPT Fans – an OpenAI Leak Suggests Ads Could Come to Your Conversations Very Soon
OpenAI is reportedly testing an advertising feature in the ChatGPT Android app, as hidden code references an “ads feature,” “bazaar content,” and a “search ads carousel.” The beta version (1.2025.329) suggests a rollout could be imminent, though timing and integration...

Want to Skip the Line for Gemini for Home Access? This Simple Hack Is Working for some Users
Google is gradually rolling out Gemini for Home, an AI‑powered upgrade to its smart‑home platform, but access is limited to a waiting list and primarily US users. A Reddit‑discovered hack lets Android users trigger the update by opening the googlehome://assistant/voice/setup URL in...

When GPT-5 Thinks Like a Scientist
OpenAI’s GPT‑5 is evolving from a powerful search tool into a research collaborator, delivering full solutions to four long‑standing mathematical problems and uncovering hidden links across physics, biology, and computer science. The model’s “compression factor” enabled six hours of AI‑augmented...

Forthcoming Machine Learning and AI Seminars: December 2025 Edition
Lucy Smith’s December 2025 AI seminar roundup lists a series of free, virtual talks covering diverse topics such as optimization for societal impact, AI safety, AI literacy measurement, protein engineering with diffusion models, and the role of third‑party intelligence in markets....