
Researchers at Tokyo's University of Electro‑Communications introduced personality traits and interruption capabilities into large language model agents, allowing them to speak out of turn and react in real time. By comparing fixed, dynamic, and interruption‑enabled conversation flows on the MMLU benchmark, they found accuracy improvements from 68.7% to 79.2% on single‑error tasks and from 37.2% to 49.5% on double‑error tasks. The study shows that more human‑like, messy dialogue can enhance collective AI reasoning. The team plans to apply these "digital personalities" to collaborative domains.

Humanity’s Last Exam, a PhD‑level benchmark launched in January 2025, tests AI models on 2,500 unambiguous, non‑searchable questions across 100+ subjects. Google’s Gemini 3 Deep Think currently leads with a 48.4% score, while OpenAI’s o1 lags at 8.3% and human experts average around...

Researchers at Aalto University warn that AI-driven voice analysis can extract sensitive personal data—from political views to health conditions—simply from speech patterns. Their study, published in IEEE Proceedings, highlights risks such as price‑gouging, discriminatory profiling, and stalking if corporations or...

A secret 2025 meeting of leading mathematicians tested OpenAI’s new o4‑mini model, which delivered proofs that sounded convincingly rigorous. Experts, including Terry Tao, warned that the AI often appears correct while containing subtle errors that are hard for humans to...

AI-powered griefbots, or "deathbots," let users recreate deceased loved ones by training large language models on personal communications. A Chinese content creator, Roro, built a chatbot of her mother that helped her process loss and attracted followers on Xiaohongshu. Companies...

The prospective, population‑based MASAI trial in Sweden screened over 100,000 women using a commercially available AI system alongside radiologists. The AI‑assisted workflow identified more clinically relevant breast cancers and cut interval‑cancer rates without raising false‑positive alerts. Radiologists read AI‑flagged cases...

Researchers led by Ricky J. Sethi have introduced a mathematical framework that gives large language models a metacognitive state vector, enabling them to monitor and regulate their own reasoning. The vector captures five dimensions—emotional awareness, correctness evaluation, experience matching, conflict...

The Bulletin of the Atomic Scientists moved the Doomsday Clock to 85 seconds before midnight, the closest point ever, citing escalating nuclear tensions, stalled climate action, and unregulated AI and synthetic “mirror life.” The report warns that the United States,...

Researchers at Japan's University of Electro‑Communications found that large language model chatbots can spontaneously develop distinct personalities after minimal prompting. By exposing AI to varied conversation topics, the models exhibited social tendencies that aligned with Maslow’s hierarchy of needs, storing...

AI‑generated text is spreading across education, advertising and other sectors, prompting a surge in detection tools. The article outlines three main approaches—learning‑based classifiers, statistical probability tests, and vendor‑provided watermarks—each with distinct trade‑offs. It highlights that detectors quickly become outdated as...

A recent study in the Journal of Creative Behavior argues that AI’s creative capacity is capped at the level of an average human, never reaching professional standards. The research, led by University of South Australia professor David Cropley, applies a...

Google Research has unveiled Project Suncatcher, a study exploring satellite constellations equipped with AI accelerators powered by solar energy as a potential off‑world data‑center solution. The proposal arises as global data‑center electricity use already reached about 415 TWh in 2024, representing...

A recent Royal Society Open Science study shows that AI‑generated faces are indistinguishable from real ones, even for super recognizers, who performed no better than chance. Typical observers performed worse than chance, routinely mistaking fakes for genuine photos. A brief...