Show HN: Needle: We Distilled Gemini Tool Calling Into a 26M Model
Needle is an open‑source 26‑million‑parameter Simple Attention Network that replicates Gemini 3.1's tool‑calling capabilities. The model was distilled from Gemini 3.1 using 200 billion tokens on 16 TPU v6e cores in 27 hours, then fine‑tuned on a 2‑billion‑token function‑call dataset in 45 minutes. It runs locally on consumer hardware at 6,000 tokens per second prefill and 1,200 tokens per second decode, and its weights and data generation pipeline are publicly available. Benchmarks show Needle outperforms larger models such as FunctionGemma‑270M and Qwen‑0.6B on single‑shot function‑call tasks, though larger models remain stronger in open‑ended conversation.
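As a rough sanity check, the training figures quoted above imply an aggregate throughput of about two million tokens per second during distillation. The sketch below only does that arithmetic. It assumes the quoted durations are wall-clock time and that work is split evenly across the 16 cores, neither of which the summary states; the function names are made up for illustration.

```python
# Back-of-the-envelope throughput from the figures quoted in the summary.
# Assumptions (not from the source): durations are wall-clock time and the
# load is split evenly across cores.

def tokens_per_second(total_tokens: float, hours: float) -> float:
    """Aggregate throughput in tokens per second."""
    return total_tokens / (hours * 3600)


def per_core(throughput: float, cores: int) -> float:
    """Even split of the aggregate throughput across cores."""
    return throughput / cores


distill_tps = tokens_per_second(200e9, 27)       # 200B tokens in 27 hours
finetune_tps = tokens_per_second(2e9, 45 / 60)   # 2B tokens in 45 minutes

print(f"distillation: {distill_tps:,.0f} tok/s total, "
      f"{per_core(distill_tps, 16):,.0f} tok/s per core")
print(f"fine-tuning:  {finetune_tps:,.0f} tok/s total")
```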
Hardware Attestation as Monopoly Enabler
Apple and Google are progressively extending hardware‑based attestation, embedding it in services such as Play Integrity and App Attest. The APIs now require certified devices, effectively barring alternative operating systems like GrapheneOS and limiting competition. Governments, especially in the EU...
Internet Archive Switzerland
The Internet Archive has opened a new Swiss non‑profit foundation, Internet Archive Switzerland, headquartered in St. Gallen. The branch will operate independently, focusing on rescuing endangered archives and beginning a generative‑AI model archive in partnership with the University of St. Gallen. Its...
EU Calls VPNs "a Loophole that Needs Closing" In Age Verification Push
The European Parliamentary Research Service warned that virtual private networks are being used to sidestep newly‑mandated online age‑verification systems, labeling VPNs a regulatory loophole. The report notes a sharp rise in VPN app downloads in the UK and other jurisdictions...
Guitar Tuner that Uses Phone Accelerometer
A new guitar‑tuning app turns a smartphone into a vibration sensor by pressing the device against the guitar body and reading raw accelerometer data. The app calculates the combined magnitude |a| and selects the strongest axis, applying alias‑correction to extract...
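The first two steps described above, the combined magnitude and the choice of a strongest axis, can be sketched roughly as follows. This is not the app's code: the summary does not say how "strongest" is defined, so the sketch assumes it means the axis with the largest variance (which ignores a constant gravity offset), and the alias-correction step is left out entirely.

```python
import math
from statistics import pvariance

Sample = tuple[float, float, float]  # one accelerometer reading: (ax, ay, az)


def magnitude(sample: Sample) -> float:
    """Combined magnitude |a| = sqrt(ax^2 + ay^2 + az^2)."""
    ax, ay, az = sample
    return math.sqrt(ax * ax + ay * ay + az * az)


def strongest_axis(samples: list[Sample]) -> int:
    """Index (0=x, 1=y, 2=z) of the axis with the largest variance.

    Assumption: "strongest" means most variation over the window.
    """
    variances = [pvariance([s[i] for s in samples]) for i in range(3)]
    return variances.index(max(variances))


# Synthetic window: vibration along y, constant gravity along z.
window = [(0.0, 0.5 * math.sin(2 * math.pi * k / 8), 9.81) for k in range(64)]
print(strongest_axis(window), magnitude((3.0, 4.0, 0.0)))
```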
Maybe You Shouldn't Install New Software for a Bit
Recent disclosures have added two serious Linux kernel flaws—Copy Fail 2 and Dirty Frag—to an already crowded vulnerability landscape. Both exploits target low‑level kernel code, raising the likelihood of privilege‑escalation attacks. Security analysts warn that the timing is ripe for a...
Natural Language Autoencoders: Turning Claude's Thoughts Into Text
Anthropic unveiled Natural Language Autoencoders (NLAs), a technique that converts a language model’s hidden activations into human‑readable text. By training an activation verbalizer and a reconstructor in a round‑trip fashion, NLAs generate explanations that can be checked for fidelity via...
Agents Need Control Flow, Not More Prompts
The article argues that reliable AI agents tackling complex tasks require deterministic control flow encoded in software rather than increasingly elaborate prompt chains. Prompt engineering becomes non‑deterministic and hard to verify as tasks grow, leading to hallucinations and silent failures....
DeepSeek 4 Flash Local Inference Engine for Metal
The ds4.c project delivers a native Metal‑only inference engine tailored for DeepSeek V4 Flash, bypassing generic GGUF runners and external runtimes. It leverages a custom graph executor, KV‑state handling, and a server API to achieve high throughput, up to 84 tokens/second prefill...
Idempotency Is Easy Until the Second Request Is Different
The piece argues that true idempotency goes far beyond a simple replay cache; the real difficulty lies in handling a second request that arrives while the first is still processing or carries different data. It outlines scenarios such as concurrent...
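The two hard cases named above (a repeat that arrives while the first request is still running, and a repeat that carries different data) can be illustrated with a small in-memory sketch. Everything here is hypothetical: `IdempotencyStore`, its status strings, and the fingerprinting scheme are not from the article. The store is also single-process and not thread-safe, whereas a real service would need an atomic insert in shared storage.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Any


@dataclass
class Record:
    fingerprint: str      # hash of the first request's payload
    done: bool = False    # False while the first request is still processing
    response: Any = None


class IdempotencyStore:
    """Hypothetical in-memory store keyed by idempotency key."""

    def __init__(self) -> None:
        self._records: dict[str, Record] = {}

    @staticmethod
    def fingerprint(payload: dict) -> str:
        # Canonical JSON so that key order does not change the hash.
        canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode()).hexdigest()

    def begin(self, key: str, payload: dict) -> tuple[str, Any]:
        """Return (status, response); status is 'new', 'in_progress',
        'conflict' or 'replay'."""
        fp = self.fingerprint(payload)
        rec = self._records.get(key)
        if rec is None:
            self._records[key] = Record(fp)
            return "new", None
        if rec.fingerprint != fp:
            return "conflict", None      # same key, different data
        if not rec.done:
            return "in_progress", None   # first request has not finished
        return "replay", rec.response    # safe to return the stored result

    def complete(self, key: str, response: Any) -> None:
        rec = self._records[key]
        rec.done = True
        rec.response = response


store = IdempotencyStore()
print(store.begin("key-1", {"amount": 10}))
print(store.begin("key-1", {"amount": 10}))
print(store.begin("key-1", {"amount": 99}))
store.complete("key-1", {"charge_id": "c_1"})
print(store.begin("key-1", {"amount": 10}))
```

Only the last call is the simple replay-cache case; the second and third calls are the ones a plain cache would get wrong.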
Mythical Man Month
In the early 1960s Fred Brooks oversaw IBM’s System/360 development and later codified his insights in the 1975 classic *The Mythical Man‑Month*. The book introduced Brooks’s law—adding manpower to a late software project makes it later—highlighting the exponential growth of...
Cat (YC S22) Seeks Fractional Engineer to Build AI-Native Growth Toolkit
Cat, a Y Combinator S22 graduate that positions itself as the world’s AI‑native insurance broker, is hiring a fractional engineer to build a growth‑focused toolkit. The role spans designing AI‑driven growth tools, orchestrating on‑the‑ground canvassing, and handling operational hurdles to...
Reverse-Engineering the 1998 Ultima Online Demo Server
After a decade of intermittent work, developer draxinar released a complete reverse‑engineering of the 1998 Ultima Online demo server on GitHub, translating roughly 5,000 disassembled MSVC x86 functions into portable C99. The project reproduces the original server logic, fixes stability...
StarFighter 16-Inch
The StarFighter is a 16‑inch Linux‑focused laptop that blends premium materials with high‑end hardware, offering Intel Core Ultra or AMD Ryzen 9 CPUs and up to 64 GB of LPDDR5X memory. It features a 4K 120 Hz IPS display, a removable magnetic webcam,...
CVE-2026-31431: Copy Fail Vs. Rootless Containers
The article dissects CVE‑2026‑31431, a kernel privilege‑escalation bug dubbed “Copy Fail,” which corrupts the page‑cache of /usr/bin/su to execute a tiny ELF payload that calls setuid(0) and execve("/bin/sh"). The author reproduces the exploit on a vulnerable Fedora 43 VM (kernel 6.17.1)...
Pulitzer Prize Winner in International Reporting
The Associated Press team of Dake Kang, Garance Burke, Byron Tau, Aniruddha Ghosal and Yael Grauer won the 2026 Pulitzer Prize for International Reporting. Their series, published throughout 2025, documents how Silicon Valley technology enabled mass detention, surveillance, and AI‑driven...
How OpenAI Delivers Low-Latency Voice AI at Scale
OpenAI re‑engineered its real‑time voice AI infrastructure by separating the WebRTC stack into a lightweight relay and a stateful transceiver. The relay uses the ICE username fragment to route media to the owning transceiver while keeping a minimal public UDP...
The Thinking Plant's Man (2025)
In August 1926 Jagadish Chandra Bose demonstrated electrical activity in snapdragon stems, presenting a plant "heartbeat" to a packed audience of British scientists. His elaborate apparatus recorded sap and electrical responses to stimulants, arguing that plants possess a nervous system comparable to...
Does Employment Slow Cognitive Decline? Evidence From Labor Market Shocks
A new working paper provides causal evidence that local labor‑demand shocks reduce cognitive performance among older men. Using a Bartik instrument on Health and Retirement Study data, the authors find that negative employment shocks lower cognitive scores for men aged...
The Bottleneck Was Never the Code
At .txt the author ran an experiment comparing structured‑generation AI algorithms with open‑source counterparts, showing that a coding agent can produce a working prototype in a few hours. The result highlights that the real bottleneck has moved from writing code...
CARA 2.0 – “I Built a Better Robot Dog”
CARA 2.0 is a revamped quadrupedal robot that targets hobbyists and researchers with a sub‑$1,500 price tag, roughly half the cost of its predecessor. By redesigning the actuator stack, rewinding low‑cost TYI motors, and using inexpensive XDrive controllers, the team cut...
DeepClaude – Claude Code Agent Loop with DeepSeek V4 Pro, 17x Cheaper
Claude Code, Anthropic’s premier autonomous coding agent, traditionally costs $200 per month with usage caps. The new open‑source tool deepclaude swaps the underlying model for DeepSeek V4 Pro, which charges $0.44 per million input tokens and $0.87 per million output tokens—roughly...
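To see what the quoted per-token prices mean for a given workload, a small calculator is enough. The two prices come from the summary; the monthly token counts below are purely hypothetical and should be replaced with real usage.

```python
# Prices quoted in the summary, in USD per million tokens.
INPUT_USD_PER_M = 0.44
OUTPUT_USD_PER_M = 0.87


def api_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Pay-per-token cost for one workload at the quoted prices."""
    return (input_tokens / 1e6) * INPUT_USD_PER_M \
        + (output_tokens / 1e6) * OUTPUT_USD_PER_M


# Hypothetical month: 100M input tokens and 10M output tokens.
monthly = api_cost_usd(100_000_000, 10_000_000)
print(f"${monthly:.2f}")
```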
Kimi K2.6 Just Beat Claude, GPT-5.5, and Gemini in a Coding Challenge
In the 12th AI Coding Contest, Moonshot AI’s open‑weights Kimi K2.6 topped the Word Gem Puzzle, earning 22 match points with a 7‑1‑0 record. Xiaomi’s MiMo V2‑Pro placed second, while OpenAI’s GPT‑5.5 took third and Anthropic’s Claude Opus 4.7 fell to...
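The 22 match points are consistent with the 7‑1‑0 record if that record is read as wins, draws, losses and a win is worth 3 points and a draw 1. Both the ordering and the scoring rule are assumptions here, since the summary does not state them.

```python
def match_points(wins: int, draws: int, losses: int,
                 win_pts: int = 3, draw_pts: int = 1) -> int:
    """Total match points under an assumed 3/1/0 scoring rule."""
    return wins * win_pts + draws * draw_pts  # losses score nothing


print(match_points(7, 1, 0))
```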
Clandestine Network Smuggling Starlink Tech Into Iran to Beat Internet Blackout
A covert network is smuggling Starlink satellite‑internet terminals into Iran to bypass the government’s prolonged internet blackout, which began after the Feb. 28 airstrikes. The operation, funded by Iranians abroad, has already moved at least a dozen devices since January, with...
Neanderthals Ran 'Fat Factories' 125,000 Years Ago (2025)
Archaeologists studying the Neumark‑Nord 2 site in central Germany have uncovered evidence that Neanderthals, 125,000 years ago, deliberately crushed the bones of at least 172 large mammals to extract calorie‑dense bone grease. The process involved heating fragmented bones in water, creating a...
An Open Letter Asking NHS England to Keep Its Code Open
An open letter signed by nine tech and health professionals urges NHS England to reverse its recent decision to hide the source code of all its repositories. The signatories argue that open‑source development enforces higher quality, proactive security, and resilience,...
Ask HN: Who Is Hiring? (May 2026)
Today's Hacker News “Who is hiring?” thread showcases a wave of tech companies posting senior‑level openings, many centered on AI, Kubernetes, and formal verification. Roles range from full‑stack and systems engineers to specialized positions in biotech, fintech, and secure remote...
GhostBox – Disposable Little Machines From the Global Free Tier
GhostBox is a CLI‑driven service that spins up short‑lived Ubuntu VMs from the Global Free Tier, delivering SSH access, Cloudflare tunnels, Tor backups, and public preview URLs with a default 89‑minute time‑to‑live. Users can launch a machine with a single...
Running Adobe's 1991 PostScript Interpreter in the Browser
A hobbyist project called retro‑ps extracts the 1991 Adobe PostScript interpreter from HP's C2089A cartridge and runs it in a browser via a Motorola 68000 emulator. The ROM, originally designed for LaserJet II/III printers, renders .ps files client‑side without any server involvement....
Discovering Hard Disk Physical Geometry Through Microbenchmarking (2019)
The 2019 study by Henry Wong demonstrates how microbenchmarks can uncover the physical geometry of modern hard‑disk drives, from rotation speed and sector angles to track boundaries and surface count. By timing precise read sequences, the author measured RPM, seek...
Tar Files Created on macOS Display Errors When Extracting on Linux (2024)
Developers who create tar.gz archives on macOS often encounter duplicate "._" files and extended‑attribute warnings when extracting them on Linux servers. The BSD‑based tar on macOS automatically embeds Apple‑specific xattr metadata, which Linux's GNU tar cannot interpret. Adding the "--no-xattrs"...
Where the Goblins Came From
Starting with GPT‑5.1, OpenAI observed a sharp rise in whimsical creature metaphors, especially goblins and gremlins, within model outputs. The spike was traced back to the “Nerdy” personality, whose reward signal unintentionally favored such language, causing a 175% increase in goblin mentions and...
Alignment Whack-a-Mole: Finetuning Activates Recall of Copyrighted Books in LLMs
A new arXiv paper reveals that finetuning large language models on copyrighted book excerpts causes the models to reproduce verbatim passages, sometimes spanning hundreds of words. The authors provide an open‑source pipeline for preprocessing EPUBs, finetuning via OpenAI, Vertex AI and...
The Zig Project's Rationale for Their Firm Anti-AI Contribution Policy
Zig Software Foundation has instituted a strict ban on any LLM‑generated content for issues, pull requests, and comments, emphasizing human‑centric contribution growth. The policy, dubbed “contributor poker,” treats the contributor’s potential as the bet rather than the code’s polish. Bun,...
CJIT: C, Just in Time
CJIT (C, Just in Time) is a sub‑2 MB runtime that lets developers execute C programs instantly on Windows, macOS and Linux. It eliminates the need for a full compiler, IDE, or licensing agreements by compiling source code on the fly....
A Playable DOOM MCP App
Developer Chris Nager released a minimal Model Context Protocol (MCP) app that lets users launch the classic first‑person shooter DOOM directly inside AI chat clients such as ChatGPT and Claude, falling back to a standard web URL when inline rendering...
Waymo in Portland
Waymo announced the start of manual‑driving operations in Portland, marking the first step toward a full autonomous ride‑hail service. The company is collaborating with city officials, state regulators, and community partners to forge a regulatory pathway. By manually piloting its...
Claude.ai Unavailable and Elevated Errors on the API
Anthropic confirmed a service outage affecting Claude.ai, Claude Console, the API, Claude Code, Claude Cowork, and Claude for Government from 17:34 to 18:52 UTC on April 28, 2026. The incident triggered elevated authentication errors and prevented users from logging in. Anthropic identified the...
Noctua Releases Official 3D CAD Models for Its Cooling Fans
Cooling‑fan specialist Noctua has made official 3D CAD models of its entire fan portfolio available for download on its website. The models faithfully reproduce external dimensions and mounting points, while internal impeller geometry is slightly altered to protect IP. Noctua...
Patch Applies Fake Diffs From Commit Messages
GitHub’s .patch export includes any diff‑shaped text found in a commit message, not just the actual changes. When fed to GNU patch, this embedded “phantom” diff is applied as a real change, creating files that never existed in the commit. The...
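One defensive measure this suggests is checking commit messages for diff-shaped text before piping an exported patch into a patch tool. The sketch below is a simple heuristic, not GNU patch's actual parser: it flags a message when a `---` line, a `+++` line and a unified hunk header appear on three consecutive lines. The function name and the example message are hypothetical.

```python
import re

_HEADER_OLD = re.compile(r"^--- \S")
_HEADER_NEW = re.compile(r"^\+\+\+ \S")
_HUNK = re.compile(r"^@@ -\d+(,\d+)? \+\d+(,\d+)? @@")


def looks_like_diff(message: str) -> bool:
    """True if the message contains ---/+++ headers followed by a hunk header."""
    lines = message.splitlines()
    for i in range(len(lines) - 2):
        if (_HEADER_OLD.match(lines[i])
                and _HEADER_NEW.match(lines[i + 1])
                and _HUNK.match(lines[i + 2])):
            return True
    return False


# Hypothetical commit message that embeds a "phantom" diff.
suspicious = "\n".join([
    "Fix typo in README",
    "",
    "--- /dev/null",
    "+++ b/phantom.txt",
    "@@ -0,0 +1 @@",
    "+hello",
])
print(looks_like_diff(suspicious))
print(looks_like_diff("Fix typo in README\n\n--- see issue tracker"))
```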
TurboQuant: A First-Principles Walkthrough
TurboQuant introduces a metadata‑free vector compression technique that quantizes each coordinate to 2–4 bits while preserving accuracy. By applying a single random rotation to high‑dimensional embeddings, the method creates a universal marginal distribution, allowing a pre‑computed Lloyd‑Max codebook to be...
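The codebook half of this idea can be sketched with a plain one-dimensional Lloyd iteration, which alternates between assigning samples to the nearest level and moving each level to the mean of its cell. This is a generic illustration, not TurboQuant's implementation: it skips the rotation step and simply assumes the rotated coordinates look roughly Gaussian, using standard normal samples as a stand-in. The function names are made up.

```python
import bisect
import random


def thresholds(codebook: list[float]) -> list[float]:
    """Decision boundaries: midpoints between neighbouring levels."""
    return [(a + b) / 2 for a, b in zip(codebook, codebook[1:])]


def lloyd_codebook(samples: list[float], levels: int,
                   iterations: int = 30) -> list[float]:
    """1-D Lloyd iteration on empirical samples; returns sorted levels."""
    ordered = sorted(samples)
    n = len(ordered)
    # Start from evenly spaced quantiles of the data.
    codebook = [ordered[(2 * i + 1) * n // (2 * levels)] for i in range(levels)]
    for _ in range(iterations):
        cuts = thresholds(codebook)
        sums = [0.0] * levels
        counts = [0] * levels
        for x in ordered:
            j = bisect.bisect_left(cuts, x)   # nearest-level cell
            sums[j] += x
            counts[j] += 1
        # Move each level to the centroid of its cell (keep empty cells as-is).
        codebook = [sums[j] / counts[j] if counts[j] else codebook[j]
                    for j in range(levels)]
    return codebook


def encode(codebook: list[float], x: float) -> int:
    """Index of the nearest level (2 bits when there are 4 levels)."""
    return bisect.bisect_left(thresholds(codebook), x)


def decode(codebook: list[float], index: int) -> float:
    return codebook[index]


rng = random.Random(0)
samples = [rng.gauss(0.0, 1.0) for _ in range(10_000)]
codebook = lloyd_codebook(samples, levels=4)   # a 2-bit codebook
print([round(c, 2) for c in codebook])
```

For a standard normal source the optimal 2-bit levels are known to sit near ±0.45 and ±1.51, so the printed values should land close to those. Because the codebook depends only on the assumed distribution and not on the data being compressed, it can be computed once and reused.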
When the Cheap One Is the Cool One
Apple’s new MacBook Neo, priced around $500 for education buyers, is selling faster than the flagship Air and Pro, attracting both first‑time Mac users and existing owners. The laptop reuses an older iPhone chip, trims premium features, and adds fresh colors,...
Three Constraints Before I Build Anything
The author outlines three disciplined constraints to apply before any product build: a one‑page brief that caps complexity, a separable core technology that creates reusable IP, and a single defining constraint that gives the product a clear identity. Each rule...
Using Coding Assistance Tools to Revive Projects You Never Were Going to Finish
The author used Claude Code, an AI coding assistant, to rebuild a personal FastAPI shim that exposes YouTube Music through the OpenSubsonic API. Starting from a minimal repository, Claude generated stubbed endpoints, added search via ytmusicapi and streaming via yt‑dlp,...
An Update on Recent Claude Code Quality Reports
Anthropic identified three separate changes that temporarily degraded Claude Code's performance and fixed them by April 20. The first was a March 4 switch of the default reasoning effort from high to medium, which reduced intelligence and was reverted on April 7. A...
Palantir Employees Are Starting to Wonder if They're the Bad Guys
Palantir’s software is now a core component of the Trump administration’s immigration‑enforcement operations, prompting a wave of employee unease. After the killing of nurse Alex Pretti, staff flooded internal Slack channels demanding clarity on the company’s ICE contract and audit‑log controls....
Bitwarden CLI Compromised in Ongoing Checkmarx Supply Chain Campaign
Socket Research uncovered a coordinated supply‑chain campaign affecting multiple development ecosystems. Malicious artifacts were found in the official Checkmarx KICS Docker repository, while Namastex.ai npm packages were infected with a CanisterWorm‑style payload. In parallel, 108 Chrome extensions were linked to...
UK Biobank Health Data Keeps Ending up on GitHub
UK Biobank has been using copyright takedown notices to remove health‑related data from GitHub, filing 110 requests since July 2025. The notices mainly target specific files such as Jupyter/R notebooks, genomic datasets, and CSV tables, rather than whole repositories. Developers...
Nobody Got Fired for Uber's $8M Ledger Mistake?
Uber built its core ledger on DynamoDB in 2017, overlooking the database’s consumption‑based pricing and limited global consistency. The design generated roughly $5 million in write costs and $3 million in storage, totaling an $8 million bill over two years. By 2020 Uber...
XOR'ing a Register with Itself Is the Idiom for Zeroing It Out. Why Not Sub?
Zeroing a register on x86 is most efficiently done with the xor r,r idiom because, unlike mov r,0, it avoids encoding a four‑byte immediate. Although sub r,r produces the same result and leaves the flags in a fully defined state, xor r,r became the dominant pattern after early compilers adopted it....
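The size argument is easy to check against the machine code. The table below lists standard 32-bit encodings for the EAX forms of the three instructions; it is only an illustration of instruction length and says nothing about how any particular CPU executes them.

```python
# Standard 32-bit x86 encodings for zeroing EAX.
ENCODINGS = {
    "xor eax, eax": bytes([0x31, 0xC0]),                    # opcode + ModRM
    "sub eax, eax": bytes([0x29, 0xC0]),                    # opcode + ModRM
    "mov eax, 0":   bytes([0xB8, 0x00, 0x00, 0x00, 0x00]),  # opcode + imm32
}

for asm, code in ENCODINGS.items():
    print(f"{asm:<14} {code.hex(' '):<16} {len(code)} bytes")
```

The xor and sub forms are the same length, so size alone separates both of them from mov but does not explain the preference for xor over sub.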