NVIDIA unveiled the Nemotron ColEmbed V2 family, a set of late‑interaction multimodal embedding models available in 3B, 4B and 8B sizes. The models achieve state‑of‑the‑art results on the ViDoRe V1‑V3 benchmarks, with the 8B variant ranking #1 on ViDoRe V3 (NDCG@10 63.42). They extend the ColBERT‑style MaxSim interaction to text‑image token pairs, delivering finer semantic matching at the cost of higher storage. The release targets researchers and enterprises building high‑accuracy visual document retrieval and RAG pipelines.
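The ColBERT-style MaxSim interaction mentioned above can be sketched in a few lines of numpy. This is an illustrative sketch of generic late-interaction scoring, not Nemotron ColEmbed's actual implementation; in the multimodal case the "document tokens" would be image-patch embeddings of a page.

```python
import numpy as np

def maxsim_score(query_tokens, doc_tokens):
    # Late-interaction (ColBERT-style) scoring: for each query token
    # embedding, take its maximum cosine similarity over all document
    # token embeddings, then sum over query tokens.
    q = query_tokens / np.linalg.norm(query_tokens, axis=1, keepdims=True)
    d = doc_tokens / np.linalg.norm(doc_tokens, axis=1, keepdims=True)
    sim = q @ d.T                    # (num_q, num_d) pairwise similarities
    return float(sim.max(axis=1).sum())  # MaxSim: best match per query token
```

Because every token embedding must be stored per document, this is what drives the higher storage cost relative to single-vector embeddings.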
Hugging Face launched decentralized evaluation reporting, enabling benchmark datasets to host leaderboards and models to store evaluation scores in .eval_results YAML files. Community members can submit results via pull requests, which appear alongside author scores and are aggregated on dataset...
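As a purely hypothetical illustration of the idea (the field names below are assumptions for illustration, not Hugging Face's actual schema), a community-submitted score recorded in a model's `.eval_results` file might look like:

```yaml
# Hypothetical sketch only — the real .eval_results schema is defined
# by Hugging Face and may use different keys.
- dataset: example-org/some-benchmark
  metric: accuracy
  value: 0.87
  source: community   # e.g. added via pull request, vs. author-reported
```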

H Company unveiled the Holo2‑235B‑A22B Preview, its largest UI‑localization model to date, achieving a new state‑of‑the‑art 78.5% accuracy on the Screenspot‑Pro benchmark and 79.0% on OSWorld G. The model is released on Hugging Face as a research preview focused on UI‑element grounding...
The PRX Part 2 post documents a systematic series of ablations on training a 1.2B‑parameter text‑to‑image diffusion model. Adding representation alignment (REPA) with frozen vision teachers lowered FID by up to three points, while latent‑space alignment (REPA‑E) and the Flux2‑AE tokenizer halved...
NVIDIA released Nemotron‑Personas‑Brazil, an open dataset of six million fully synthetic Brazilian personas grounded in official IBGE census and labor statistics. The collection spans 20 fields and 1.5k occupation categories, and covers every Brazilian state, delivering culturally authentic Portuguese narratives. Built...
NVIDIA announced three new open‑source models—StormScope, Atlas, and HealDA—under its Earth‑2 portfolio, covering nowcasting, medium‑range forecasting, and data assimilation. StormScope delivers kilometer‑scale, zero‑to‑six‑hour storm forecasts that outperform traditional physics models, while Atlas provides high‑accuracy 15‑day global predictions across more than 70...
AssetOpsBench is a new benchmark that evaluates agentic AI in industrial asset‑lifecycle management using 2.3M sensor points, 140+ curated scenarios, 4.2K work orders and 53 structured failure modes. It scores agents across six qualitative dimensions—task completion, retrieval accuracy, result verification,...
The Differential Transformer V2 (DIFF V2) introduces a differential attention operation that doubles query heads while keeping key‑value heads unchanged, eliminating the need for custom attention kernels. By projecting a per‑token, per‑head λ and applying a sigmoid‑scaled subtraction, DIFF V2 removes the...
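The sigmoid-scaled subtraction can be sketched in numpy. This is an illustrative single-head sketch under stated assumptions (the head layout, scaling, and λ projection here are simplifications, not the paper's exact formulation): two query projections attend to a shared key/value set, and the second attention map is subtracted after scaling by a per-token sigmoid(λ).

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def diff_attention(q1, q2, k, v, lam_logit):
    # q1, q2: (T, d) — two query projections (the "doubled" query heads)
    # k, v:   (T, d) — shared key/value projections (unchanged head count)
    # lam_logit: (T,) — per-token lambda logit, squashed by a sigmoid
    d = k.shape[-1]
    a1 = softmax(q1 @ k.T / np.sqrt(d))
    a2 = softmax(q2 @ k.T / np.sqrt(d))
    lam = 1.0 / (1.0 + np.exp(-lam_logit))   # sigmoid-scaled coefficient
    attn = a1 - lam[:, None] * a2            # differential subtraction
    return attn @ v
```

Because the subtraction happens after two standard softmax attention maps are formed, it composes out of ordinary attention primitives, which is why no custom kernel is needed.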

Microsoft Research unveiled OptiMind, a specialized language model that converts natural‑language optimization problems into solver‑ready mathematical formulations. The model is released as an experimental offering on Hugging Face, allowing developers and researchers to test it directly in the platform’s playground. OptiMind...
Open Responses is an open‑source inference standard that extends OpenAI’s Responses API, aiming to replace the legacy Chat Completion format for agentic workloads. It unifies text, image, JSON, and video generation while enabling provider‑side tool execution and autonomous sub‑agent loops...
NVIDIA introduced two compact Llama Nemotron models—an image‑text embedding encoder and a cross‑encoder reranker—tailored for multimodal retrieval over visual documents. Both run on typical NVIDIA GPUs, emit a single dense vector per page, and integrate seamlessly with existing vector databases. Benchmarks...

NVIDIA and Hugging Face have merged NVIDIA Isaac Lab‑Arena with the LeRobot EnvHub, creating an open‑source pipeline for evaluating vision‑language‑action (VLA) robot policies in simulation. The integration gives developers access to pre‑trained GR00T N models, a library of 250+ Lightwheel tasks, and...

The Technology Innovation Institute unveiled Falcon H1R 7B, a decoder‑only 7‑billion‑parameter LLM that rivals much larger reasoning models. Leveraging a two‑stage pipeline of curated supervised fine‑tuning and reinforcement learning with the GRPO algorithm, the model excels on math, code, and general...
Falcon‑H1‑Arabic introduces a family of 3B, 7B and 34B parameter models that merge Mamba state‑space modules with Transformer attention in a hybrid block design. The architecture expands context windows to 128K tokens for the 3B model and 256K tokens for...
At CES 2026 NVIDIA demonstrated how its DGX Spark platform can power a personal AI assistant built on the Reachy Mini robot. Using open‑source Nemotron 3 Nano for reasoning and Nemotron Nano 2 VL for vision, the demo combined NVIDIA’s NeMo Agent Toolkit with ElevenLabs TTS to...
Transformers v5 introduces a major redesign of tokenizers, consolidating each model’s tokenizer into a single, transparent file and exposing the full architecture—normalizer, pre‑tokenizer, model, post‑processor, and decoder—as inspectable properties. The library now defaults to a Rust‑based backend for speed, while...

NVIDIA released the Nemotron 3 Nano 30B A3B model alongside a fully open evaluation recipe built with the NeMo Evaluator library. The blog details how developers can reproduce the model‑card results, inspect structured logs, and run the same benchmarks on any inference endpoint. Published YAML...
The llama.cpp server now includes a router mode that enables dynamic loading, unloading, and switching among multiple LLM models without restarting the service. Models are auto‑discovered from the default cache or a user‑specified directory and are launched in separate processes,...
ServiceNow released Apriel-1.6-15B-Thinker, a 15‑billion‑parameter multimodal reasoning model that rivals the performance of models ten times larger. Built on the Apriel‑1.5 foundation, it boosts text and vision reasoning while cutting reasoning token usage by over 30%. Trained on NVIDIA GB200...

Mattt announced swift‑huggingface, a new Swift package that delivers a full‑featured client for the Hugging Face Hub, slated to replace the existing HubApi in swift‑transformers. The library adds reliable, resumable downloads with progress tracking, a Python‑compatible cache, and a flexible...

DeepMath is a math‑reasoning agent built on the Qwen‑3‑4B Thinking model and fine‑tuned with Group Relative Policy Optimization (GRPO). It replaces verbose chain‑of‑thought text with tiny Python snippets that run in a sandboxed executor, then folds the results back into the...
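The snippet-execution loop can be sketched with the standard library. This is an illustrative stand-in, not DeepMath's actual executor: a subprocess is a minimal proxy for a real sandbox, and `run_snippet` is a hypothetical helper name.

```python
import subprocess
import sys
import textwrap

def run_snippet(code, timeout=5):
    # Run a model-emitted snippet in a separate interpreter (a minimal
    # stand-in for a proper sandbox) and capture stdout, which the agent
    # folds back into its context as the observation.
    proc = subprocess.run(
        [sys.executable, "-c", textwrap.dedent(code)],
        capture_output=True, text=True, timeout=timeout,
    )
    return proc.stdout.strip()

# Instead of spelling out arithmetic in chain-of-thought text, the agent
# emits a tiny program and reads back the result:
result = run_snippet("print(sum(i*i for i in range(1, 11)))")
```

A production executor would additionally restrict imports, filesystem, and network access, which `subprocess` alone does not do.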
Anthropic’s Claude Code now leverages a new Hugging Face Skills plugin to fine‑tune open‑source large language models end‑to‑end. The skill generates training scripts, selects appropriate cloud GPUs, submits jobs to Hugging Face Jobs, monitors progress via Trackio, and pushes the finished model to the Hub....
NVIDIA unveiled Nemotron Content Safety Reasoning, a model that blends dynamic policy reasoning with production‑grade latency. The system lets developers load natural‑language policies at inference time, enabling nuanced content moderation across e‑commerce, telecom, and healthcare use cases. It achieves low...
Today we release the transformers version 5 RC! 🤗 With this, we enable e2e interoperability with our friends in the ecosystem, make it easier to add new models, and simplify the library 🙌🏻 Read our blog to learn more: https://t.co/ysZW3btRgR https://t.co/PVKFRsH9Z2
The SARLO‑80 dataset aggregates roughly 2,500 Umbra synthetic aperture radar (SAR) scenes and aligns them with high‑resolution optical imagery at a uniform 80 cm slant‑range resolution. Each SAR patch (1,024 × 1,024 px) is co‑registered with an optical counterpart and enriched with English natural‑language...
Hugging Face released Transformers v5.0.0rc‑0, the first v5 release candidate, arriving five years after v4’s. Daily pip installs have surged to over 3 million, pushing total installations past 1.2 billion. The library now supports more than 400 model architectures and hosts upwards of 750k checkpoints,...
we've recently shipped profile status 🫡 drop below what you've built on Hugging Face Hub this weekend!
Flux.1-dev has been the second most liked model on Hugging Face just after Deepseek R1 so super excited to see the release of Flux.2-dev by @bfl_ml today! Download the weights or try the model (thanks to @fal) on @huggingface: https://t.co/kdmVlvdLZh Read the...
The post introduces FLUX.2, Black Forest Labs' latest open‑source image generation model, highlighting its new architecture—including a single Mistral Small 3.1 text encoder and a re‑engineered DiT transformer with more single‑stream blocks and bias‑free layers. It offers practical guidance for...

Tavily’s team detailed how they rebuilt their deep‑research AI agent to achieve state‑of‑the‑art performance. By designing a lightweight agent harness, leveraging evolving model tool‑calling abilities, and integrating an advanced search tool, they streamlined orchestration and context handling. Their context‑engineering approach...
OVHcloud is now an official Inference Provider on the Hugging Face Hub, enabling serverless AI model calls directly from model pages. The service offers pay‑per‑token pricing starting at €0.04 per million tokens and runs on secure European data centers for...
The post introduces new multilingual and long‑form tracks on the Open ASR Leaderboard, highlighting recent trends across 60+ models. It finds that Conformer encoders paired with LLM decoders achieve the best English accuracy, while CTC/TDT decoders offer the highest speed,...
The post announces the integration of RapidFire AI with Hugging Face TRL, enabling up to 20× faster fine‑tuning and post‑training experiments by running multiple configurations concurrently on a single GPU through adaptive chunk‑based scheduling. It highlights drop‑in TRL wrappers, real‑time...
BOOM! Olmo 3 has just landed, join us in this livestream to learn more about the release 🤗💗

The post announces AnyLanguageModel, a Swift package that lets Apple developers swap the Foundation Models import for a unified API supporting local (Core ML, MLX, llama.cpp, Ollama) and cloud (OpenAI, Anthropic, Gemini, Hugging Face) LLM providers with minimal code changes. By...

ServiceNow‑AI converted its 15B attention‑based reasoning model into a hybrid Mamba architecture, achieving 2.1× throughput with negligible quality loss. The breakthrough came from distilling on the teacher’s high‑quality SFT reasoning traces rather than generic pretraining data, and using reverse KL...
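The reverse-KL objective mentioned above can be sketched in numpy. This is a generic illustration of the loss, not ServiceNow's implementation; in practice it would be computed per token over the vocabulary, batched, and differentiated by an autodiff framework.

```python
import numpy as np

def reverse_kl(student_logits, teacher_logits):
    # Reverse KL, KL(student || teacher): mode-seeking, so the student
    # concentrates mass on the teacher's high-probability outputs (e.g.
    # its SFT reasoning traces) instead of spreading over all modes,
    # as forward KL tends to do.
    def softmax(x):
        e = np.exp(x - x.max())
        return e / e.sum()
    p_s, p_t = softmax(student_logits), softmax(teacher_logits)
    return float(np.sum(p_s * (np.log(p_s) - np.log(p_t))))
```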
EvE Bio has released the “pharmome map,” the largest public drug‑target interaction dataset to date. It measures 1,397 FDA‑approved small‑molecule drugs against nuclear receptors, GPCRs, and protein kinases using standardized high‑throughput assays. The dataset is openly accessible, refreshed bi‑monthly, and...
The post explains how to build, test, and share ROCm‑compatible GPU kernels—specifically a high‑performance FP8 GEMM kernel—using Hugging Face’s kernel‑builder and kernels libraries, with a focus on reproducible builds via a flake.nix environment. It walks through project layout, configuration files...

This might be the biggest AI hackathon ever: * >6,300 registrants * Runs for 2 weeks (Nov. 14-30) * Open to anyone, anywhere virtually * $20,000 in cash prizes + $3.5M+ in sponsor credits Hosted by @Anthropic and @Gradio, along with 10 sponsors, join...

AMD, Hugging Face, and Data Monsters are launching the AMD Open Robotics Hackathon with in‑person events in Tokyo (December 5‑7, 2025) and Paris (December 12‑14, 2025). Teams of up to four participants will complete two missions—setting up the LeRobot development environment and creating a...
The post announces a deeper partnership between Hugging Face and Google Cloud aimed at making it easier for companies to build and customize AI using open models. It highlights integrated services such as Vertex AI Model Garden, GKE, Cloud Run,...
Best practices for testing your Gradio app 👇
NVIDIA released Isaac for Healthcare v0.4 with an end‑to‑end SO‑ARM starter workflow that takes developers from mixed simulation and real‑world data collection through fine‑tuning and real‑time deployment of surgical assistant robots. The pipeline fine‑tunes GR00T N1.5 on predominantly synthetic data...
Hugging Face researchers propose a “voice consent gate” to allow voice cloning only after an explicit, context‑specific spoken consent, and provide a demo and modular code to demonstrate the approach. The system combines autogenerated consent sentences, automatic speech recognition to...
Hugging Face has released huggingface_hub v1.0 after five years of development, positioning the library as the mature Python backbone for its Hub and the broader ML ecosystem. The package now powers roughly 200,000 dependent libraries and provides access to more...

Hugging Face announced major backend improvements to its datasets and huggingface_hub libraries that make streaming multi‑TB training data far more efficient and reliable using the same load_dataset(streaming=True) API. Changes—including a persistent data files cache, optimized resolution logic, Parquet prefetching and...
Hugging Face released LeRobot v0.4.0, a major upgrade to its open‑source robotics stack that introduces LeRobotDataset v3.0 (chunked episodes and streaming to handle OXE‑scale datasets >400 GB), new VLA models (PI0.5 and GR00T N1.5), and a plugin system for easier...
Meta and Hugging Face launched the OpenEnv Hub, an open community repository and 0.1 RFC standard for “agentic environments” that package tools, APIs, credentials and execution context into secure, sandboxed interfaces for training and deployment. The Hub—seeded with initial environments...
Hugging Face announced it is formally taking stewardship of the popular open‑source Sentence Transformers library — maintained by Tom Aarsen since 2023 — transitioning the project from TU Darmstadt’s UKP Lab to Hugging Face while retaining its Apache 2.0 license...
Hugging Face has partnered with VirusTotal to continuously scan all 2.2M+ public model and dataset repositories on the Hugging Face Hub, checking file hashes against VirusTotal’s threat‑intelligence database to surface prior detections and related metadata. The integration retrieves status (clean...