
We have been shipping 🛳️❤️
📦 Community Evals & Benchmark Datasets: benchmark datasets now host leaderboards. You can contribute eval results by opening a PR to a model repository, and all PRs feed into the benchmark datasets
📦 Chat with datasets: agents now live in Data Studio, so you can ask questions about datasets
📦 Select sections in datasets: Data Studio now has a spreadsheet-like UX that allows quick selections
📦 MLX compatibility: find hardware compatible with MLX models and their quantized versions in model repositories
📦 You can now save blog drafts and access them from the editor 📖
📦 Datasets now support LanceDB format
📦 Model repositories show snippets for SGLang
SyGra 2.0.0 launches Studio, a visual IDE for building synthetic data generation workflows. The canvas lets users configure models, data sources, and prompts via drag‑and‑drop, automatically generating the underlying YAML/JSON graph. Studio provides live execution monitoring, token‑cost tracking, and inline...
NVIDIA unveiled the Nemotron ColEmbed V2 family, a set of late‑interaction multimodal embedding models available in 3B, 4B and 8B sizes. The models achieve state‑of‑the‑art results on the ViDoRe V1‑V3 benchmarks, with the 8B variant ranking #1 on ViDoRe V3 (NDCG@10 63.42). They extend the...
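For readers unfamiliar with the term, "late interaction" usually means that a query and a document are each encoded as a set of vectors and compared token by token at search time. The sketch below shows the common MaxSim scoring rule on toy data. It is an illustration of the general technique only, not NVIDIA's implementation; the function names, the 2-dimensional vectors, and the page labels are all hypothetical.

```python
from typing import Dict, List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def maxsim_score(query_vecs: Sequence[Vector], doc_vecs: Sequence[Vector]) -> float:
    """MaxSim: for each query vector take its best match among the
    document vectors, then sum those best-match similarities."""
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)


def rank(query_vecs: Sequence[Vector], docs: Dict[str, Sequence[Vector]]) -> List[str]:
    """Return document names ordered from highest to lowest MaxSim score."""
    scores = {name: maxsim_score(query_vecs, vecs) for name, vecs in docs.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Toy data (assumed for illustration): two query vectors, two "pages".
query = [[1.0, 0.0], [0.0, 1.0]]
docs = {
    "page_a": [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]],
    "page_b": [[1.0, 0.0], [0.7, 0.1]],
}

if __name__ == "__main__":
    for name in rank(query, docs):
        print(name, round(maxsim_score(query, docs[name]), 3))
```

Because each document keeps one vector per token or image patch rather than a single pooled vector, this style of scoring trades larger index size for finer-grained matching.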
Hugging Face launched decentralized evaluation reporting, enabling benchmark datasets to host leaderboards and models to store evaluation scores in .eval_results YAML files. Community members can submit results via pull requests, which appear alongside author scores and are aggregated on dataset...

H Company unveiled the Holo2‑235B‑A22B Preview, its largest UI‑localization model to date, achieving a new state‑of‑the‑art 78.5% accuracy on the ScreenSpot‑Pro benchmark and 79.0% on OSWorld‑G. The model is released on Hugging Face as a research preview focused on UI‑element grounding....
The PRX Part 2 post documents a systematic series of ablations on training a 1.2B‑parameter text‑to‑image diffusion model. Adding representation alignment (REPA) with frozen vision teachers lowered FID by up to three points, while latent‑space alignment (REPA‑E) and the Flux2‑AE tokenizer halved...
NVIDIA released Nemotron‑Personas‑Brazil, an open dataset of six million fully synthetic Brazilian personas grounded in official IBGE census and labor statistics. The collection spans 20 fields and 1.5k occupation categories, and covers every Brazilian state, delivering culturally authentic Portuguese narratives. Built...
NVIDIA announced three new open‑source models (StormScope, Atlas, and HealDA) under its Earth‑2 portfolio, covering nowcasting, medium‑range forecasting, and data assimilation. StormScope delivers kilometer‑scale, zero‑to‑six‑hour storm forecasts that outperform traditional physics models, while Atlas provides high‑accuracy 15‑day global predictions across more than 70...