
Hugging Face Releases TRL v1.0: A Unified Post-Training Stack for SFT, Reward Modeling, DPO, and GRPO Workflows
Hugging Face announced TRL v1.0, turning its reinforcement‑learning library into a production‑ready stack for large‑language‑model post‑training. The release bundles Supervised Fine‑Tuning, Reward Modeling and alignment into a single, config‑driven workflow accessed via a new command‑line interface. It adds support for popular efficiency techniques such as LoRA, QLoRA and Unsloth, and introduces a stable core alongside an experimental namespace for emerging methods like ORPO. The library now scales from a single GPU to multi‑node clusters through Hugging Face Accelerate.

Liquid AI Released LFM2.5-350M: A Compact 350M Parameter Model Trained on 28T Tokens with Scaled Reinforcement Learning
Liquid AI unveiled LFM2.5-350M, a 350‑million‑parameter model trained on 28 trillion tokens using scaled reinforcement learning. The hybrid architecture combines Linear Input‑Varying (LIV) convolution blocks with a handful of Grouped Query Attention layers, enabling a 32k context window while keeping memory...

How to Build and Evolve a Custom OpenAI Agent with A-Evolve Using Benchmarks, Skills, Memory, and Workspace Mutations
The tutorial walks through building a custom OpenAI‑powered agent using the open‑source A‑Evolve framework in Google Colab. It shows how to set up the repository, define a strict system prompt, create a miniature benchmark of text‑transformation tasks, and implement a mutation...

Microsoft AI Releases Harrier-OSS-V1: A New Family of Multilingual Embedding Models Hitting SOTA on Multilingual MTEB V2
Microsoft released Harrier-OSS-v1, a trio of multilingual embedding models ranging from 270 million to 27 billion parameters. The models use decoder‑only architectures with last‑token pooling and achieve state‑of‑the‑art results on the Multilingual MTEB v2 benchmark. They support a 32,768‑token context window, enabling...

Salesforce AI Research Releases VoiceAgentRAG: A Dual-Agent Memory Router that Cuts Voice RAG Retrieval Latency by 316x
Salesforce AI Research unveiled VoiceAgentRAG, an open‑source dual‑agent architecture that separates retrieval from generation for voice assistants. The Fast Talker foreground agent consults a semantic in‑memory FAISS cache with ~0.35 ms latency, while the Slow Thinker background agent predicts upcoming topics...
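The fast-path/slow-path split can be sketched generically. The class and helper below are illustrative stand-ins, not VoiceAgentRAG's actual API; its real foreground cache is a semantic FAISS index, approximated here with exact-match hashing so the sketch stays dependency-free:

```python
import hashlib

class FastTalkerCache:
    """Illustrative in-memory answer cache keyed by normalized query text.
    (A simplification of VoiceAgentRAG's semantic cache: exact-match
    hashing instead of nearest-neighbor embedding lookup.)"""
    def __init__(self):
        self._store = {}

    def _key(self, query: str) -> str:
        return hashlib.sha256(query.strip().lower().encode()).hexdigest()

    def get(self, query):
        return self._store.get(self._key(query))

    def put(self, query, answer):
        self._store[self._key(query)] = answer

def answer(query, cache, slow_retriever):
    """Serve from the foreground cache when possible; otherwise fall back
    to the slow retrieval path and warm the cache for next time."""
    hit = cache.get(query)
    if hit is not None:
        return hit, "fast"
    result = slow_retriever(query)
    cache.put(query, result)
    return result, "slow"

cache = FastTalkerCache()
retriever = lambda q: f"retrieved answer for: {q}"
print(answer("what is my order status?", cache, retriever))  # slow path, cache miss
print(answer("what is my order status?", cache, retriever))  # fast path, cache hit
```

The point of the split is that the foreground agent only ever does a cache lookup, so voice latency stays flat regardless of how expensive background retrieval is.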

Agent-Infra Releases AIO Sandbox: An All-in-One Runtime for AI Agents with Browser, Shell, Shared Filesystem, and MCP
Agent-Infra unveiled the open‑source AIO Sandbox, a unified container that bundles a Chromium browser, Bash shell, Python and Node runtimes, plus VSCode Server and Jupyter notebooks. The platform introduces a shared filesystem that instantly propagates files between tools, eliminating the...

Meet A-Evolve: The PyTorch Moment For Agentic AI Systems Replacing Manual Tuning With Automated State Mutation And Self-Correction
Amazon researchers unveiled A‑Evolve, an open‑source framework that automates the creation and refinement of autonomous AI agents. By treating an agent as a mutable file‑based workspace, the system replaces manual prompt‑tuning with a five‑stage evolution loop—Solve, Observe, Evolve, Gate, Reload—backed...
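The five-stage loop can be sketched as a generic mutate-and-gate search over agent state. Everything below (function names, the toy benchmark) is illustrative, not A-Evolve's real interface:

```python
import random

def evolve_agent(workspace, benchmark, mutate, steps=5, seed=0):
    """Generic sketch of a Solve -> Observe -> Evolve -> Gate -> Reload loop.
    `workspace` is any mutable agent state, `benchmark` scores it, and
    `mutate` proposes a changed copy. Illustrative names, not A-Evolve's API."""
    rng = random.Random(seed)
    best_score = benchmark(workspace)              # Solve + Observe: baseline
    for _ in range(steps):
        candidate = mutate(workspace, rng)         # Evolve: propose a mutation
        score = benchmark(candidate)               # Solve + Observe on candidate
        if score > best_score:                     # Gate: keep only improvements
            workspace, best_score = candidate, score  # Reload the new state
    return workspace, best_score

# Toy example: the "agent" is a dict with one numeric setting, and the
# benchmark rewards values near 0.3.
bench = lambda ws: -abs(ws["temperature"] - 0.3)
mut = lambda ws, rng: {"temperature": ws["temperature"] + rng.uniform(-0.2, 0.2)}
final, score = evolve_agent({"temperature": 1.0}, bench, mut, steps=200)
print(round(final["temperature"], 2))
```

The gate step is what replaces manual tuning: regressions are discarded automatically, so only mutations that measurably help the benchmark survive a reload.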

Chroma Releases Context-1: A 20B Agentic Search Model for Multi-Hop Retrieval, Context Management, and Scalable Synthetic Task Generation
Chroma unveiled Context-1, a 20 billion‑parameter agentic search model built to serve as a dedicated retrieval subagent in RAG pipelines. The model decomposes complex queries, runs multiple tool calls, and prunes irrelevant context with 94% accuracy, keeping a lean 32k token...

An Implementation of IWE’s Context Bridge as an AI-Powered Knowledge Graph with Agentic RAG, OpenAI Function Calling, and Graph Traversal
The tutorial demonstrates how to deploy IWE, an open‑source Rust‑based personal knowledge‑management system, to turn markdown files into a navigable directed graph. It walks through core CLI operations—find, retrieve, tree, squash, stats, and DOT export—using an eight‑note developer knowledge base...
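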

Tencent AI Open Sources Covo-Audio: A 7B Speech Language Model and Inference Pipeline for Real-Time Audio Conversations and Reasoning
Tencent AI Lab unveiled Covo‑Audio, a 7‑billion‑parameter Large Audio Language Model that processes continuous speech and generates high‑fidelity audio within a single architecture. The system combines Whisper‑large‑v3, Qwen2.5‑7B‑Base, and a WavLM‑based tokenizer, employing hierarchical tri‑modal interleaving and an intelligence‑speaker decoupling...

NVIDIA AI Introduces PivotRL: A New AI Framework Achieving High Agentic Accuracy With 4x Fewer Rollout Turns Efficiently
NVIDIA researchers unveiled PivotRL, a new framework that blends the data efficiency of supervised fine‑tuning with the generalization strength of end‑to‑end reinforcement learning for long‑horizon agentic tasks. By filtering high‑variance “pivot” turns and employing functional rewards, the method focuses compute...

Implementing Deep Q-Learning (DQN) From Scratch Using RLax JAX Haiku and Optax to Train a CartPole Reinforcement Learning Agent
The article walks through building a Deep Q‑Learning (DQN) agent from scratch using RLax together with JAX, Haiku, and Optax. It details the creation of a Q‑network, experience replay buffer, epsilon‑greedy policy, and training loop that computes TD errors with...
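The core quantity in that training loop is the one-step TD error, which the tutorial computes with RLax's `q_learning` loss. The same value in plain Python, as a hedged stand-in for the library call:

```python
import random

def q_learning_td_error(q_tm1, a_tm1, r_t, discount_t, q_t):
    """One-step Q-learning TD error, the quantity underlying RLax's
    `q_learning` loss: r + gamma * max_a Q(s', a) - Q(s, a)."""
    target = r_t + discount_t * max(q_t)
    return target - q_tm1[a_tm1]

def epsilon_greedy(q_values, epsilon, rng):
    """With probability epsilon act uniformly at random, else greedily."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])

rng = random.Random(0)
q_tm1 = [0.5, 1.0]   # Q-values at the previous state
q_t = [0.2, 0.8]     # Q-values at the next state
td = q_learning_td_error(q_tm1, a_tm1=1, r_t=1.0, discount_t=0.99, q_t=q_t)
print(round(td, 3))  # 1.0 + 0.99*0.8 - 1.0 = 0.792
```

In the full DQN agent this scalar is squared into a loss, differentiated by JAX, and applied to the Haiku Q-network's parameters via Optax.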

Meet GitAgent: The Docker for AI Agents that Is Finally Solving the Fragmentation Between LangChain, AutoGen, and Claude Code
GitAgent is an open‑source CLI that introduces a universal, Git‑backed format for AI agents, separating their definition from any specific orchestration framework. By storing agent metadata, personality, duties, skills, tools, rules, and memory as structured files in a repository, developers...

A Coding Implementation for Building and Analyzing Crystal Structures Using Pymatgen for Symmetry Analysis, Phase Diagrams, Surface Generation, and Materials...
The tutorial demonstrates how the open‑source pymatgen library can be used to construct, manipulate, and analyze crystal structures such as silicon, NaCl, and LiFePO₄‑like materials. It walks through lattice inspection, symmetry detection, coordination environment analysis, oxidation‑state decoration, supercell creation, surface...

A Coding Implementation Showcasing ClawTeam’s Multi-Agent Swarm Orchestration with OpenAI Function Calling
The tutorial demonstrates how ClawTeam’s open‑source multi‑agent swarm framework can be run entirely in Google Colab using OpenAI’s function‑calling API. It builds a leader agent that breaks a high‑level goal into sub‑tasks, and worker agents that execute those tasks via...

LlamaIndex Releases LiteParse: A CLI and TypeScript-Native Library for Spatial PDF Parsing in AI Agent Workflows
LlamaIndex unveiled LiteParse, an open‑source, TypeScript‑native library that parses PDFs locally for Retrieval‑Augmented Generation workflows. Built on PDF.js and Tesseract.js, it extracts spatially‑preserved text instead of converting to Markdown, keeping original layout and table structures intact. The tool also outputs...

A Coding Guide to Implement Advanced Differential Equation Solvers, Stochastic Simulations, and Neural Ordinary Differential Equations Using Diffrax and JAX
The tutorial demonstrates how to use the Diffrax library together with JAX, Equinox, and Optax to solve ordinary and stochastic differential equations, perform dense interpolation, and train neural ordinary differential equation (Neural ODE) models. It walks through logistic growth, Lotka‑Volterra,...

Unsloth AI Releases Unsloth Studio: A Local No-Code Interface For High-Performance LLM Fine-Tuning With 70% Less VRAM Usage
Unsloth AI launched Unsloth Studio, an open‑source, no‑code local web UI that simplifies LLM fine‑tuning. Built on hand‑written Triton back‑propagation kernels, the platform delivers up to twice the training speed and cuts VRAM consumption by roughly 70% without sacrificing accuracy...

Google AI Releases WAXAL: A Multilingual African Speech Dataset for Training Automatic Speech Recognition and Text-to-Speech Models
Google AI has released WAXAL, an open multilingual speech dataset targeting 24 African languages. The dataset is split into an ASR component built from image‑prompted, natural‑environment recordings and a TTS component consisting of studio‑quality, single‑speaker audio. Only about 10% of...

How to Build High-Performance GPU-Accelerated Simulations and Differentiable Physics Workflows Using NVIDIA Warp Kernels
The MarkTechPost tutorial demonstrates how NVIDIA Warp lets Python developers write GPU‑accelerated kernels for scientific simulations and differentiable physics. It walks through environment setup, kernel creation for vector math, signed‑distance fields, particle dynamics, and a gradient‑based projectile optimizer. Performance tests show...

Mistral AI Releases Mistral Small 4: A 119B-Parameter MoE Model that Unifies Instruct, Reasoning, and Multimodal Workloads
Mistral AI unveiled Small 4, a 119‑billion‑parameter mixture‑of‑experts model that merges instruction following, reasoning, multimodal understanding, and agentic coding into a single deployment. The architecture features 128 experts with four active per token, delivering 6 B active parameters and a 256k context...

Moonshot AI Releases Attention Residuals to Replace Fixed Residual Mixing with Depth-Wise Attention for Better Scaling in Transformers
Moonshot AI introduced Attention Residuals (AttnRes), a drop‑in replacement for standard residual connections in PreNorm Transformers that applies softmax attention over previous layer outputs. The approach includes a full version that attends to all earlier layers and a Block AttnRes...
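The core change is easy to state: instead of the fixed `x + f(x)` residual, each layer mixes all earlier layers' hidden states with softmax weights. The sketch below is a schematic of that idea under assumed shapes, not Moonshot's exact formulation:

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def depthwise_mix(layer_outputs, logits):
    """Replace the fixed x + f(x) residual with a softmax-weighted mix over
    all earlier layers' hidden states. `layer_outputs` is a list of
    equal-length vectors (one per earlier layer, for one token position);
    `logits` stand in for learned per-layer attention scores."""
    weights = softmax(logits)
    dim = len(layer_outputs[0])
    return [sum(w * h[i] for w, h in zip(weights, layer_outputs))
            for i in range(dim)]

# Three earlier layers' outputs for one token position (toy 2-d states):
hs = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
mixed = depthwise_mix(hs, logits=[0.0, 0.0, 0.0])  # equal logits -> uniform mix
print([round(v, 3) for v in mixed])
```

Because the logits are learned, the network can recover the standard residual (by concentrating weight on the immediately preceding layer) or route information directly from much earlier layers.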

Meet OpenViking: An Open-Source Context Database that Brings Filesystem-Based Memory and Retrieval to AI Agent Systems Like OpenClaw
OpenViking is an open‑source context database from Volcengine that reimagines AI‑agent memory as a virtual filesystem. It maps resources, user data, and agent skills into hierarchical directories accessed via a `viking://` protocol, replacing flat chunk‑based RAG with directory‑aware retrieval. The...

Zhipu AI Introduces GLM-OCR: A 0.9B Multimodal OCR Model for Document Parsing and Key Information Extraction (KIE)
Zhipu AI and Tsinghua University unveiled GLM‑OCR, a 0.9 billion‑parameter multimodal model designed for efficient document understanding. The architecture pairs a 0.4 B CogViT visual encoder with a 0.5 B GLM language decoder and introduces Multi‑Token Prediction to boost decoding speed by roughly...

Stanford Researchers Release OpenJarvis: A Local-First Framework for Building On-Device Personal AI Agents with Tools, Memory, and Learning
Stanford’s Scaling Intelligence Lab released OpenJarvis, an open‑source framework for building personal AI agents that run entirely on a user’s device. The platform introduces a five‑primitives architecture—Intelligence, Engine, Agents, Tools & Memory, and Learning—to modularize model selection, inference runtime, agent behavior, tool...

How to Build a Self-Designing Meta-Agent That Automatically Constructs, Instantiates, and Refines Task-Specific AI Agents
Developers can now use a meta‑agent framework that automatically designs, configures, and runs AI agents from a simple task description. The system parses the task, selects appropriate tools, memory architecture, and a ReAct‑style planner, then instantiates a fully functional runtime...

ByteDance Releases DeerFlow 2.0: An Open-Source SuperAgent Harness that Orchestrates Sub-Agents, Memory, and Sandboxes to Do Complex Tasks
ByteDance has open‑sourced DeerFlow 2.0, a SuperAgent framework that runs tasks inside isolated Docker containers rather than merely generating text. The system decomposes complex prompts into parallel sub‑agents, each operating in its own sandbox, and then converges results into a...

The ‘Bayesian’ Upgrade: Why Google AI’s New Teaching Method Is the Key to LLM Reasoning
Google researchers propose Bayesian Teaching, a method that trains large language models (LLMs) to emulate a Bayesian assistant’s belief‑updating process rather than simply memorizing correct answers. By fine‑tuning on synthetic flight‑booking interactions, the approach forces LLMs to reason under uncertainty,...
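The belief-updating target the models are trained to emulate is just Bayes' rule applied step by step. A toy illustration in the flight-booking spirit of the paper's synthetic data (the scenario and numbers below are invented for illustration, not Google's training code):

```python
def bayes_update(prior, likelihood):
    """Posterior over hypotheses given one observation:
    P(h | d) is proportional to P(d | h) * P(h).
    `prior` and `likelihood` both map hypothesis -> probability."""
    unnorm = {h: prior[h] * likelihood[h] for h in prior}
    z = sum(unnorm.values())
    return {h: p / z for h, p in unnorm.items()}

# Toy booking ambiguity: which city does the user mean by "Paris"?
prior = {"Paris, FR": 0.5, "Paris, TX": 0.5}
# The user then mentions the Eiffel Tower, far likelier under FR:
posterior = bayes_update(prior, {"Paris, FR": 0.9, "Paris, TX": 0.1})
print({h: round(p, 2) for h, p in posterior.items()})
```

Training on traces of such updates, rather than on final answers alone, is what pushes the model to maintain and revise beliefs instead of committing early.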

A Coding Guide to Build a Complete Single Cell RNA Sequencing Analysis Pipeline Using Scanpy for Clustering Visualization and Cell...
The article presents a step‑by‑step Python tutorial that builds a full single‑cell RNA‑sequencing (scRNA‑seq) analysis pipeline using Scanpy. It walks through data loading, quality‑control filtering, normalization, highly variable gene selection, PCA, neighbor‑graph construction, UMAP embedding, Leiden clustering, and marker‑gene based...

Building Next-Gen Agentic AI: A Complete Framework for Cognitive Blueprint Driven Runtime Agents with Memory Tools and Validation
The tutorial introduces a full cognitive blueprint framework that structures agent identity, goals, planning, memory, validation, and tool access. It demonstrates how YAML‑based blueprints can instantiate distinct agent personalities, such as a research bot and a data analyst bot, without...

Google Launches TensorFlow 2.21 And LiteRT: Faster GPU Performance, New NPU Acceleration, And Seamless PyTorch Edge Deployment Upgrades
Google released TensorFlow 2.21, promoting LiteRT from preview to a production‑ready on‑device inference framework that replaces TensorFlow Lite. LiteRT delivers a 1.4× speed boost on GPUs and introduces unified NPU acceleration for edge hardware. The update expands low‑precision operator support, adding...

OpenAI Introduces Codex Security in Research Preview for Context-Aware Vulnerability Detection, Validation, and Patch Generation Across Codebases
OpenAI has rolled out Codex Security, an application security agent, in research preview for ChatGPT Enterprise, Business, and Edu customers via Codex web. The tool builds a project‑specific threat model, validates vulnerabilities in sandboxed environments, and generates context‑aware patches. In...

Liquid AI Releases LocalCowork Powered By LFM2-24B-A2B to Execute Privacy-First Agent Workflows Locally Via Model Context Protocol (MCP)
Liquid AI unveiled LFM2-24B-A2B, a 24‑billion‑parameter sparse Mixture‑of‑Experts model, paired with the open‑source LocalCowork desktop agent. The architecture activates only about 2 billion parameters per token, fitting into a ~14.5 GB RAM footprint on an Apple M4 Max. LocalCowork runs entirely offline,...

OpenAI Releases Symphony: An Open Source Agentic Framework for Orchestrating Autonomous AI Agents Through Structured, Scalable Implementation Runs
OpenAI unveiled Symphony, an open‑source framework that orchestrates autonomous AI coding agents through structured implementation runs. Built on Elixir and the Erlang/BEAM runtime, it leverages fault‑tolerant concurrency to manage hundreds of isolated tasks. The system polls issue trackers such as...

YuanLab AI Releases Yuan 3.0 Ultra: A Flagship Multimodal MoE Foundation Model, Built for Stronger Intelligence and Unrivaled Efficiency
YuanLab AI unveiled Yuan 3.0 Ultra, a trillion‑parameter mixture‑of‑experts (MoE) foundation model that activates only 68.8 billion parameters. The model introduces Layer‑Adaptive Expert Pruning (LAEP), which removes underused experts during pre‑training, shrinking total parameters by 33.3% while preserving performance. Combined with an Expert...

Meet SymTorch: A PyTorch Library that Translates Deep Learning Models Into Human-Readable Equations
Researchers at the University of Cambridge introduced SymTorch, a PyTorch library that embeds symbolic regression into deep‑learning pipelines. The tool wraps any nn.Module, records activations, and uses PySR to distill closed‑form equations that replace neural components. In a proof‑of‑concept on...

A Coding Guide to Build a Scalable End-to-End Analytics and Machine Learning Pipeline on Millions of Rows Using Vaex
The MarkTechPost tutorial walks through building a production‑style analytics and machine‑learning pipeline with Vaex on a synthetic 2 million‑row dataset. It showcases lazy feature engineering, approximate city‑level aggregations, and seamless integration with scikit‑learn via Vaex‑ML. The guide also demonstrates model training,...

FireRedTeam Releases FireRed-OCR-2B Utilizing GRPO to Solve Structural Hallucinations in Tables and LaTeX for Software Developers
FireRedTeam unveiled FireRed-OCR-2B, a 2‑billion‑parameter vision‑language model that treats document parsing as a structural engineering problem rather than pure text generation. Leveraging the Qwen3‑VL‑2B‑Instruct backbone, the model introduces a three‑stage progressive training pipeline capped by Format‑Constrained GRPO to enforce syntactic...

How to Build an Explainable AI Analysis Pipeline Using SHAP-IQ to Understand Feature Importance, Interaction Effects, and Model Decision Breakdown
The tutorial demonstrates how to construct a full explainable‑AI pipeline using the SHAP‑IQ library to extract both feature importance and pairwise interaction effects from a Random Forest model trained on the California housing dataset. It walks through environment setup, utility...
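The pairwise interaction values the tutorial extracts are Shapley interaction indices, which SHAP-IQ estimates for real models. For a tiny set function they can be computed exactly by enumeration; the toy game below is an assumption for illustration, not the tutorial's housing model:

```python
from itertools import combinations
from math import factorial

def shapley_interaction(f, n, i, j):
    """Exact pairwise Shapley interaction index for players i, j of a set
    function f over n players, by enumerating all coalitions. SHAP-IQ
    approximates this quantity for real models; here f is a toy game."""
    others = [k for k in range(n) if k not in (i, j)]
    total = 0.0
    for r in range(len(others) + 1):
        w = factorial(r) * factorial(n - r - 2) / factorial(n - 1)
        for S in combinations(others, r):
            S = frozenset(S)
            # Discrete second difference: joint effect minus solo effects.
            delta = f(S | {i, j}) - f(S | {i}) - f(S | {j}) + f(S)
            total += w * delta
    return total

# Toy model: features 0 and 1 interact multiplicatively, feature 2 is additive.
x = [2.0, 3.0, 1.0]
def f(S):
    a = x[0] if 0 in S else 0.0
    b = x[1] if 1 in S else 0.0
    c = x[2] if 2 in S else 0.0
    return a * b + c

print(shapley_interaction(f, 3, 0, 1))  # multiplicative pair: 6.0
print(shapley_interaction(f, 3, 0, 2))  # purely additive pair: 0.0
```

A nonzero index flags feature pairs whose joint contribution differs from the sum of their individual contributions, which is exactly what plain per-feature SHAP values cannot show.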

Google AI Introduces STATIC: A Sparse Matrix Framework Delivering 948x Faster Constrained Decoding for LLM Based Generative Retrieval
Google DeepMind and YouTube researchers unveiled STATIC, a Sparse Transition Matrix‑Accelerated Trie Index that converts prefix‑tree constraints into a static CSR matrix for vectorized sparse operations. The framework achieves 0.033 ms per decoding step, delivering a 948× speedup over CPU‑offloaded tries...
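The key representational move is flattening the trie's transitions into CSR arrays so the allowed next tokens at any node become one contiguous slice rather than a pointer-chasing dict walk. A dependency-free schematic of that idea (not Google's implementation, which additionally vectorizes the masking on accelerators):

```python
def trie_to_csr(trie, num_nodes):
    """Flatten a prefix tree's transitions into CSR-style arrays.
    `trie[node]` maps token id -> child node id. Returns (indptr, tokens,
    children): row `node`'s transitions live in the half-open slice
    [indptr[node], indptr[node + 1])."""
    indptr, tokens, children = [0], [], []
    for node in range(num_nodes):
        for tok, child in sorted(trie.get(node, {}).items()):
            tokens.append(tok)
            children.append(child)
        indptr.append(len(tokens))
    return indptr, tokens, children

def allowed_tokens(indptr, tokens, node):
    """Allowed continuations at `node`: a single O(1) slice, ready to be
    turned into a logits mask for constrained decoding."""
    return tokens[indptr[node]:indptr[node + 1]]

# Tiny trie over token ids: root 0 allows tokens 5 or 7; node 1 allows 9.
trie = {0: {5: 1, 7: 2}, 1: {9: 3}}
indptr, toks, kids = trie_to_csr(trie, num_nodes=4)
print(allowed_tokens(indptr, toks, 0))  # [5, 7]
print(allowed_tokens(indptr, toks, 1))  # [9]
```

Once the constraint set is a static matrix, each decoding step reduces to sparse indexing instead of a CPU-side tree traversal, which is where the reported speedup comes from.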

A Complete End-to-End Coding Guide to MLflow Experiment Tracking, Hyperparameter Optimization, Model Evaluation, and Live Model Deployment
The article presents a step‑by‑step tutorial that builds a production‑grade MLflow workflow, covering tracking server setup, nested hyperparameter sweeps, automatic logging, model evaluation, and live REST‑API serving. It demonstrates how to configure a SQLite backend, use MLflow autologging for scikit‑learn...

Google DeepMind Introduces Unified Latents (UL): A Machine Learning Framework that Jointly Regularizes Latents Using a Diffusion Prior and Decoder
Google DeepMind unveiled Unified Latents (UL), a new framework that jointly trains an encoder, diffusion prior, and diffusion decoder to regularize latent representations. By using a deterministic encoder with fixed Gaussian noise and a reweighted decoder ELBO, UL bridges the...

Sakana AI Introduces Doc-to-LoRA and Text-to-LoRA: Hypernetworks that Instantly Internalize Long Contexts and Adapt LLMs via Zero-Shot Natural Language
Sakana AI unveiled two hypernetwork‑based methods—Text‑to‑LoRA (T2L) and Doc‑to‑LoRA (D2L)—that generate low‑rank adaptation matrices for large language models in a single forward pass. T2L creates task‑specific LoRA adapters from plain‑language descriptions, while D2L compresses entire documents into parameter updates, eliminating...

Perplexity Just Released Pplx-Embed: New SOTA Qwen3 Bidirectional Embedding Models for Web-Scale Retrieval Tasks
Perplexity unveiled pplx-embed, a pair of multilingual embedding models built on Qwen3 with bidirectional attention and diffusion‑based pretraining. The 0.6 B and 4 B variants are engineered for web‑scale retrieval, offering native INT8 quantization and Matryoshka representation learning. Two specialized versions—pplx‑embed‑v1 for...

Microsoft Research Introduces CORPGEN To Manage Multi Horizon Tasks For Autonomous AI Agents Using Hierarchical Planning and Memory
Microsoft Research unveiled CORPGEN, an architecture‑agnostic framework that equips autonomous AI agents to operate in Multi‑Horizon Task Environments (MHTEs) where dozens of interleaved, dependent tasks coexist. The paper identifies four failure modes—context saturation, memory interference, dependency‑graph complexity, and reprioritization overhead—that...

Nous Research Releases ‘Hermes Agent’ to Fix AI Forgetfulness with Multi-Level Memory and Dedicated Remote Terminal Access Support
Nous Research unveiled Hermes Agent, an open‑source autonomous system built on Hermes‑3, a Llama 3.1‑based model. It introduces a multi‑level memory hierarchy that records successful workflows as searchable Skill Documents, giving the agent procedural recall across sessions. The platform also provides...

Tailscale and LM Studio Introduce ‘LM Link’ to Provide Encrypted Point-to-Point Access to Your Private GPU Hardware Assets
LM Studio and Tailscale have launched LM Link, a feature that lets developers access remote GPU rigs as if they were locally attached. The solution replaces public APIs and SSH tunnels with a private, WireGuard‑encrypted tunnel built on Tailscale’s userspace tsnet...

How to Build an Elastic Vector Database with Consistent Hashing, Sharding, and Live Ring Visualization for RAG Systems
The tutorial walks through building an elastic vector‑database simulator that uses consistent hashing with virtual nodes to shard embeddings across distributed storage. It includes a live, interactive ring visualization that shows how adding or removing nodes only reshuffles a tiny...
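The elasticity property comes from consistent hashing with virtual nodes: each shard owns many points on a hash ring, so adding a shard only remaps the keys falling between its points and their ring neighbors. A minimal stdlib sketch of that mechanism (illustrative, not the tutorial's actual code):

```python
import bisect
import hashlib

class HashRing:
    """Consistent-hash ring with virtual nodes. Each physical shard is
    placed at `vnodes` pseudo-random points; a key belongs to the first
    node point clockwise from its hash."""
    def __init__(self, vnodes=64):
        self.vnodes = vnodes
        self._ring = []  # sorted (hash, node) pairs

    def _hash(self, key: str) -> int:
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def add(self, node: str):
        for i in range(self.vnodes):
            bisect.insort(self._ring, (self._hash(f"{node}#{i}"), node))

    def remove(self, node: str):
        self._ring = [(h, n) for h, n in self._ring if n != node]

    def lookup(self, key: str) -> str:
        h = self._hash(key)
        idx = bisect.bisect(self._ring, (h, "")) % len(self._ring)
        return self._ring[idx][1]

ring = HashRing()
for n in ("shard-a", "shard-b", "shard-c"):
    ring.add(n)
before = {k: ring.lookup(k) for k in (f"vec-{i}" for i in range(1000))}
ring.add("shard-d")  # scale out by one shard
after = {k: ring.lookup(k) for k in before}
moved = sum(before[k] != after[k] for k in before)
print(f"{moved / len(before):.0%} of keys remapped")
```

With four shards after the add, only roughly a quarter of the keys move; a naive `hash(key) % num_shards` scheme would remap nearly all of them.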

New ETH Zurich Study Proves Your AI Coding Agents Are Failing Because Your AGENTS.md Files Are Too Detailed
A new ETH Zurich study reveals that overly detailed AGENTS.md files degrade AI coding agent performance and raise inference costs. Experiments with models such as Sonnet-4.5, GPT-5.2, and Qwen3-30B showed auto‑generated context reduces success rates by about 3%, while human‑crafted...

Meta AI Open Sources GCM for Better GPU Cluster Monitoring to Ensure High Performance AI Training and Hardware Reliability
Meta AI Research has open‑sourced GCM, a GPU Cluster Monitoring toolkit designed to catch silent hardware failures that can derail large‑scale AI training. The system integrates tightly with the Slurm workload manager, providing job‑level attribution of power, temperature, and error...