MarkTechPost

Publication

Showcases the hottest research trends in AI from around the world

Meet SETA: Open Source Training Reinforcement Learning Environments for Terminal Agents with 400 Tasks and CAMEL Toolkit
News | Jan 11, 2026

Researchers from CAMEL AI, Eigent AI and partners released SETA, an open‑source stack that couples a terminal‑focused toolkit with 400 synthetic reinforcement‑learning tasks. The framework delivers state‑of‑the‑art results on the Terminal Bench benchmark, hitting 46.5% accuracy on version 2.0 with a...

By MarkTechPost
How to Build Portable, In-Database Feature Engineering Pipelines with Ibis Using Lazy Python APIs and DuckDB Execution
News | Jan 9, 2026

The tutorial shows how Ibis can create a portable, in‑database feature‑engineering pipeline that feels like Pandas but runs entirely in DuckDB. By registering data in the backend and keeping all transformations lazy, the code is translated into efficient SQL without...

By MarkTechPost
Stanford Researchers Build SleepFM Clinical: A Multimodal Sleep Foundation AI Model for 130+ Disease Prediction
News | Jan 8, 2026

Stanford Medicine researchers unveiled SleepFM Clinical, a multimodal foundation model trained on 585,000 hours of polysomnography from about 65,000 individuals. The model learns a unified representation of brain, heart, and respiratory signals and can predict long‑term risk for more than...

By MarkTechPost
A Coding Implementation to Build a Unified Apache Beam Pipeline Demonstrating Batch and Stream Processing with Event-Time Windowing Using DirectRunner
News | Jan 7, 2026

The tutorial shows how to build a unified Apache Beam pipeline that can run in both batch and stream‑like modes using the DirectRunner. It creates synthetic event‑time data, applies fixed windows with triggers and allowed lateness, and demonstrates how Beam...

By MarkTechPost
Liquid AI Releases LFM2.5: A Compact AI Model Family For Real On Device Agents
News | Jan 6, 2026

Liquid AI unveiled LFM2.5, a compact 1.2 billion‑parameter model family designed for on‑device and edge inference. The suite includes Base, Instruct, Japanese, vision‑language, and audio variants, all released with open weights on Hugging Face and via the LEAP platform. Pre‑training was...

By MarkTechPost
LLM-Pruning Collection: A JAX Based Repo For Structured And Unstructured LLM Compression
News | Jan 5, 2026

Zlab Princeton has open‑sourced the LLM‑Pruning Collection, a JAX‑based repository that aggregates leading pruning techniques for large language models. The repo bundles block‑level, layer‑level, and weight‑level methods—including Minitron, ShortGPT, Wanda, SparseGPT, Magnitude, Sheared LLaMA and LLM‑Pruner—under a unified training and evaluation...

By MarkTechPost
Tencent Researchers Release Tencent HY-MT1.5: A New Translation Models Featuring 1.8B and 7B Models Designed for Seamless On-Device and Cloud...
News | Jan 5, 2026

Tencent Hunyuan researchers unveiled HY-MT1.5, a multilingual translation family comprising a 1.8B and a 7B model. Both models cover 33 languages plus five dialect variants and are released with open weights on GitHub and Hugging Face. The compact 1.8B variant runs...

By MarkTechPost
AI Interview Series #5: Prompt Caching
News | Jan 5, 2026

Prompt caching reduces LLM API costs by reusing static prompt components. Because the key‑value attention states for a prefix are stored in GPU memory, requests that share an identical prefix skip recomputing it, cutting latency and input‑token cost. Engineers can boost efficiency by analyzing request patterns, restructuring prompts so shared...

By MarkTechPost
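
The prompt-restructuring idea in the teaser above can be illustrated with a minimal sketch. It assumes a cache that matches requests by identical leading tokens, and it uses whitespace-split words as a stand-in for a real tokenizer. The names `shared_prefix_len`, `SYSTEM`, and `TOOLS` are hypothetical and do not come from the article; real providers apply their own rules (minimum prefix length, cache granularity, expiry) that are not modeled here.

```python
def shared_prefix_len(a, b):
    """Count how many leading tokens two prompts have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

# Static content that is identical across requests (hypothetical example text).
SYSTEM = ["You", "are", "a", "support", "assistant."]
TOOLS = ["Tools:", "search,", "refund."]

# Per-request content that changes every time.
user_a = ["User:", "alice", "asks", "about", "billing"]
user_b = ["User:", "bob", "asks", "about", "shipping"]

# Layout 1: dynamic content first, so the prompts diverge almost immediately.
dynamic_first_a = user_a + SYSTEM + TOOLS
dynamic_first_b = user_b + SYSTEM + TOOLS

# Layout 2: static content first, so the long shared prefix is reusable.
static_first_a = SYSTEM + TOOLS + user_a
static_first_b = SYSTEM + TOOLS + user_b

reusable_dynamic_first = shared_prefix_len(dynamic_first_a, dynamic_first_b)
reusable_static_first = shared_prefix_len(static_first_a, static_first_b)

print("dynamic-first shared prefix:", reusable_dynamic_first)
print("static-first shared prefix:", reusable_static_first)
```

Under these assumptions, moving the static system and tool text to the front is what makes the prefix cacheable; the per-user text is the only part that has to be processed fresh.
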
A Coding Implementation to Build a Self-Testing Agentic AI System Using Strands to Red-Team Tool-Using Agents and Enforce Safety at...
News | Jan 2, 2026

The tutorial builds a red‑team evaluation harness with Strands Agents to stress‑test a tool‑using AI assistant against prompt‑injection and tool‑misuse attacks. It defines a guarded target agent, a red‑team agent that auto‑generates adversarial prompts, and a judge agent that scores...

By MarkTechPost
Tencent Released Tencent HY-Motion 1.0: A Billion-Parameter Text-to-Motion Model Built on the Diffusion Transformer (DiT) Architecture and Flow Matching
News | Dec 31, 2025

Tencent Hunyuan’s 3D Digital Human team launched HY‑Motion 1.0, an open‑weight text‑to‑3D human motion model built on a Diffusion Transformer (DiT) architecture and trained with Flow Matching. The flagship model contains 1 billion parameters, alongside a 0.46 billion‑parameter Lite variant, and generates SMPL‑H...

By MarkTechPost
A Coding Implementation of an OpenAI-Assisted Privacy-Preserving Federated Fraud Detection System From Scratch Using Lightweight PyTorch Simulations
News | Dec 30, 2025

The tutorial walks through building a privacy‑preserving federated fraud‑detection system from scratch using lightweight, CPU‑only PyTorch. It simulates ten independent banks, partitions highly imbalanced transaction data with a Dirichlet distribution, and coordinates local model updates via a FedAvg loop. After...

By MarkTechPost
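
The FedAvg loop mentioned in the teaser above can be sketched in a few lines. This is an illustration of the general technique, not the tutorial's code: it uses plain Python lists instead of PyTorch tensors, two clients instead of ten, and fixed made-up gradients in place of real local training. The helper names `local_update` and `fed_avg` are hypothetical.

```python
def local_update(global_weights, gradient, lr=0.1):
    """One gradient step on a client, starting from the shared global weights."""
    return [w - lr * g for w, g in zip(global_weights, gradient)]

def fed_avg(client_weights, client_sizes):
    """Average client weights, weighting each client by its local sample count."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]

# Hypothetical setup: two "banks" with different data volumes and gradients.
global_weights = [0.0, 0.0]
client_grads = [[1.0, -1.0], [3.0, 1.0]]   # made-up local gradients
client_sizes = [100, 300]                  # made-up local sample counts

# One communication round: each client trains locally, the server aggregates.
local_weights = [local_update(global_weights, g) for g in client_grads]
global_weights = fed_avg(local_weights, client_sizes)

print(global_weights)
```

Only model weights cross the client boundary in this scheme; the raw transaction data stays with each client, which is the privacy property the teaser refers to.
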
Meet LLMRouter: An Intelligent Routing System Designed to Optimize LLM Inference by Dynamically Selecting the Most Suitable Model for Each Query
News | Dec 30, 2025

LLMRouter, an open‑source library from UIUC, sits between applications and heterogeneous LLM pools to automatically select the most appropriate model per query. It offers over 16 routing algorithms organized into single‑round, multi‑round, personalized, and agentic families, each configurable via a...

By MarkTechPost
NVIDIA AI Researchers Release NitroGen: An Open Vision Action Foundation Model For Generalist Gaming Agents
News | Dec 28, 2025

NVIDIA’s AI team unveiled NitroGen, an open‑source vision‑action foundation model that learns to play commercial games directly from pixel inputs and gamepad actions. The model is trained on 40,000 hours of filtered gameplay video spanning over 1,000 titles, using automatic...

By MarkTechPost
Liquid AI’s LFM2-2.6B-Exp Uses Pure Reinforcement Learning RL And Dynamic Hybrid Reasoning To Tighten Small Model Behavior
News | Dec 28, 2025

Liquid AI released LFM2-2.6B-Exp, an experimental checkpoint that adds a pure reinforcement‑learning (RL) stage to its 2.6 billion‑parameter LFM2 model. The RL fine‑tuning targets instruction following, knowledge retrieval, and math without altering the hybrid convolution‑attention architecture. Benchmark results show the model...

By MarkTechPost
How to Build Production-Grade Agentic Workflows with GraphBit Using Deterministic Tools, Validated Execution Graphs, and Optional LLM Orchestration
News | Dec 27, 2025

The tutorial demonstrates how to build a production‑grade, agentic workflow for customer‑support ticket triage using GraphBit. It starts by configuring the GraphBit runtime, defining typed ticket data, and registering deterministic tools for classification, routing, and response drafting. These tools are...

By MarkTechPost
A Coding Implementation on Building Self-Organizing Zettelkasten Knowledge Graphs and Sleep-Consolidation Mechanisms
News | Dec 26, 2025

The tutorial by Asif Razzaq demonstrates how to build a self‑organizing Zettelkasten memory system for agentic AI using Google Gemini. It defines a MemoryNode data class, ingests text by atomizing it into discrete facts, embeds each fact, and links semantically...

By MarkTechPost
A Coding Guide to Build an Autonomous Multi-Agent Logistics System with Route Planning, Dynamic Auctions, and Real-Time Visualization Using Graph-Based...
News | Dec 25, 2025

The tutorial walks readers through building a fully autonomous, multi‑agent logistics simulation where five smart trucks navigate a 30‑node graph‑based city. Each truck acts as an agent that bids on delivery orders, plans shortest‑path routes, monitors battery levels, and seeks...

By MarkTechPost
This AI Paper From Stanford and Harvard Explains Why Most ‘Agentic AI’ Systems Feel Impressive in Demos and Then Completely...
News | Dec 24, 2025

The Stanford‑Harvard paper "Adaptation of Agentic AI" introduces a unified, mathematically grounded framework for tuning agentic AI systems that combine large language models with planning, tool‑use, and memory modules. It categorizes adaptation into four paradigms based on whether the target...

By MarkTechPost
InstaDeep Introduces Nucleotide Transformer V3 (NTv3): A New Multi-Species Genomics Foundation Model, Designed for 1 Mb Context Lengths at Single-Nucleotide...
News | Dec 24, 2025

InstaDeep unveiled Nucleotide Transformer v3 (NTv3), a multi‑species genomics foundation model that processes 1 megabase windows at single‑nucleotide resolution. The U‑Net‑style architecture combines down‑sampling, transformer layers, and up‑sampling to deliver both prediction and controllable sequence generation. NTv3 was pre‑trained on 9 trillion base...

By MarkTechPost
Google Health AI Releases MedASR: A Conformer Based Medical Speech to Text Model for Clinical Dictation
News | Dec 24, 2025

Google Health AI released MedASR, an open‑weights medical speech‑to‑text model built on the Conformer architecture. The 105‑million‑parameter model is trained on roughly 5,000 hours of de‑identified physician dictations and clinical conversations across radiology, internal and family medicine. Benchmarks show MedASR...

By MarkTechPost
How to Build a Proactive Pre-Emptive Churn Prevention Agent with Intelligent Observation and Strategy Formation
News | Dec 23, 2025

The tutorial walks through building a Pre‑Emptive Churn Prevention Agent that automatically spots inactive users, evaluates their churn risk, and drafts personalized re‑engagement emails using Google Gemini. It creates a mock customer database, defines a risk‑analysis prompt, and generates incentive‑driven...

By MarkTechPost
Google DeepMind Researchers Release Gemma Scope 2 as a Full Stack Interpretability Suite for Gemma 3 Models
News | Dec 23, 2025

Google DeepMind has unveiled Gemma Scope 2, an open‑source interpretability suite that spans the entire Gemma 3 family from 270 M to 27 B parameters. The platform leverages layer‑wise sparse autoencoders and transcoders, trained on roughly 110 petabytes of activation data and over a trillion...

By MarkTechPost
How to Build a Fully Autonomous Local Fleet-Maintenance Analysis Agent Using SmolAgents and Qwen Model
News | Dec 22, 2025

The tutorial demonstrates how to build a fully autonomous fleet‑maintenance analysis agent using SmolAgents and a locally hosted Qwen 2.5‑7B‑Instruct model. By defining a custom tool that loads telemetry CSV files, the agent can reason step‑by‑step, flag trucks with high engine...

By MarkTechPost
Anthropic AI Releases Bloom: An Open-Source Agentic Framework for Automated Behavioral Evaluations of Frontier AI Models
News | Dec 21, 2025

Anthropic has open‑sourced Bloom, an agentic framework that automates behavioral evaluations of frontier AI models. Researchers provide a seed file describing a target behavior, and Bloom generates dozens to hundreds of diverse, reproducible scenarios through a four‑stage pipeline of understanding,...

By MarkTechPost
AI Interview Series #4: Explain KV Caching
News | Dec 21, 2025

KV caching is an inference optimization that stores the keys and values from previous attention steps during autoregressive generation. By reusing these cached tensors, the model computes the query, key, and value only for each new token, rather than recomputing keys and values for the entire preceding sequence at every step. Benchmarks...

By MarkTechPost
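
The mechanism described in the KV caching teaser above can be shown with a toy single-head attention in pure Python. This is a sketch under simplifying assumptions: the `project` function is a hypothetical stand-in for learned projection matrices, and the names `KVCache` and `full_recompute` are illustrative rather than taken from the article.

```python
import math

def project(x, scale):
    """Toy stand-in for a learned linear projection (hypothetical)."""
    return [scale * xi for xi in x]

def attend(q, keys, values):
    """Scaled dot-product attention for a single query vector."""
    d = len(q)
    scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]  # numerically stable softmax
    total = sum(weights)
    return [
        sum(w * v[j] for w, v in zip(weights, values)) / total
        for j in range(len(values[0]))
    ]

class KVCache:
    """Keeps keys and values of all tokens seen so far."""
    def __init__(self):
        self.keys, self.values = [], []

    def step(self, x):
        # Only the new token is projected; earlier keys/values are reused.
        q, k, v = project(x, 1.0), project(x, 0.5), project(x, 2.0)
        self.keys.append(k)
        self.values.append(v)
        return attend(q, self.keys, self.values)

def full_recompute(tokens):
    """Baseline without a cache: re-project the whole prefix at every step."""
    outputs = []
    for t in range(1, len(tokens) + 1):
        prefix = tokens[:t]
        keys = [project(x, 0.5) for x in prefix]
        values = [project(x, 2.0) for x in prefix]
        outputs.append(attend(project(prefix[-1], 1.0), keys, values))
    return outputs

tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # made-up token embeddings
cache = KVCache()
cached_outputs = [cache.step(x) for x in tokens]
baseline_outputs = full_recompute(tokens)
```

Both paths produce the same attention outputs; the cached path projects each token once, while the baseline re-projects the whole prefix at every step. The new token's query still attends over every cached key, so the saving is in the key and value computation, at the price of memory that grows with sequence length.
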
Unsloth AI and NVIDIA Are Revolutionizing Local LLM Fine-Tuning: From RTX Desktops to DGX Spark
News | Dec 19, 2025

Unsloth, a GPU‑optimized fine‑tuning library, now runs on everything from GeForce RTX desktops to NVIDIA's DGX Spark, delivering up to 2.5× faster training for LLMs. The tool supports parameter‑efficient methods like LoRA/QLoRA, full model updates, and reinforcement‑learning pipelines, each with clear...

By MarkTechPost
How to Orchestrate a Fully Autonomous Multi-Agent Research and Writing Pipeline Using CrewAI and Gemini for Real-Time Intelligent Collaboration
News | Dec 17, 2025

The tutorial demonstrates how to build a two‑agent CrewAI pipeline that leverages the Gemini Flash model for real‑time research and writing. It walks through environment setup, secure Gemini authentication, and the definition of a researcher and a writer agent with...

By MarkTechPost
Thinking Machines Lab Makes Tinker Generally Available: Adds Kimi K2 Thinking And Qwen3-VL Vision Input
News | Dec 17, 2025

Thinking Machines Lab has moved its Tinker training API to general availability, removing the waitlist and opening access to all developers. The update adds three major capabilities: support for the 1‑trillion‑parameter Kimi K2 Thinking reasoning model, an OpenAI‑compatible sampling interface, and multimodal...

By MarkTechPost
How to Design a Gemini-Powered Self-Correcting Multi-Agent AI System with Semantic Routing, Symbolic Guardrails, and Reflexive Orchestration
News | Dec 15, 2025

The article presents a step‑by‑step tutorial for building a Gemini‑powered multi‑agent AI system that uses semantic routing, symbolic guardrails, and a self‑correction loop. It defines a shared AgentMessage format, a CognitiveEngine that calls Gemini‑2.0‑Flash, and a SemanticRouter that maps user...

By MarkTechPost
How to Design a Fully Local Agentic Storytelling Pipeline Using Griptape Workflows, Hugging Face Models, and Modular Creative Task Orchestration
News | Dec 12, 2025

The article walks through building a fully local, API‑free storytelling system using Griptape and a TinyLlama model from Hugging Face. It demonstrates an agent equipped with a calculator tool, hierarchical world‑generation and character tasks, and a final story‑writing task governed by...

By MarkTechPost
CopilotKit v1.50 Brings AG-UI Agents Directly Into Your App With the New useAgent Hook
News | Dec 11, 2025

CopilotKit v1.50 rebuilds its frontend on the Agent User Interaction (AG‑UI) protocol and ships a new React hook, useAgent, that turns agent‑to‑UI communication into a single typed event stream. The hook subscribes to messages, streaming tokens, tool calls and shared state,...

By MarkTechPost