Build Hour: Agent RFT
Why It Matters
Agent RFT lets businesses fine‑tune autonomous AI agents to use their own tools efficiently, cutting latency and inference costs while delivering higher task accuracy—key advantages for scaling AI‑driven workflows.
Summary
In a recent "Build Hour" webcast, OpenAI’s startup marketing lead Christine, alongside engineer Will and solutions architect Theo, introduced Agent Reinforcement Fine‑Tuning (Agent RFT), a new capability that lets developers fine‑tune autonomous agents by rewarding desired tool‑use behavior during training. The session built on earlier tutorials covering agents, the Responses API, and AgentKit, and positioned RFT as the next logical step for teams that have already optimized prompts, task definitions, and tool descriptions but still need higher performance.
The presenters explained that agents differ from standard language models by interacting with external tools—code interpreters, databases, browsers, etc.—and that every tool call is fed back into the model’s context window. While prompt engineering and task simplification can yield modest gains, Agent RFT modifies the model’s weights using a custom reward signal, allowing the agent to explore many tool‑calling strategies and converge on the most efficient ones. The product now supports live tool calls during training, a reward‑endpoint API, and a lightweight penalty on reasoning tokens to curb unnecessary calls, delivering sample‑efficient learning and lower latency.
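The reward‑endpoint idea can be illustrated with a toy sketch: a small HTTP service that receives a rollout's final answer and tool‑call count, and returns a scalar reward with partial credit and a penalty for exceeding the call budget. The payload fields (`expected`, `answer`, `tool_calls`) and the scoring rules here are hypothetical, not the actual Agent RFT API.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def grade(expected: str, answer: str, tool_calls: int, budget: int = 10) -> float:
    """Toy reward: 1.0 for an exact match, 0.5 partial credit for a
    substring match, minus a small penalty per tool call over budget."""
    if answer.strip() == expected.strip():
        score = 1.0
    elif expected.strip() in answer:
        score = 0.5
    else:
        score = 0.0
    over_budget = max(0, tool_calls - budget)
    return max(0.0, score - 0.05 * over_budget)


class RewardHandler(BaseHTTPRequestHandler):
    """Hypothetical reward endpoint: POST a JSON rollout, get back a reward."""

    def do_POST(self):
        body = json.loads(self.rfile.read(int(self.headers["Content-Length"])))
        reward = grade(body["expected"], body["answer"], body.get("tool_calls", 0))
        payload = json.dumps({"reward": reward}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(payload)


if __name__ == "__main__":
    HTTPServer(("127.0.0.1", 8080), RewardHandler).serve_forever()
```

In practice the endpoint would run whatever business‑critical checks the customer defines; the point is only that the training loop calls out to customer code for the reward signal.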
A concrete example featured a partnership with Cognition and a demo on a hardened financial‑QA benchmark. The team gave the agent access to a semantic search tool, a directory‑listing tool, and a "cat" tool to retrieve documents from a corpus of 2,800 reports, imposing a ten‑tool‑call budget. Using a model grader that awards partial credit for near‑correct answers, the fine‑tuned agent learned to locate the right report, extract numeric data, and answer within the call limit, illustrating how RFT can improve both accuracy and speed. Theo highlighted the ability to tag each tool call with a rollout ID, enabling customers to track state and apply bespoke grading logic in their own environments.
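The rollout‑ID tagging described above can be sketched as a thin wrapper around the customer's tools: every call is logged with the same ID and counted against the budget, so an external grader can reconstruct the trajectory. The class, field names, and toy corpus tools below are illustrative assumptions, not the actual demo code.

```python
import uuid
from typing import Callable, Dict, List


class RolloutSession:
    """Tags each tool call with a shared rollout ID and enforces a
    tool-call budget (hypothetical wrapper for stateful grading)."""

    def __init__(self, tools: Dict[str, Callable[[str], str]], max_calls: int = 10):
        self.rollout_id = str(uuid.uuid4())
        self.tools = tools
        self.max_calls = max_calls
        self.log: List[dict] = []  # trajectory the grader can inspect later

    def call(self, name: str, arg: str) -> str:
        if len(self.log) >= self.max_calls:
            raise RuntimeError(f"tool-call budget of {self.max_calls} exhausted")
        result = self.tools[name](arg)
        self.log.append({"rollout_id": self.rollout_id, "tool": name,
                         "args": arg, "result": result})
        return result


# Toy corpus mirroring the demo's tools: a directory listing and a "cat" tool.
corpus = {"reports/acme_2023.txt": "ACME 2023 revenue: $1.2B"}
tools = {
    "list_dir": lambda path: "\n".join(k for k in corpus if k.startswith(path)),
    "cat": lambda path: corpus.get(path, ""),
}

session = RolloutSession(tools)
path = session.call("list_dir", "reports/")
text = session.call("cat", path)
```

Because every log entry carries the same `rollout_id`, a grader running in the customer's own environment can match the final answer back to the exact sequence of tool calls that produced it.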
The rollout signals that enterprises can now train agents that are tightly aligned with proprietary toolsets and latency constraints, reducing inference costs while boosting task‑specific performance. By exposing the training loop to real‑world endpoints, OpenAI gives developers the flexibility to define business‑critical reward functions, paving the way for more reliable, production‑ready autonomous agents across sectors such as code assistance, customer service and financial analysis.