NVIDIA Brings Agents to Life with DGX Spark and Reachy Mini

Hugging Face
Jan 5, 2026

Why It Matters

The integration shows that enterprises can host private, high‑performance agents on‑premises, reducing data‑privacy risks while unlocking new robotic workflows. It signals a shift toward modular, open‑model ecosystems for real‑world AI deployment.

Authors: Jeff Boudier, Nader Khalil, Alec Fong

Today at CES 2026, NVIDIA unveiled a world of new open models to enable the future of agents, online and in the real world. From the recently released NVIDIA Nemotron reasoning LLMs to the new NVIDIA Isaac GR00T N1.6 open‑reasoning VLA and NVIDIA Cosmos world foundation models, all the building blocks are here today for AI Builders to build their own agents.

But what if you could bring your own agent to life, right at your desk? An AI buddy that can be useful to you and process your data privately?

In the CES keynote, Jensen Huang showed how to do exactly that, using the processing power of NVIDIA DGX Spark together with Reachy Mini to create a little office R2‑D2 you can talk to and collaborate with.

This guide walks through replicating that experience at home with a DGX Spark and Reachy Mini.


Ingredients

  • Reasoning model: NVIDIA Nemotron 3 Nano

  • Vision model: NVIDIA Nemotron Nano 2 VL

  • Text‑to‑speech model: ElevenLabs

  • Robot: Reachy Mini (or Reachy Mini simulation)

  • Python 3.10+ environment, with uv

You can adapt the recipe for:

  1. Local deployment – run on your own hardware (DGX Spark or a GPU with sufficient VRAM). The reasoning model needs ~65 GB disk space; the vision model ~28 GB.

  2. Cloud deployment – use NVIDIA Brev or Hugging Face Inference Endpoints.

  3. Serverless model endpoints – call NVIDIA or Hugging Face inference providers.
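
For the serverless option, both NVIDIA and Hugging Face expose OpenAI-compatible chat endpoints, so a plain OpenAI client is enough to smoke-test a hosted reasoning model before wiring it into the agent. A minimal sketch, assuming NVIDIA_API_KEY is set; the base URL is NVIDIA's hosted API and the model identifier is a placeholder, so check the catalog for the exact name.

# Minimal sketch: query a hosted reasoning model through an OpenAI-compatible endpoint.
# Assumptions: NVIDIA_API_KEY is set; the model id below is a placeholder, check the
# build.nvidia.com catalog (or swap base_url/model for a Hugging Face Inference endpoint).
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://integrate.api.nvidia.com/v1",
    api_key=os.environ["NVIDIA_API_KEY"],
)

response = client.chat.completions.create(
    model="nvidia/nemotron-nano-9b-v2",  # placeholder id
    messages=[{"role": "user", "content": "What is the capital of France?"}],
    temperature=0.2,
)
print(response.choices[0].message.content)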


Giving Agentic Powers to Reachy

Turning a chat‑only AI into a robot that can see, speak, and act makes interactions feel far more real. Reachy Mini is fully customizable: its sensors, actuators, and APIs let you wire it into any agent stack, whether in simulation or on real hardware.

This post focuses on composing existing building blocks rather than reinventing them. We combine open models for reasoning and vision, an agent framework for orchestration, and tool handlers for actions. Each component is loosely coupled, making it easy to swap models, change routing logic, or add new behaviours.
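
To make the loose coupling concrete, here is a small sketch in plain Python (these protocols are illustrative, not the repo's actual interfaces) of how the reasoning model, the vision model, and the tools can sit behind narrow interfaces so any one of them can be swapped without touching the others.

# Conceptual sketch of the loosely coupled agent stack; the protocol names are
# illustrative and not the interfaces used in reachy-personal-assistant.
from typing import Protocol


class ChatModel(Protocol):
    def complete(self, prompt: str) -> str: ...


class VisionModel(Protocol):
    def describe(self, image_bytes: bytes, question: str) -> str: ...


class Tool(Protocol):
    name: str
    def run(self, query: str) -> str: ...


class Agent:
    """Swapping a model or adding a tool means passing a different object; nothing else changes."""

    def __init__(self, chat: ChatModel, vision: VisionModel, tools: list[Tool]):
        self.chat = chat
        self.vision = vision
        self.tools = {tool.name: tool for tool in tools}

    def handle(self, text: str, image: bytes | None = None) -> str:
        if image is not None:
            return self.vision.describe(image, text)
        return self.chat.complete(text)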


Building the Agent

We use the NVIDIA NeMo Agent Toolkit, a lightweight, framework‑agnostic library that connects all components of the agent. It works with other agentic frameworks (LangChain, LangGraph, CrewAI) and handles model interaction, routing, and profiling.


Step 0 – Set Up and Get Access to Models and Services


git clone git@github.com:brevdev/reachy-personal-assistant

cd reachy-personal-assistant

Create a .env file with your API keys (skip if running locally without remote endpoints):


NVIDIA_API_KEY=your_nvidia_api_key_here

ELEVENLABS_API_KEY=your_elevenlabs_api_key_here
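
To confirm the keys are visible to processes launched with uv run --env-file, a few lines of Python are enough; the variable names match the .env sketch above.

# Quick sanity check that the expected API keys are present in the environment.
import os

for key in ("NVIDIA_API_KEY", "ELEVENLABS_API_KEY"):
    print(f"{key}: {'set' if os.environ.get(key) else 'MISSING'}")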


Step 1 – Build a Chat Interface


cd nat

uv venv

uv sync

uv run --env-file ../.env nat serve --config_file src/ces_tutorial/config.yml --port 8001

Test the endpoint:


curl -s http://localhost:8001/v1/chat/completions \

  -H "Content-Type: application/json" \

  -d '{"model":"test","messages":[{"role":"user","content":"What is the capital of France?"}]}'
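
The same smoke test from Python, handy if you want to script checks against the NAT service; this only assumes the server speaks the OpenAI-style chat completions route used in the curl call above.

# Programmatic version of the curl smoke test against the local NAT server.
import requests

resp = requests.post(
    "http://localhost:8001/v1/chat/completions",
    json={
        "model": "test",
        "messages": [{"role": "user", "content": "What is the capital of France?"}],
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json())  # expect an OpenAI-style body with the answer under choices[0].message.content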


Step 2 – Add NeMo Agent Toolkit’s Built‑in ReAct Agent for Tool Calling


functions:

  wikipedia_search:

    _type: wiki_search

    max_results: 2

  react_agent:

    _type: react_agent

    llm_name: agent_llm

    verbose: true

    parse_agent_response_max_retries: 3

    tool_names: [wikipedia_search]



workflow:

  _type: ces_tutorial_router_agent

  agent: react_agent

Tips: keep tool schemas tight, cap the number of tool calls, and consider a “confirm before actuation” step for physical robots.
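
For the "confirm before actuation" tip, one lightweight pattern is to wrap any tool that moves the robot in a gate that requires an explicit yes before the call goes through. The sketch below uses hypothetical tool names and a console prompt; the repo does not ship this wrapper.

# Sketch of a confirm-before-actuation gate for tools that move the robot.
# Tool and callback names are hypothetical, for illustration only.
from typing import Callable


def confirm_before_actuation(
    tool: Callable[[str], str],
    confirm: Callable[[str], bool],
    tool_name: str,
) -> Callable[[str], str]:
    """Wrap a robot-actuation tool so it only runs after an explicit confirmation."""

    def gated(command: str) -> str:
        if not confirm(f"About to run {tool_name} with {command!r}. Proceed?"):
            return f"{tool_name} cancelled by user."
        return tool(command)

    return gated


# Example: a stub actuator gated by a console prompt.
def move_head(command: str) -> str:
    return f"(stub) moving head: {command}"


safe_move_head = confirm_before_actuation(
    move_head,
    confirm=lambda msg: input(msg + " [y/N] ").strip().lower() == "y",
    tool_name="move_head",
)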


Step 3 – Add a Router to Direct Queries to Different Models


functions:

  router:

    _type: router

    route_config:

      - name: other

        description: Questions needing careful thought, external info, image understanding, or tool calls.

      - name: chit_chat

        description: Simple chit‑chat or casual conversation.

      - name: image_understanding

        description: Queries that require visual perception.

    llm_name: routing_llm



llms:

  routing_llm:

    _type: nim

    model_name: microsoft/phi-3-mini-128k-instruct

    temperature: 0.0

Note: you can self‑host the fast text model to reduce latency and cost while keeping the VLM remote.
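
Self-hosting mostly changes where the client points: any OpenAI-compatible server (a local NIM container, vLLM, etc.) will do. A minimal sketch, assuming the fast text model is served locally on port 8000; the base URL, API key, and model name are placeholders to adjust for your setup.

# Point an OpenAI-compatible client at a locally hosted fast text model.
# base_url, api_key, and model are placeholders for whatever your local server exposes.
from openai import OpenAI

local_client = OpenAI(base_url="http://localhost:8000/v1", api_key="local")

reply = local_client.chat.completions.create(
    model="microsoft/phi-3-mini-128k-instruct",
    messages=[{"role": "user", "content": "hi, how are you?"}],
    temperature=0.0,
)
print(reply.choices[0].message.content)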


Step 4 – Add a Pipecat Bot for Real‑Time Voice + Vision

Pipecat orchestrates audio/video streams, speech recognition, TTS, and robot actions. The bot code lives in reachy-personal-assistant/bot.
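
Conceptually, the bot is a pipeline of streaming stages: audio in, transcription, a call to the agent, speech out, and optionally a robot action. The sketch below is not Pipecat code (its real API is different); it only illustrates that flow with stubbed stages so it can run without hardware.

# Conceptual illustration of the bot's flow; NOT Pipecat's actual API.
# Each stage is a stub so the pipeline can run end to end without hardware.
import asyncio


async def transcribe(audio_chunk: str) -> str:
    return audio_chunk  # stand-in for speech recognition


async def ask_agent(text: str) -> str:
    return f"Agent answer to: {text}"  # stand-in for the NAT /v1/chat/completions call


async def speak(text: str) -> None:
    print(f"[TTS] {text}")  # stand-in for ElevenLabs audio output


async def act(text: str) -> None:
    if "look" in text.lower():
        print("[robot] turning head")  # stand-in for a Reachy Mini action


async def handle_utterance(audio_chunk: str) -> None:
    text = await transcribe(audio_chunk)
    answer = await ask_agent(text)
    await asyncio.gather(speak(answer), act(text))


asyncio.run(handle_utterance("look at the whiteboard and summarize it"))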


Step 5 – Hook Everything Up to Reachy (Hardware or Simulation)

Reachy Mini exposes a daemon. By default the repo runs the daemon in simulation (--sim). Omit the flag for a real robot.


Run the Full System

You need three terminals:

  1. Terminal 1 – Reachy daemon

    
    cd bot
    
    # macOS
    
    uv run mjpython -m reachy_mini.daemon.app.main --sim --no-localhost-only
    
    # Linux
    
    uv run -m reachy_mini.daemon.app.main --sim --no-localhost-only
    
    
  2. Terminal 2 – Bot service

    
    cd bot
    
    uv venv
    
    uv sync
    
    uv run --env-file ../.env python main.py
    
    
  3. Terminal 3 – NeMo Agent Toolkit service

    
    cd nat
    
    uv venv
    
    uv sync
    
    uv run --env-file ../.env nat serve --config_file src/ces_tutorial/config.yml --port 8001
    
    
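If juggling three terminals gets tedious, a small launcher can start all three processes from the repo root. This is a convenience sketch, not part of the repo; it assumes Linux, the directory layout above, uv on your PATH, and that uv venv / uv sync have already been run in bot/ and nat/.

# Convenience sketch: launch the daemon, the bot, and the NAT service from the repo root.
# Assumes Linux, uv on PATH, and that the bot/ and nat/ environments are already synced.
import signal
import subprocess

commands = [
    ("bot", ["uv", "run", "-m", "reachy_mini.daemon.app.main", "--sim", "--no-localhost-only"]),
    ("bot", ["uv", "run", "--env-file", "../.env", "python", "main.py"]),
    ("nat", ["uv", "run", "--env-file", "../.env", "nat", "serve",
             "--config_file", "src/ces_tutorial/config.yml", "--port", "8001"]),
]

procs = [subprocess.Popen(cmd, cwd=cwd) for cwd, cmd in commands]
try:
    for proc in procs:
        proc.wait()
except KeyboardInterrupt:
    for proc in procs:
        proc.send_signal(signal.SIGINT)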

Interacting with the System

  • Reachy Sim – appears automatically when the daemon starts (only for simulation).

  • Pipecat Playground – open http://localhost:7860/client/ in a browser, click CONNECT, grant microphone (and optionally camera) access.

When both windows show READY, the bot greets you with “Hello, how may I assist you today?” You can now talk to your personal assistant.


Example Prompts

Text‑only (fast text model)

  • “Explain what you can do in one sentence.”

  • “Summarize the last thing I said.”

Vision (VLM)

  • “What am I holding up to the camera?”

  • “Read the text on this page and summarize it.”


Where to Go Next

  • Performance optimisation: use the LLM Router example to balance cost, latency, and quality.

  • Voice‑powered RAG: follow the tutorial for building a RAG agent with guardrails using Nemotron models.

  • Hardware mastery: explore the Reachy Mini SDK and simulation docs to design advanced robotic behaviours.

  • Community apps: check out the Reachy Mini spaces built by the community.

Try it now: Deploy the full environment with one click — Launch the environment.
