Key Takeaways
- Nemotron 3 Super 120B model now free on Kilo
- Benchmarks: MMLU 86.01, HumanEval 79.40, SWE‑Bench 60.5
- Top open‑weight model on PinchBench with 84.7% average
- Perfect scores on scaffolding, file ops, and data summarization
- Weak on AI image generation, only 27% success
Summary
NVIDIA has launched the 120‑billion‑parameter Nemotron 3 Super, a hybrid mixture‑of‑experts model optimized for Blackwell GPUs, and made it freely available on the Kilo platform. Early benchmarks show strong results (86.01 on MMLU, 79.40 on HumanEval, and 60.5 on SWE‑Bench), positioning it ahead of many open‑weight rivals. In Kilo’s PinchBench tests the model achieved an 84.7% average score, excelling at code scaffolding, file manipulation, and data summarization. Its only notable weakness is AI‑driven image generation, where it succeeded on just 27% of tasks.
Pulse Analysis
NVIDIA’s Nemotron 3 Super marks a significant step in the evolution of large language models, combining a 120‑billion‑parameter backbone with a hybrid mixture‑of‑experts design. Optimized for the company’s Blackwell GPU architecture, the model delivers both raw scale and efficiency, a rare combination in today’s AI market. By offering the model for free through Kilo’s VS Code extension and KiloClaw, NVIDIA lowers the cost barrier for developers, encouraging experimentation and rapid integration into existing workflows. This strategy mirrors a broader industry shift toward open‑access, high‑performance models that can be fine‑tuned for specialized tasks.
Performance metrics underscore Nemotron 3 Super’s competitive edge. Independent benchmarks report an MMLU score of 86.01, HumanEval 79.40, and SWE‑Bench 60.5, outpacing many open‑weight alternatives and approaching the capabilities of proprietary giants like GPT‑4. In Kilo’s PinchBench suite, the model posted an 84.7% average, with a peak of 85.6%, demonstrating strong reasoning on code‑centric tasks. Its flawless results on project scaffolding, file manipulation, and CSV/Excel summarization highlight its suitability for developer‑focused automation, while a low 27% success rate on AI image generation marks a clear area for improvement.
Strategically, the free rollout on Kilo could accelerate adoption across startups and enterprise teams seeking cost‑effective AI agents. By positioning Nemotron 3 Super as an open, high‑performing alternative, NVIDIA challenges the dominance of established closed models and pressures competitors to enhance openness and pricing. The model’s strengths in code and data tasks make it a compelling choice for DevOps, security reviews, and data analytics pipelines, though teams requiring robust visual generation may still look elsewhere. As the ecosystem evolves, continued benchmarking and community feedback will determine whether Nemotron 3 Super can sustain its early momentum and become a staple in the next generation of agentic AI solutions.

