
FuriosaAI Partners with Broadcom on Third AI Inference Platform
Why It Matters
By pairing FuriosaAI’s inference silicon with Broadcom’s networking and packaging technology, the partnership could improve data‑center AI inference efficiency and give hyperscale operators a more power‑efficient alternative to GPU‑centric clusters.
Key Takeaways
- Furiosa's third‑gen chip uses a 2nm compute die with HBM4
- Broadcom provides Ethernet and PCIe connectivity plus packaging for chiplet integration
- RNGD, a 5nm accelerator already in mass production, validates the architecture
- The platform targets higher token density and performance‑per‑watt than GPUs
- Sampling begins in H1 2028, aimed at next‑decade data‑center deployments
Pulse Analysis
The AI inference market is reaching a tipping point as large language models demand ever‑greater token throughput while data‑center power budgets tighten. FuriosaAI’s Tensor Contraction Processor (TCP) architecture, proven by its RNGD accelerator, offers a compelling alternative to traditional GPUs by focusing on data reuse and low‑latency compute. RNGD’s success in production at Samsung SDS and LG AI Research demonstrates that Furiosa can deliver high‑performance, air‑cooled solutions at 180 W, setting a solid foundation for the next generation of inference silicon.
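For readers unfamiliar with the term, a tensor contraction multiplies two tensors and sums over a shared index; matrix multiplication is the simplest case. The snippet below is a minimal illustrative sketch in plain Python, not a depiction of FuriosaAI's hardware, SDK, or scheduling. The function names `contract` and `reuse_counts` are hypothetical and exist only for this example. It also counts how often each input element is read, which is the kind of data reuse the paragraph above refers to.

```python
def contract(a, b):
    """Contract a[i][k] with b[k][j] over the shared index k (a matrix multiply)."""
    rows, inner, cols = len(a), len(b), len(b[0])
    if any(len(row) != inner for row in a):
        raise ValueError("inner dimensions do not match")
    return [
        [sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for i in range(rows)
    ]


def reuse_counts(rows, inner, cols):
    """Count reads per input element for a rows x inner by inner x cols contraction.

    Each element of `a` contributes to `cols` outputs and each element of `b`
    to `rows` outputs, so an element kept close to the compute units can be
    reused that many times instead of being fetched again from memory.
    """
    return {
        "a_reads_per_element": cols,
        "b_reads_per_element": rows,
        "multiply_adds": rows * inner * cols,
    }


if __name__ == "__main__":
    print(contract([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
    print(reuse_counts(rows=2, inner=2, cols=2))
```

The larger the contraction, the more times each loaded value can be reused, which is why architectures that emphasize data reuse aim to reduce memory traffic per operation.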
The partnership with Broadcom builds on that foundation by adding networking and packaging capabilities. A 2nm compute die paired with HBM4/4E memory will be assembled into a multi‑die chiplet package, using Broadcom’s Ethernet and PCIe technologies to enable high‑bandwidth, rack‑scale communication. The integration targets the data‑movement bottleneck across thousands of nodes, so that the platform can sustain the token‑intensive workloads of frontier AI models while aiming for higher performance‑per‑watt than GPUs.
For enterprises, the combined hardware‑software stack promises faster model deployment and easier adaptation to new AI advances. Furiosa’s SDK automatically compiles PyTorch code for the chip, while its Virtual ISA offers granular control without the complexity of GPU programming. With sampling scheduled for the first half of 2028, the platform is positioned to capture a growing segment of AI data‑center spend, challenging GPU incumbents and potentially reshaping the economics of large‑scale inference deployments.