FTS AI/HPC Lightning Talks

May 18, 2026

Open Compute Project

Why It Matters

These technologies collectively address the data‑center power bottleneck, enabling faster, cheaper AI workloads and safeguarding the economic viability of hyperscale compute expansion.

Key Takeaways

•Optical compute promises 100x performance per watt for AI inference.
•Lumai's Iris server can run billion‑parameter LLMs at roughly 100 TOPS/W.
•Lossless compression yields 1.3‑1.5× memory bandwidth savings without retraining.
•Real‑world benchmarks show up to 35% cost variation across data centers.
•Dual‑path power regulation can improve GPU energy efficiency by 30%.

Summary

The OCP AI/HPC Lightning Talks showcased emerging solutions aimed at breaking the power wall in modern data centers. Phillip Burr of Lumai introduced optical compute, explaining how encoding vectors as light and matrix weights as transmissive pixels enables matrix‑multiply‑accumulate operations with near‑zero energy cost, and unveiled the Iris server, which is capable of running billion‑parameter LLMs at roughly 100 TOPS per watt.
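The optical multiply‑accumulate idea described above can be illustrated numerically. The sketch below is only an idealized model, not Lumai's implementation: the function name `optical_matvec`, the restriction of weights to transmission values between 0 and 1, and the example numbers are all assumptions made for illustration. It ignores noise, signed weights, and conversion overheads that a real system would have to handle.

```python
def optical_matvec(weights, vector):
    """Idealized model of an optical matrix-vector multiply (hypothetical).

    weights: rows of transmission values in [0, 1], one "pixel" per weight.
    vector:  non-negative light intensities, one per input beam.
    """
    for row in weights:
        if len(row) != len(vector):
            raise ValueError("each weight row must match the vector length")
        if any(not 0.0 <= w <= 1.0 for w in row):
            raise ValueError("transmission values must lie in [0, 1]")
    if any(x < 0 for x in vector):
        raise ValueError("light intensities cannot be negative")

    outputs = []
    for row in weights:
        # Each pixel attenuates its beam (the multiply); a detector sums
        # the light that passes through the row (the accumulate).
        outputs.append(sum(w * x for w, x in zip(row, vector)))
    return outputs


# Illustrative values only; none of these numbers come from the talk.
weights = [[0.5, 1.0, 0.0],
           [0.25, 0.25, 0.5]]
vector = [2.0, 4.0, 8.0]
result = optical_matvec(weights, vector)
print(result)  # [5.0, 5.5]
```

In this model the multiplication is performed by attenuation and the addition by light collecting on a detector, which is why the talk describes the arithmetic itself as costing almost no energy.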

Key technical insights included the quadratic scaling of performance with vector width, so that efficiency improves as systems grow, and the ability to scale matrices up to 48 × 2048. Nilesh Shah of ZeroPoint Technologies highlighted lossless cache‑line compression that delivers a 1.3‑1.5× reduction in model and KV‑cache size without retraining, directly boosting effective memory bandwidth. Ash Chary and Andrew Nelson of Flops Index presented real‑world benchmark data revealing a 35% cost variation across 60,000 units, far higher than the 8% suggested by synthetic metrics, which points to inefficiencies in workload placement. Taner Dosluoglu of weeteq described a dual‑path voltage regulation scheme that trims the guard band kept in reserve for voltage droop, translating to a 30% improvement in tokens‑per‑watt for GPU workloads.
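The scaling claim can be checked with simple arithmetic: if performance grows with the square of vector width while power grows linearly, performance per watt grows linearly with width. The sketch below assumes exactly those two proportionalities and nothing else; the function name `relative_efficiency` and the widths are hypothetical values chosen for illustration, not figures from the talk.

```python
def relative_efficiency(width, base_width):
    """Performance per watt relative to a baseline vector width.

    Assumes performance scales with width squared and power scales
    with width, as stated in the talk summary.
    """
    if width <= 0 or base_width <= 0:
        raise ValueError("widths must be positive")
    scale = width / base_width
    relative_performance = scale ** 2   # operations grow quadratically
    relative_power = scale              # power grows linearly
    return relative_performance / relative_power


# Hypothetical widths for illustration only.
for width in (256, 512, 1024, 2048):
    print(width, relative_efficiency(width, base_width=256))
```

Under these assumptions, each doubling of vector width doubles the operations delivered per watt.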

Notable quotes reinforced the narrative: Phillip Burr noted, “Performance grows with the square of vector width while power grows linearly,” and Nilesh Shah emphasized, “Lossless compression gives 1.3‑1.5× bandwidth gains without accuracy loss.” Ash Chary warned, “Benchmarks miss up to 35% real‑world cost variation,” while Taner Dosluoglu claimed, “Our active compensation yields 30% more energy efficiency in the GPU power domain.”
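To see what a 1.3‑1.5× lossless compression ratio means in practice, the arithmetic can be sketched as follows. This is a back‑of‑the‑envelope model that assumes the workload is limited by memory bandwidth and that compression and decompression add no overhead; the 1000 GB/s link speed, the 80 GB footprint, and the function names are hypothetical placeholders, not numbers from the talk.

```python
def effective_bandwidth(raw_gb_per_s, compression_ratio):
    """Uncompressed data delivered per second over a link carrying compressed data."""
    if compression_ratio < 1.0:
        raise ValueError("a lossless ratio below 1.0 would expand the data")
    return raw_gb_per_s * compression_ratio


def compressed_footprint(size_gb, compression_ratio):
    """Memory occupied by a model or KV cache after compression."""
    if compression_ratio < 1.0:
        raise ValueError("a lossless ratio below 1.0 would expand the data")
    return size_gb / compression_ratio


RAW_BANDWIDTH_GB_S = 1000.0   # hypothetical memory bandwidth
MODEL_SIZE_GB = 80.0          # hypothetical model plus KV-cache footprint

for ratio in (1.3, 1.5):      # range quoted in the talk summary
    bandwidth = effective_bandwidth(RAW_BANDWIDTH_GB_S, ratio)
    footprint = compressed_footprint(MODEL_SIZE_GB, ratio)
    print(f"ratio {ratio}: ~{bandwidth:.0f} GB/s effective, ~{footprint:.1f} GB stored")
```

Because the compression is lossless, the decompressed values are bit‑identical to the originals, which is why no retraining or accuracy trade‑off is involved.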

The implications are significant: optical compute could redefine AI inference power envelopes, lossless compression offers a near‑term memory‑bandwidth lever, accurate benchmarking can unlock billions in operational savings, and smarter power regulation directly improves GPU energy efficiency. Together, these innovations promise to accelerate the path toward the hyperscalers’ goal of a thousand‑fold compute increase without prohibitive capital or energy expenditure.

Original Description

Presenter(s):

Phillip Burr, Head of Product, Lumai

Nilesh Shah, VP Business Development, zeropoint technologies

Ash Chary, Founder, Flops Index

Andrew Nelson, Head of Research, FLOPS

Taner Dosluoglu, CEO, weeteq

Timour Paltachev, Chief Science and Product Officer, Rivvor

Konstantin Tiutin, CTO, RIVVOR Inc.

Datacenter-Scale Optical AI Inference: Demonstrating a Deployable Optical Server for Billion-Parameter-Scale LLMs by Phillip Burr

Enhancing AI Data Center Efficiency: OCP OFP8 Lossless Compression for LLM Inference by Nilesh Shah

Market Inefficiency in GPU Infrastructure: The Case for Workload-Specific Performance Transparency by Ash Chary and Andrew Nelson

How to Increase Tokens per Watt by 30 percent by Taner Dosluoglu

Cable-Free Physical Layers for Rack-Scale AI Systems: System-Level Validation of Short-Range Wireless Interconnects by Timour Paltachev & Konstantin Tiutin
