NSDI '26 - Geminet: Learning the Duality-Based Topology-Agnostic Update Operator for Lightweight...

•June 4, 2026

USENIX

Why It Matters

Because modern datacenters constantly reconfigure links and capacities, Geminet's fast, topology-agnostic traffic engineering (TE) lets operators keep congestion low without costly model retraining, which translates into higher network utilization and lower infrastructure spend.

Key Takeaways

•Geminet learns a topology-agnostic update operator for real-time traffic engineering.
•Moves state from path‑level to edge‑level dual variables.
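•Achieves a comparable reduction in maximum link utilization (MLU) while using less than 5% of the GPU memory.
•Converges up to 18× faster than HARP, the state-of-the-art ML-based baseline, when trained on large topologies.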
•Scales to large datacenter topologies without retraining after changes.

Summary

The NSDI '26 presentation introduced Geminet, a learning-based traffic-engineering (TE) framework that uses a duality-driven, topology-agnostic update operator to compute lightweight split-ratio solutions in rapidly changing network topologies.

The authors argue that a good TE algorithm must simultaneously deliver high solution quality, scale to datacenter-size graphs, and keep working after topology changes without retraining. Traditional linear-programming solvers reach the optimal maximum link utilization (MLU) but are too slow, while earlier neural approaches either lock onto a single topology or incur heavy graph-encoding overhead.
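For context, the optimization that LP solvers handle here is usually written as a path-based linear program. The formulation below is the standard textbook form, not notation taken from the talk; the symbols are assumptions: x_{d,p} is the fraction of demand d sent on candidate path p, D_d is the demand volume, c_e is the capacity of edge e, and U is the maximum link utilization being minimized.

```latex
\begin{aligned}
\min_{x,\,U}\quad & U \\
\text{s.t.}\quad & \sum_{p \in P_d} x_{d,p} = 1 && \forall d \in \mathcal{D} \\
& \sum_{d \in \mathcal{D}} \; \sum_{p \in P_d:\, e \in p} D_d\, x_{d,p} \le U\, c_e && \forall e \in E \\
& x_{d,p} \ge 0 && \forall d \in \mathcal{D},\ p \in P_d
\end{aligned}
```

The program has one variable per (demand, path) pair, which is why solve time and memory grow quickly with network size.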

Geminet addresses these gaps with two innovations: (1) a topology-agnostic edge-level update operator that replaces learned graph encoders, and (2) a shift from path-level primal variables to edge-level dual variables, dramatically shrinking the state space. Experiments show Geminet uses only 4% of GPU memory, reaches the target MLU up to 3.6× faster, and on the largest KDL topology is 18× faster than the best prior GNN model while using less than 0.1% of its parameters.
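To get a feel for how much the state shrinks when moving from paths to edges, here is a back-of-the-envelope count. The topology size and the number of candidate paths per node pair are made-up illustrative values, not figures from the talk, and `te_state_sizes` is a hypothetical helper.

```python
def te_state_sizes(num_nodes, num_directed_edges, paths_per_pair):
    """Return (path_level, edge_level) counts of optimization variables."""
    # Every ordered source-destination pair is a potential demand.
    num_pairs = num_nodes * (num_nodes - 1)
    # Path-level state: one split ratio per candidate path of each pair.
    path_level = num_pairs * paths_per_pair
    # Edge-level state: one dual variable (price) per directed edge.
    edge_level = num_directed_edges
    return path_level, edge_level


# Hypothetical sparse topology: 500 nodes, 1,200 directed edges, 4 paths per pair.
path_vars, edge_vars = te_state_sizes(500, 1200, 4)
print(path_vars, edge_vars, round(path_vars / edge_vars))  # prints 998000 1200 832
```

In sparse graphs the edge count grows roughly linearly with the node count while the number of node pairs grows quadratically, so the gap widens as the network gets larger.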

By delivering near-optimal load balancing with orders-of-magnitude lower compute and memory footprints, Geminet makes real-time, adaptive traffic engineering practical for large, reconfigurable datacenter fabrics, potentially lowering operational costs and improving service reliability.

Original Description

Geminet: Learning the Duality-based Topology-Agnostic Update Operator for Lightweight Traffic Engineering in Changing Topologies

Ximeng Liu, Shanghai Jiao Tong University and Zhongguancun Academy; Zhuoran Liu, Shanghai Jiao Tong University; Yingming Mao, Xi'an Jiaotong University and Shanghai Innovation Institute; Yatao Li, Zhongguancun Academy and Zhongguancun Institute of Artificial Intelligence; Shizhen Zhao and Xinbing Wang, Shanghai Jiao Tong University

Recently, researchers have explored ML-based Traffic Engineering (TE), leveraging neural networks to solve TE problems traditionally addressed by optimization. However, existing ML-based TE schemes remain impractical: they either fail to handle topology changes or suffer from poor scalability due to excessive computational and memory overhead. To overcome these limitations, we propose Geminet, a lightweight and scalable ML-based TE framework that can handle changing topologies. Geminet is built upon two key insights: (i) decoupling neural networks from topology by learning a topology-agnostic update operator inspired by classical iterative optimization methods (e.g., gradient descent), which depend only on a few gradient-related quantities; (ii) shifting optimization from path-level routing weights to edge-level dual variables, reducing memory consumption by leveraging the fact that edges are far fewer than paths. Evaluations on WAN and data center datasets show that Geminet significantly improves scalability. Its neural network size is only 0.04%-7% of existing schemes, while handling topology variations as effectively as HARP, a state-of-the-art ML-based TE approach, without performance degradation. When trained on large-scale topologies, Geminet consumes less than 10 GiB of memory compared to more than 80 GiB required by HARP, while achieving 18× faster convergence, demonstrating its potential for large-scale deployment.
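The two insights in the abstract can be illustrated with a toy dual iteration. This is a minimal sketch and not Geminet's method: the learned operator is replaced by a hand-written multiplicative price update, and split ratios are recovered from edge prices with a softmin. All names (`solve`, `update_price`, `tau`, `eta`) and the two-link example are assumptions made for illustration. What it shows is that the state is one price per edge, and that the per-edge update sees only that edge's own quantities, so the same function can be applied to any topology.

```python
import math


def splits_from_prices(paths, prices, tau):
    """Softmin over path prices: cheaper paths receive a larger share."""
    costs = [sum(prices[e] for e in path) for path in paths]
    base = min(costs)  # subtract the minimum for numerical stability
    weights = [math.exp(-(c - base) / tau) for c in costs]
    total = sum(weights)
    return [w / total for w in weights]


def utilizations(demands, capacities, prices, tau):
    """Route every demand by its softmin splits and return load / capacity per edge."""
    load = [0.0] * len(capacities)
    for volume, paths in demands:
        shares = splits_from_prices(paths, prices, tau)
        for share, path in zip(shares, paths):
            for e in path:
                load[e] += volume * share
    return [l / c for l, c in zip(load, capacities)]


def update_price(price, utilization, eta):
    """Hand-written stand-in for a learned operator: uses only this edge's own values."""
    return price * math.exp(eta * utilization)


def solve(demands, capacities, steps=200, tau=0.01, eta=0.5):
    """Iterate on edge prices; return (prices, max link utilization)."""
    # Start from uniform prices with capacity-weighted sum equal to 1.
    prices = [1.0 / sum(capacities)] * len(capacities)
    for _ in range(steps):
        util = utilizations(demands, capacities, prices, tau)
        prices = [update_price(p, u, eta) for p, u in zip(prices, util)]
        # Renormalize so that sum(c_e * price_e) stays 1, as in the LP dual.
        scale = sum(c * p for c, p in zip(capacities, prices))
        prices = [p / scale for p in prices]
    return prices, max(utilizations(demands, capacities, prices, tau))


# Toy network: one demand of 10 units with two single-edge paths,
# over edge 0 (capacity 10) and edge 1 (capacity 30).
capacities = [10.0, 30.0]
demands = [(10.0, [[0], [1]])]
prices, mlu = solve(demands, capacities)
print(prices, mlu)
```

On this toy instance the best possible split sends a quarter of the traffic over the narrow edge, giving a maximum utilization of 0.25, and the iteration settles close to that value with the narrow edge priced higher. The step sizes were picked by hand for this example; a general instance would need tuning, which is the kind of work a learned update operator is meant to take over.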

View the full NSDI '26 program at https://www.usenix.org/conference/nsdi26/technical-sessions
