Accelerating Computational Lithography Using Massively Parallel GPU Rasterizer

SemiWiki, Mar 18, 2026

Key Takeaways

  • GPU rasterizer achieves up to 290× speedup on Manhattan layouts
  • Fractional pixel coverage computed with floating‑point precision
  • Error remains below 1% versus CPU reference
  • Atomic operations preserve connectivity across overlapping polygons
  • Implementation targets the NVIDIA H100, using CUDA and the GPU's high memory bandwidth

Summary

Siemens EDA has unveiled a GPU‑accelerated rasterization algorithm for computational lithography workflows. By decomposing layouts into tiles and processing them on NVIDIA H100 GPUs, the method attains speedups of up to 290× for Manhattan geometries and 45× for curvilinear designs. The approach maintains sub‑nanometer accuracy, keeping absolute error below 1% compared with traditional CPU rasterizers. Faster rasterization shortens OPC and mask synthesis cycles, which can improve yield and shorten time‑to‑market.

Pulse Analysis

Computational lithography has become a cornerstone of advanced chip design, yet its simulation pipelines are hampered by rasterization—a step that converts vector layouts into ultra‑high‑resolution pixel grids. Traditional CPU‑based rasterizers struggle with the exploding polygon counts and the need for fractional pixel coverage, leading to long runtimes and potential yield‑impacting errors. As nodes shrink below a few nanometers, the demand for nanometer‑scale precision and massive throughput has intensified, prompting researchers to explore parallel hardware solutions.

The Siemens EDA whitepaper introduces a GPU‑first rasterization pipeline that treats the problem as a massively parallel task. Layouts are pre‑processed on the CPU, tiled, and then streamed to NVIDIA H100 GPUs, where each tile is handled by dedicated thread blocks. By leveraging shared memory, coalesced memory accesses, and atomic accumulation, the algorithm computes the fractional coverage of each pixel in floating point while preserving sub‑pixel connectivity. Benchmarks show up to 290× speedup on Manhattan‑heavy designs and 45× on complex curvilinear patterns, all while keeping absolute error under 1% relative to the CPU reference, a level of fidelity that satisfies the strict tolerances of modern lithography.
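To make the idea of tiled fractional-coverage rasterization concrete, here is a minimal CPU sketch in plain Python. It is not the Siemens EDA implementation: it handles only axis-aligned (Manhattan) rectangles, runs the tiles sequentially, and uses an ordinary `+=` where a GPU kernel would use an atomic add. The names `Rect`, `overlap`, `rasterize_tile`, and `rasterize` are hypothetical, and the final clamp to 1.0 is an assumption about how overlapping shapes might be handled.

```python
import math
from typing import List, Tuple

# Hypothetical rectangle type: (x0, y0, x1, y1) in pixel units, x0 < x1, y0 < y1.
Rect = Tuple[float, float, float, float]


def overlap(a0: float, a1: float, b0: float, b1: float) -> float:
    """Length of the overlap between intervals [a0, a1] and [b0, b1]."""
    return max(0.0, min(a1, b1) - max(a0, b0))


def rasterize_tile(rects: List[Rect], tx: int, ty: int, tile: int,
                   grid: List[List[float]]) -> None:
    """Accumulate the coverage of every rectangle into the pixels of one tile.

    On a GPU, each tile would map to a thread block; here it is a plain loop.
    """
    height, width = len(grid), len(grid[0])
    x_end = min(tx + tile, width)
    y_end = min(ty + tile, height)
    for x0, y0, x1, y1 in rects:
        # Skip rectangles that do not touch this tile.
        if overlap(x0, x1, tx, x_end) == 0.0 or overlap(y0, y1, ty, y_end) == 0.0:
            continue
        # Visit only the pixels of this tile that the rectangle can reach.
        for py in range(max(ty, math.floor(y0)), min(y_end, math.ceil(y1))):
            for px in range(max(tx, math.floor(x0)), min(x_end, math.ceil(x1))):
                # Covered fraction of a unit pixel = overlap width * overlap height.
                # A GPU kernel would use an atomic add for this accumulation.
                grid[py][px] += overlap(x0, x1, px, px + 1) * overlap(y0, y1, py, py + 1)


def rasterize(rects: List[Rect], width: int, height: int,
              tile: int = 4) -> List[List[float]]:
    """Return a height x width grid of per-pixel coverage values in [0, 1]."""
    grid = [[0.0] * width for _ in range(height)]
    for ty in range(0, height, tile):
        for tx in range(0, width, tile):
            rasterize_tile(rects, tx, ty, tile, grid)
    # Assumption: overlapping rectangles are clamped to full coverage.
    return [[min(1.0, v) for v in row] for row in grid]
```

For example, `rasterize([(0.5, 0.5, 2.0, 1.5)], width=4, height=3, tile=2)` assigns a coverage of 0.25 to pixel (0, 0) and 0.5 to pixel (1, 0), and the coverage values sum to the rectangle's area of 1.5. Because every pixel belongs to exactly one tile, the result does not depend on the tile size.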

For the semiconductor ecosystem, these performance gains translate into tangible business value. Shorter rasterization times enable more OPC iterations within a given design window, improving mask correction quality and ultimately boosting wafer yield. The GPU‑centric approach also aligns with data‑center trends, allowing EDA vendors to integrate heterogeneous CPU‑GPU workflows without sacrificing accuracy. As GPU architectures continue to evolve, the scalability of this solution positions it as a future‑proof component of the lithography stack, accelerating time‑to‑market for next‑generation chips.
