Experimental GPU-driven rendering #138

Merged: AzurIce merged 4 commits into main from feat/merged-gpu-buffer on Feb 24, 2026

AzurIce (owner) commented on Feb 23, 2026

Closes #139

Summary

Merge all VItem data into a single set of contiguous GPU buffers and use instanced drawing to render all VItems in one draw call, eliminating per-VItem CPU submission overhead.

Approach

  • Each frame, pack all VItem points, colors, and widths into contiguous buffers with an ItemInfo index table
  • Compute shader binary-searches item_infos to find each point's owning item, performs 3D→2D projection + atomic clip box updates
  • Render passes use draw(0..4, 0..N) instanced drawing; vertex shader looks up per-item data via instance_index
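The packing and lookup steps above can be pictured on the CPU. The following is a minimal sketch, not the code from this PR: the `ItemInfo` fields (`point_offset`, `point_count`) and the helper names `pack_points` and `owning_item` are assumptions made for illustration, and the real lookup runs in a WGSL compute shader rather than in Rust.

```rust
/// Hypothetical per-item record: where one item's points live in the merged buffer.
/// The field names are assumptions, not taken from the PR.
#[derive(Debug, Clone, Copy, PartialEq)]
struct ItemInfo {
    point_offset: u32,
    point_count: u32,
}

/// Pack per-item point lists into one contiguous buffer plus an index table.
fn pack_points(items: &[Vec<[f32; 3]>]) -> (Vec<[f32; 3]>, Vec<ItemInfo>) {
    let mut points = Vec::new();
    let mut infos = Vec::with_capacity(items.len());
    for item in items {
        infos.push(ItemInfo {
            point_offset: points.len() as u32,
            point_count: item.len() as u32,
        });
        points.extend_from_slice(item);
    }
    (points, infos)
}

/// Binary search for the item that owns a global point index.
/// Offsets are ascending, so the owner is the last item whose offset is <= the index.
fn owning_item(infos: &[ItemInfo], point_index: u32) -> Option<usize> {
    // Number of items whose offset is <= point_index.
    let after = infos.partition_point(|info| info.point_offset <= point_index);
    let candidate = after.checked_sub(1)?;
    let info = infos[candidate];
    // Reject indices that fall past the end of the merged buffer.
    (point_index < info.point_offset + info.point_count).then_some(candidate)
}
```

With this layout, a thread handling global point `i` needs only the index table to recover its item, so no per-point item id has to be uploaded. Empty items are skipped naturally because a later item with the same offset wins the search.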

New files

  • primitives/merged_vitem.rs: MergedVItemBuffer (CPU-side data packing, bind group management)
  • pipelines/merged_vitem.rs: compute / depth / color pipeline definitions
  • shaders/merged_vitem_compute.wgsl: merged compute shader (binary search + atomic clip box)
  • shaders/merged_vitem.wgsl: merged render shader (instanced vertex + SDF fragment)
  • Renderer::render_store_merged(): merged rendering entry point that bypasses the render graph
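The "atomic clip box" handled by the compute shader can be illustrated with a CPU analogue. This is a sketch under stated assumptions, not the shader from this PR: the `ClipBox` type, its integer pixel encoding, and `clip_box_of` are hypothetical. WGSL atomics are limited to `i32` and `u32`, so some integer encoding of the projected coordinates is plausible, but the PR does not say which one is used.

```rust
use std::sync::atomic::{AtomicI32, Ordering};
use std::thread;

/// Hypothetical per-item clip box in integer pixel coordinates.
struct ClipBox {
    min_x: AtomicI32,
    min_y: AtomicI32,
    max_x: AtomicI32,
    max_y: AtomicI32,
}

impl ClipBox {
    /// Start "inside out" so the first included point sets every bound.
    fn empty() -> Self {
        ClipBox {
            min_x: AtomicI32::new(i32::MAX),
            min_y: AtomicI32::new(i32::MAX),
            max_x: AtomicI32::new(i32::MIN),
            max_y: AtomicI32::new(i32::MIN),
        }
    }

    /// Grow the box to contain (x, y); safe to call from many threads at once.
    fn include(&self, x: i32, y: i32) {
        self.min_x.fetch_min(x, Ordering::Relaxed);
        self.min_y.fetch_min(y, Ordering::Relaxed);
        self.max_x.fetch_max(x, Ordering::Relaxed);
        self.max_y.fetch_max(y, Ordering::Relaxed);
    }

    /// Returns [min_x, min_y, max_x, max_y], or None if no point was included.
    fn bounds(&self) -> Option<[i32; 4]> {
        let b = [
            self.min_x.load(Ordering::Relaxed),
            self.min_y.load(Ordering::Relaxed),
            self.max_x.load(Ordering::Relaxed),
            self.max_y.load(Ordering::Relaxed),
        ];
        (b[0] <= b[2]).then_some(b)
    }
}

/// Split the projected points across worker threads, mimicking GPU invocations
/// that all update the same item's clip box.
fn clip_box_of(points: &[(i32, i32)], workers: usize) -> ClipBox {
    let clip = ClipBox::empty();
    let workers = workers.max(1);
    let chunk = ((points.len() + workers - 1) / workers).max(1);
    thread::scope(|s| {
        for part in points.chunks(chunk) {
            let clip = &clip;
            s.spawn(move || {
                for &(x, y) in part {
                    clip.include(x, y);
                }
            });
        }
    });
    clip
}
```

Because min and max are commutative and associative, the final bounds do not depend on the order in which threads run, which is what makes a lock-free update suitable here.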

Benchmark comparison

The original rendering path is kept intact for A/B comparison.

CPU submission time (the bottleneck)

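| VItem count | Original (per-VItem) | Merged  | Speedup |
|-------------|----------------------|---------|---------|
| 25          | 1.61 ms              | 1.64 ms | ~1x     |
| 100         | 4.14 ms              | 1.70 ms | 2.4x    |
| 400         | 25.2 ms              | 1.79 ms | 14x     |
| 1600        | 92.7 ms              | 1.86 ms | 50x     |
| 3600        | 220 ms               | 1.90 ms | 116x    |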

Total time (CPU + GPU)

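| VItem count | Original | Merged | Speedup |
|-------------|----------|--------|---------|
| 25          | 5.6 ms   | 3.8 ms | 1.5x    |
| 100         | 8.9 ms   | 6.5 ms | 1.4x    |
| 400         | 22.3 ms  | 5.9 ms | 3.8x    |
| 1600        | 88.8 ms  | 4.6 ms | 19x     |
| 3600        | 256 ms   | 5.0 ms | 51x     |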

CPU submission time on the merged path is now nearly constant: it rises only from 1.64 ms to 1.90 ms as the VItem count grows from 25 to 3600. At 3600 VItems, total frame time drops from 256 ms to 5.0 ms.
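The speedup column is the original time divided by the merged time. A small sketch of that arithmetic follows; the constant and function names are hypothetical, and the numbers are copied from the CPU submission table.

```rust
/// (VItem count, original ms, merged ms), copied from the CPU submission table.
const CPU_SUBMISSION_MS: [(u32, f64, f64); 5] = [
    (25, 1.61, 1.64),
    (100, 4.14, 1.70),
    (400, 25.2, 1.79),
    (1600, 92.7, 1.86),
    (3600, 220.0, 1.90),
];

/// Speedup per row: original time divided by merged time.
fn speedups(rows: &[(u32, f64, f64)]) -> Vec<(u32, f64)> {
    rows.iter()
        .map(|&(count, original, merged)| (count, original / merged))
        .collect()
}
```

Read the other way, the original path at 3600 VItems spends roughly 220 ms / 3600 ≈ 0.06 ms of CPU submission per VItem, which is the per-item cost the merged path removes.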

Visual correctness

Both paths produce identical output for the same scene (including OIT transparency and depth ordering).

AzurIce force-pushed the feat/merged-gpu-buffer branch from 30be894 to 26e0d26 on February 23, 2026 10:07
AzurIce marked this pull request as ready for review on February 23, 2026 10:08
AzurIce changed the title from "[WIP] Experimental GPU-driven rendering" to "Experimental GPU-driven rendering" on Feb 24, 2026
AzurIce merged commit 9e2b612 into main on Feb 24, 2026
4 checks passed
AzurIce added a commit that referenced this pull request on Feb 24, 2026

Linked issue: GPU-driven rendering: merge GPU buffers to eliminate per-VItem CPU submission overhead