Experimental GPU-driven rendering by AzurIce · Pull Request #138 · AzurIce/ranim

AzurIce · 2026-02-23T06:07:14Z

Closes #139

Summary

Merge all VItem data into a single set of contiguous GPU buffers and use instanced drawing to render all VItems in one draw call, eliminating per-VItem CPU submission overhead.
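To make the single instanced draw call concrete, here is a minimal CPU-side sketch of what a vertex shader invoked by `draw(0..4, 0..N)` might do: each instance is one VItem, and the four vertices expand to the corners of that item's screen-space box. The `ClipBox` type, `quad_vertex`, `emulate_instanced_draw`, and the triangle-strip corner order are all assumptions for illustration, not code from this PR.

```rust
/// Hypothetical per-item screen-space bounding box (not the PR's actual layout).
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct ClipBox {
    pub min: [f32; 2],
    pub max: [f32; 2],
}

/// Emulates one vertex-shader invocation: `instance_index` selects the item,
/// `vertex_index` (0..4) selects a corner of that item's box.
pub fn quad_vertex(boxes: &[ClipBox], vertex_index: u32, instance_index: u32) -> [f32; 2] {
    let b = boxes[instance_index as usize];
    // Bit 0 picks min/max in x, bit 1 picks min/max in y.
    // This yields the order (min,min), (max,min), (min,max), (max,max),
    // which forms a quad when drawn as a triangle strip (assumed topology).
    let x = if vertex_index & 1 == 0 { b.min[0] } else { b.max[0] };
    let y = if vertex_index & 2 == 0 { b.min[1] } else { b.max[1] };
    [x, y]
}

/// Emulates `draw(0..4, 0..N)`: one call covers every item, four vertices each.
pub fn emulate_instanced_draw(boxes: &[ClipBox]) -> Vec<[[f32; 2]; 4]> {
    (0..boxes.len() as u32)
        .map(|instance| [0u32, 1, 2, 3].map(|v| quad_vertex(boxes, v, instance)))
        .collect()
}
```

The point of the sketch is that the CPU issues one call regardless of N; all per-item variation is resolved on the GPU through the instance index.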

Approach

Each frame, pack all VItem points, colors, and widths into contiguous buffers, alongside an ItemInfo index table.
A compute shader binary-searches item_infos to find each point's owning item, then performs 3D→2D projection and atomic clip box updates.
Render passes use draw(0..4, 0..N) instanced drawing; the vertex shader looks up per-item data via instance_index.
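The packing and owner-lookup steps above can be sketched on the CPU as follows. This is an illustration under assumptions: the `ItemInfo` fields (`point_offset`, `point_count`) and the helper names `pack_items` and `owning_item` are hypothetical, and the real lookup runs in WGSL rather than Rust.

```rust
/// Hypothetical index-table entry; the PR's real `ItemInfo` layout may differ.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct ItemInfo {
    /// Index of the item's first point in the merged point buffer.
    pub point_offset: u32,
    /// Number of points the item owns.
    pub point_count: u32,
}

/// Packs per-item point lists into one contiguous buffer plus an index table.
pub fn pack_items(items: &[Vec<[f32; 3]>]) -> (Vec<[f32; 3]>, Vec<ItemInfo>) {
    let mut points = Vec::new();
    let mut infos = Vec::with_capacity(items.len());
    for item in items {
        infos.push(ItemInfo {
            point_offset: points.len() as u32,
            point_count: item.len() as u32,
        });
        points.extend_from_slice(item);
    }
    (points, infos)
}

/// Binary-searches the (non-decreasing) offsets for the item owning `point_idx`.
/// Returns `None` when the index lies past the last item's range.
pub fn owning_item(infos: &[ItemInfo], point_idx: u32) -> Option<usize> {
    // Count of items whose first point is at or before `point_idx`.
    let n = infos.partition_point(|info| info.point_offset <= point_idx);
    let idx = n.checked_sub(1)?;
    let info = infos[idx];
    (point_idx < info.point_offset + info.point_count).then_some(idx)
}
```

Because offsets are written in packing order they are already sorted, so each point's lookup costs O(log N) in the item count without any per-point owner array.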

New files

primitives/merged_vitem.rs — MergedVItemBuffer: CPU-side data packing, bind group management
pipelines/merged_vitem.rs — compute / depth / color pipeline definitions
shaders/merged_vitem_compute.wgsl — merged compute shader (binary search + atomic clip box)
shaders/merged_vitem.wgsl — merged render shader (instanced vertex + SDF fragment)
Renderer::render_store_merged() — merged rendering entry point that bypasses the render graph
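For readers unfamiliar with the "atomic clip box" idea named above: many invocations grow a shared bounding box concurrently using atomic min/max. WGSL atomics only operate on `i32` and `u32`, so float coordinates need some integer encoding. The sketch below uses fixed-point with outward rounding, which is an assumption and not necessarily what `merged_vitem_compute.wgsl` does; `AtomicClipBox`, `SCALE`, `include`, and `bounds` are hypothetical names.

```rust
use std::sync::atomic::{AtomicI32, Ordering};

/// Hypothetical fixed-point scale (1/1024 unit per step); not taken from the PR.
const SCALE: f32 = 1024.0;

/// Axis-aligned 2D box that many threads can grow concurrently.
pub struct AtomicClipBox {
    min: [AtomicI32; 2],
    max: [AtomicI32; 2],
}

impl AtomicClipBox {
    /// Starts empty: min at `i32::MAX`, max at `i32::MIN`.
    pub fn new() -> Self {
        Self {
            min: [AtomicI32::new(i32::MAX), AtomicI32::new(i32::MAX)],
            max: [AtomicI32::new(i32::MIN), AtomicI32::new(i32::MIN)],
        }
    }

    /// Grows the box to contain `point`; safe to call from many threads.
    pub fn include(&self, point: [f32; 2]) {
        for axis in 0..2 {
            let scaled = point[axis] * SCALE;
            // Round outward so the box never under-covers the point.
            self.min[axis].fetch_min(scaled.floor() as i32, Ordering::Relaxed);
            self.max[axis].fetch_max(scaled.ceil() as i32, Ordering::Relaxed);
        }
    }

    /// Returns `(min, max)` corners, or `None` if no point was included.
    pub fn bounds(&self) -> Option<([f32; 2], [f32; 2])> {
        let load = |a: &AtomicI32| a.load(Ordering::Relaxed);
        if load(&self.min[0]) > load(&self.max[0]) {
            return None;
        }
        let to_f32 = |a: &AtomicI32| load(a) as f32 / SCALE;
        Some((
            [to_f32(&self.min[0]), to_f32(&self.min[1])],
            [to_f32(&self.max[0]), to_f32(&self.max[1])],
        ))
    }
}
```

Min and max are commutative, so the final box does not depend on the order in which threads call `include`.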

Benchmark comparison

The original rendering path is kept intact for A/B comparison.

CPU submission time (the bottleneck)

VItem count	Original (per-VItem)	Merged	Speedup
25	1.61 ms	1.64 ms	~1x
100	4.14 ms	1.70 ms	2.4x
400	25.2 ms	1.79 ms	14x
1600	92.7 ms	1.86 ms	50x
3600	220 ms	1.90 ms	116x

Total time (CPU + GPU)

VItem count	Original	Merged	Speedup
25	5.6 ms	3.8 ms	1.5x
100	8.9 ms	6.5 ms	1.4x
400	22.3 ms	5.9 ms	3.8x
1600	88.8 ms	4.6 ms	19x
3600	256 ms	5.0 ms	51x

CPU submission time on the merged path is now nearly constant, rising only from 1.64 ms to 1.90 ms as the VItem count grows from 25 to 3600. At 3600 VItems, total frame time drops from 256 ms to 5.0 ms.
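As a quick check of the arithmetic behind the Speedup column and the "constant" claim, the sketch below recomputes both from the CPU submission table. The row data is copied from the table above; the names `CPU_SUBMISSION`, `speedup`, and `growth_first_to_last` are made up for this example.

```rust
/// (VItem count, original ms, merged ms), copied from the CPU submission table.
pub const CPU_SUBMISSION: [(u32, f64, f64); 5] = [
    (25, 1.61, 1.64),
    (100, 4.14, 1.70),
    (400, 25.2, 1.79),
    (1600, 92.7, 1.86),
    (3600, 220.0, 1.90),
];

/// Speedup of the merged path: original time divided by merged time.
pub fn speedup(original_ms: f64, merged_ms: f64) -> f64 {
    original_ms / merged_ms
}

/// How much a series of timings grows from its first to its last entry.
/// Returns `None` for an empty series.
pub fn growth_first_to_last(times_ms: &[f64]) -> Option<f64> {
    Some(times_ms.last()? / times_ms.first()?)
}
```

Applied to the table, the original path's submission time grows by roughly 137x between 25 and 3600 VItems (a 144x increase in item count), while the merged path grows by roughly 1.16x.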

Visual correctness

Both paths produce identical output for the same scene (including OIT transparency and depth ordering).

fix: #140 , related: #139, #138

AzurIce added 2 commits February 23, 2026 13:22

update benchmark

27ecb5d

experimental GPU-driven rendering

11e9ba6

AzurIce mentioned this pull request Feb 23, 2026

GPU-driven rendering: merge GPU buffers to eliminate per-VItem CPU submission overhead #139

Closed

AzurIce added 2 commits February 23, 2026 17:45

rendergraph of GPU-driven pipeline

f25ae15

lints

26e0d26

AzurIce force-pushed the feat/merged-gpu-buffer branch from 30be894 to 26e0d26 Compare February 23, 2026 10:07

AzurIce marked this pull request as ready for review February 23, 2026 10:08

AzurIce changed the title from "[WIP] Experimental GPU-driven rendering" to "Experimental GPU-driven rendering" Feb 24, 2026

AzurIce merged commit 9e2b612 into main Feb 24, 2026
4 checks passed

AzurIce mentioned this pull request Feb 24, 2026

GPU-driven rendering still do per-item alloc #140

Closed

Copilot AI mentioned this pull request Feb 24, 2026

Remove old per-item rendering pipeline and consolidate into GPU-driven path #141

Closed

AzurIce mentioned this pull request Feb 24, 2026

Removed per-item pipeline #142

Merged

AzurIce added a commit that referenced this pull request Feb 24, 2026

Removed per-item pipeline (#142)

232c488


AzurIce merged 4 commits into main from feat/merged-gpu-buffer
