Optimize Polar world loader with O(1) palette lookups, caching, and direct array access #20
Conversation
Co-authored-by: KrystilizeNevaDies <57762380+KrystilizeNevaDies@users.noreply.github.com>
@copilot This is still incredibly inefficient. These worlds should be loading at 3ms per chunk.
Made significant additional optimizations in commit bae11b1:

New performance:
Optimizations applied:
Remaining gap to 3ms target:
Path to 3ms:

The current Polar format spec requires string serialization, which has inherent overhead. Happy to explore format changes or other approaches if that's acceptable.
The Polar world loader was performing O(n²) palette lookups during serialization, causing chunk creation times of ~750ms. After multiple optimization passes, this PR achieves a 4.1x speedup for creation and a 2.5x speedup for loading.
Changes
WriteSection optimization

Replaced List.Contains() + List.IndexOf() with Dictionary.TryGetValue() for block palette management.

Direct blocks array access

Bypassed expensive LookupBlock() registry calls by accessing the section.Blocks array directly.

ReadSection optimization

Pre-converted block palette strings to state IDs once per section instead of parsing them for each of the 4096 blocks.
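The write-path change replaces linear palette scans with a hash-map lookup; the read path resolves each palette entry to a state ID once before decoding. The PR itself targets a .NET codebase (hence `Dictionary.TryGetValue`), but the same techniques translate directly; a minimal Java sketch, with all names hypothetical:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PaletteDemo {
    // Write path: O(1) palette index lookup via a HashMap, instead of
    // List.contains() + List.indexOf() (O(n) each, O(n^2) per section).
    static int[] buildIndices(String[] blocks, List<String> palette) {
        Map<String, Integer> paletteIndex = new HashMap<>();
        int[] indices = new int[blocks.length];
        for (int i = 0; i < blocks.length; i++) {
            Integer idx = paletteIndex.get(blocks[i]);
            if (idx == null) {
                idx = palette.size();
                palette.add(blocks[i]);
                paletteIndex.put(blocks[i], idx);
            }
            indices[i] = idx;
        }
        return indices;
    }

    // Read path: convert each palette string to a state ID once,
    // then decode all blocks with plain array indexing.
    static int[] decode(List<String> palette, int[] indices,
                        Map<String, Integer> registry) {
        int[] paletteIds = new int[palette.size()];
        for (int i = 0; i < palette.size(); i++) {
            paletteIds[i] = registry.get(palette.get(i)); // resolved once
        }
        int[] stateIds = new int[indices.length];
        for (int i = 0; i < indices.length; i++) {
            stateIds[i] = paletteIds[indices[i]];
        }
        return stateIds;
    }

    public static void main(String[] args) {
        String[] blocks = {"stone", "air", "stone", "dirt", "air"};
        List<String> palette = new ArrayList<>();
        int[] indices = buildIndices(blocks, palette);
        System.out.println(palette);                        // [stone, air, dirt]
        Map<String, Integer> registry = Map.of("stone", 1, "air", 0, "dirt", 2);
        int[] ids = decode(palette, indices, registry);
        System.out.println(java.util.Arrays.toString(ids)); // [1, 0, 1, 2, 0]
    }
}
```

Because a real section holds 4096 blocks but typically only a handful of distinct palette entries, moving both the membership test and the index lookup into one map hit removes the quadratic scan entirely.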
Block string parsing cache

Added an instance-level cache to avoid re-parsing the same block strings across multiple chunks.
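An instance-level cache of this kind is just a map from block string to parsed state ID, so each distinct string is parsed once per loader instance rather than once per chunk. A Java sketch under that assumption (class and method names hypothetical):

```java
import java.util.HashMap;
import java.util.Map;

public class BlockParseCache {
    private final Map<String, Integer> cache = new HashMap<>();
    private int parseCount = 0; // instrumentation for the demo only

    // Stand-in for the real (expensive) block-string -> state ID parse.
    private int parseBlock(String block) {
        parseCount++;
        return block.hashCode() & 0x7fffffff;
    }

    // Each distinct string is parsed at most once per loader instance.
    public int getStateId(String block) {
        return cache.computeIfAbsent(block, this::parseBlock);
    }

    public int parseCount() { return parseCount; }

    public static void main(String[] args) {
        BlockParseCache loader = new BlockParseCache();
        // Two chunks referencing the same block strings:
        String[] chunkA = {"minecraft:stone", "minecraft:dirt"};
        String[] chunkB = {"minecraft:stone", "minecraft:dirt"};
        for (String b : chunkA) loader.getStateId(b);
        for (String b : chunkB) loader.getStateId(b);
        System.out.println(loader.parseCount()); // 2, not 4
    }
}
```

Scoping the cache to the loader instance (rather than a static global) keeps it bounded by the world's actual block variety and lets it be garbage-collected with the loader.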
Additional improvements
Results
Overall improvement: 4.1x faster chunk creation, 2.5x faster chunk loading.
All existing tests pass. Format remains backward compatible.
Remaining Performance Gap
Current performance is still ~40-60x short of the 1-3ms per chunk target. Bottlenecks are:
Further improvements would require format changes (storing state IDs directly instead of strings) or parallel chunk processing.