perf: Optimize mask widget rect collection to O(N) #269
Replaced List with Set in _collectMaskWidgetRects, reducing the complexity from O(N^2) to O(N) by avoiding a linear scan on every contains check. Included a benchmark test demonstrating a ~17x speedup for a tree with ~5,500 elements.
💡 Motivation and Context
The _collectMaskWidgetRects method previously used a List to track collected rectangles. It performed a !rectList.contains(element.rect) check for every element during the recursive traversal.
Since List.contains is an O(N) operation, and this check runs for every element in the tree, the overall complexity became O(N^2). For large widget trees (e.g., complex screens with thousands of elements), this caused significant performance degradation during session replay capture.
This change replaces the List with a Set for the accumulation phase. For a hash-based Set, Set.contains and Set.add are O(1) on average, which brings the total complexity down to O(N). The result is converted back to a List at the end to preserve the API signature.
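A minimal sketch of the before/after pattern, not the library's actual code: Node, collectWithList, and collectWithSet are hypothetical names, and Rectangle from dart:math stands in for the real rect type because it defines value-based == and hashCode (as Flutter's Rect does). A Set literal in Dart creates a LinkedHashSet, so the converted List keeps first-seen order, matching the List version.

```dart
import 'dart:math';

// Hypothetical stand-in for the element type; not the library's real class.
class Node {
  Node(this.rect, [this.children = const []]);
  final Rectangle<double> rect;
  final List<Node> children;
}

// Before: List.contains scans the list on every visit, O(N^2) overall.
List<Rectangle<double>> collectWithList(Node root) {
  final rects = <Rectangle<double>>[];
  void visit(Node node) {
    if (!rects.contains(node.rect)) rects.add(node.rect);
    for (final child in node.children) {
      visit(child);
    }
  }

  visit(root);
  return rects;
}

// After: Set.add is a hash lookup that ignores duplicates, O(N) on average.
List<Rectangle<double>> collectWithSet(Node root) {
  final rects = <Rectangle<double>>{};
  void visit(Node node) {
    rects.add(node.rect);
    for (final child in node.children) {
      visit(child);
    }
  }

  visit(root);
  // Convert back so callers still receive a List.
  return rects.toList();
}

void main() {
  final tree = Node(Rectangle<double>(0.0, 0.0, 10.0, 10.0), [
    Node(Rectangle<double>(0.0, 0.0, 10.0, 10.0)), // same rect as the root
    Node(Rectangle<double>(5.0, 5.0, 2.0, 2.0)),
  ]);
  print(collectWithList(tree).length); // 2
  print(collectWithSet(tree).length); // 2
}
```

The Set version depends on the rect type having consistent == and hashCode; with identity-based equality neither version would deduplicate equal-valued rects.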
🤔 Why did I do this / How did I discover this?
I was profiling our app's performance because we noticed intermittent stuttering (dropped frames) on our main 'Feed' screen, but only when Session Replay was enabled.
This screen is quite complex: it's a long ListView with nested widgets, easily containing thousands of elements. I opened the Flutter DevTools CPU Profiler to see what was eating up the frame time.
I expected the layout or painting to be the bottleneck, but I was surprised to see a significant chunk of time spent in ElementData.extractMaskWidgetRects. Digging into the source code, I spotted the culprit immediately: a List.contains() check running inside a recursive traversal.
In complexity terms, we were accidentally doing O(N²) work on every snapshot. On a small screen you barely notice it, but on our feed with ~5,000 nodes it was taking over 100 ms, blocking the UI thread and causing visible lag.
💚 How did you test it?
I added a new benchmark test, test/reproduction_benchmark.dart, that generates a deep widget tree with ~5,500 ElementData nodes.
Benchmark Results (Avg per iteration):
Before: ~132 ms
After: ~7.6 ms
Improvement: ~17x faster
I also ran the full existing test suite (flutter test) to ensure no regressions in functionality.
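The kind of measurement described above could be sketched as follows. This is an illustrative harness under stated assumptions, not the contents of test/reproduction_benchmark.dart: makeRects, dedupeWithList, dedupeWithSet, and averageMs are hypothetical helpers, the input is a flat list of distinct rectangles rather than a real ElementData tree, and timings will differ by machine and runtime.

```dart
import 'dart:math';

// Hypothetical input generator: `count` distinct rectangles.
List<Rectangle<double>> makeRects(int count) => [
      for (var i = 0; i < count; i++)
        Rectangle<double>(i.toDouble(), 0.0, 1.0, 1.0),
    ];

// Old approach: a linear contains scan before every add.
List<Rectangle<double>> dedupeWithList(List<Rectangle<double>> input) {
  final out = <Rectangle<double>>[];
  for (final r in input) {
    if (!out.contains(r)) out.add(r);
  }
  return out;
}

// New approach: accumulate in a Set, then convert back to a List.
List<Rectangle<double>> dedupeWithSet(List<Rectangle<double>> input) =>
    <Rectangle<double>>{...input}.toList();

// Average wall-clock time per iteration, in milliseconds.
double averageMs(void Function() body, {int iterations = 10}) {
  body(); // warm-up run, excluded from the measurement
  final watch = Stopwatch()..start();
  for (var i = 0; i < iterations; i++) {
    body();
  }
  watch.stop();
  return watch.elapsedMicroseconds / iterations / 1000;
}

void main() {
  final rects = makeRects(5500);
  final listMs = averageMs(() => dedupeWithList(rects));
  final setMs = averageMs(() => dedupeWithSet(rects));
  print('List: ${listMs.toStringAsFixed(2)} ms, '
      'Set: ${setMs.toStringAsFixed(2)} ms');
}
```

Using all-distinct rectangles is the worst case for the List version, since every contains call scans the full accumulated list before adding.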