From bfbd38412d86aab05305ad1537401f240e338531 Mon Sep 17 00:00:00 2001 From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com> Date: Tue, 30 Dec 2025 02:10:40 +0000 Subject: [PATCH 1/2] Initial plan From 2c3e39e83484157665b6506690f91b0fd5541eb1 Mon Sep 17 00:00:00 2001 From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com> Date: Tue, 30 Dec 2025 02:18:27 +0000 Subject: [PATCH 2/2] Add benchmarks section to website Co-authored-by: jaypatrick <1800595+jaypatrick@users.noreply.github.com> --- src/website/src/components/Layout.js | 3 + src/website/src/pages/benchmarks.js | 348 +++++++++++++++++++++++++++ src/website/src/pages/index.js | 6 + 3 files changed, 357 insertions(+) create mode 100644 src/website/src/pages/benchmarks.js diff --git a/src/website/src/components/Layout.js b/src/website/src/components/Layout.js index fdbf0024..6e9846e9 100644 --- a/src/website/src/components/Layout.js +++ b/src/website/src/components/Layout.js @@ -60,6 +60,9 @@ const Layout = ({ children, pageTitle }) => {
+ Measure and understand the performance characteristics of the filter + rule compilers, including parallel chunking speedups and cross-language + comparisons. +
+ ++ The repository includes comprehensive benchmarking tools to help you + understand performance across different compilers and optimize your + compilation workflows. All compilers (TypeScript, .NET, Python, Rust) + support parallel chunking for improved performance with large filter + lists. +
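The parallel-chunking approach described above can be sketched in a few lines of Python: split the rule list into chunks, compile each chunk in a worker process, then merge and de-duplicate the results. This is a minimal sketch of the idea only — the function names, the chunk size, and the toy `compile_chunk` body are illustrative assumptions, not the project's actual API.

```python
from concurrent.futures import ProcessPoolExecutor


def compile_chunk(rules):
    # Stand-in for real rule compilation: trim whitespace and drop
    # blank lines and "!" comment lines.
    cleaned = (r.strip() for r in rules)
    return [r for r in cleaned if r and not r.startswith("!")]


def compile_parallel(rules, workers=4, chunk_size=10_000):
    # Split the list into fixed-size chunks and compile them in parallel.
    chunks = [rules[i:i + chunk_size] for i in range(0, len(rules), chunk_size)]
    with ProcessPoolExecutor(max_workers=workers) as pool:
        compiled = pool.map(compile_chunk, chunks)
    # Merge chunk outputs, de-duplicating while preserving order --
    # this merge step is part of the serial overhead discussed below.
    seen, merged = set(), []
    for chunk in compiled:
        for rule in chunk:
            if rule not in seen:
                seen.add(rule)
                merged.append(rule)
    return merged
```

The merge/de-duplication pass is inherently sequential, which is one reason speedup does not scale linearly with worker count.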
+
+ File: benchmarks/quick_benchmark.py
+
+ A fast simulation that shows the parallel-chunking speedups you
+ can expect, without requiring a full compilation setup.
+
+ Files: benchmarks/run_benchmarks.py,{" "}
+ generate_synthetic_data.py
+
+ Complete benchmarking across all compilers with real compilation. + Compares sequential vs chunked/parallel performance using + synthetic test data. +
++ Run a quick simulation to see expected speedups on your system: +
+ cd benchmarks
+
+ # Run comparison suite (recommended) +
+ python quick_benchmark.py --suite +
+
+ # Run parallel scaling test +
+ python quick_benchmark.py --scaling +
+
+ # Custom benchmark +
+ python quick_benchmark.py --rules 500000 --parallel 8 +
+
+ # Interactive mode +
+ python quick_benchmark.py --interactive +
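Under the hood, a quick benchmark like this only needs to time the same workload twice — once sequentially, once across N workers — and report the ratio. A minimal sketch of such a harness, where the CPU-bound `workload` function is a stand-in rather than the real compiler:

```python
import time
from concurrent.futures import ProcessPoolExecutor


def workload(n):
    # CPU-bound stand-in for compiling a chunk of n rules.
    return sum(i * i for i in range(n))


def measure(chunks, workers):
    """Return wall-clock seconds to process all chunks with `workers` processes."""
    start = time.perf_counter()
    if workers == 1:
        for c in chunks:
            workload(c)
    else:
        with ProcessPoolExecutor(max_workers=workers) as pool:
            list(pool.map(workload, chunks))
    return time.perf_counter() - start


if __name__ == "__main__":
    chunks = [200_000] * 8
    seq = measure(chunks, workers=1)
    par = measure(chunks, workers=4)
    print(f"speedup: {seq / par:.2f}x, efficiency: {seq / par / 4:.0%}")
```

Note that `time.perf_counter` is used rather than `time.time` because it is monotonic and has higher resolution, which matters for short runs.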
+ Generate synthetic test data and run actual compilation benchmarks: +
+ cd benchmarks
+
+ # Generate test data (small, medium, large, xlarge filter lists) +
+ python generate_synthetic_data.py --all +
+
+ # Run benchmarks across all compilers +
+ python run_benchmarks.py +
+
+ # Run specific compiler only +
+ python run_benchmarks.py --compiler python --iterations 5 +
+
+ # Run specific size only +
+ python run_benchmarks.py --size large +
+ Performance varies by hardware, I/O speed, and network latency, but + here are typical results from synthetic benchmarks: +
+| Rule Count | Sequential | 4 Workers | 8 Workers | Speedup (8w) |
+|---|---|---|---|---|
+| 10,000 | ~150ms | ~60ms | ~40ms | 3.75x |
+| 50,000 | ~600ms | ~200ms | ~120ms | 5.0x |
+| 200,000 | ~2.5s | ~800ms | ~400ms | 6.25x |
+| 500,000 | ~6s | ~1.8s | ~900ms | 6.67x |
+ Speedup scales with CPU cores but with diminishing returns due to + overhead, merge time, and I/O contention: +
+| Workers | Theoretical Max | Typical Efficiency |
+|---|---|---|
+| 2 | 2.0x | 90-100% |
+| 4 | 4.0x | 85-95% |
+| 8 | 8.0x | 75-90% |
+| 16 | 16.0x | 60-80% |
+ Efficiency decreases due to process startup overhead, merge/deduplication + time, memory bandwidth limits, and I/O contention. +
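This falloff is what Amdahl's law predicts when a fraction of the work (startup, merge/deduplication, I/O) stays serial. A rough model — the 5% serial fraction here is an illustrative assumption, not a measured value — lands in the same 60-100% efficiency range as the table above:

```python
def amdahl_speedup(workers, serial_fraction=0.05):
    # Amdahl's law: the serial fraction caps achievable speedup no
    # matter how many workers are added.
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / workers)


for w in (2, 4, 8, 16):
    s = amdahl_speedup(w)
    print(f"{w:>2} workers: {s:.2f}x speedup, {s / w:.0%} efficiency")
```

With a 5% serial fraction, even infinitely many workers could never exceed a 20x speedup, which is why adding workers beyond the point of diminishing returns mostly adds overhead.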
++ Parallel chunking provides the most benefit for large filter lists + with multiple sources: +
+ ++ Here's what you might see from the quick benchmark suite: +
+
+{`CHUNKING PERFORMANCE COMPARISON SUITE
+======================================================================
+CPU cores available: 8
+Max parallel workers: 8
+
+Size Sequential Parallel Speedup Efficiency
+----------------------------------------------------------------------
+10K rules 150 ms 70 ms 2.14x 27%
+50K rules 570 ms 130 ms 4.38x 55%
+200K rules 2,350 ms 350 ms 6.71x 84%
+500K rules 5,400 ms 800 ms 6.75x 84%
+----------------------------------------------------------------------
+
+Average speedup: 5.00x
+Maximum speedup: 6.75x`}
+
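The summary lines at the bottom are just ratios of the per-size timings; recomputing them shows the sample output is internally consistent:

```python
# Per-size (sequential_ms, parallel_ms) timings copied from the sample
# output above.
timings = {
    "10K": (150, 70),
    "50K": (570, 130),
    "200K": (2350, 350),
    "500K": (5400, 800),
}
speedups = [seq / par for seq, par in timings.values()]
print(f"Average speedup: {sum(speedups) / len(speedups):.2f}x")  # 5.00x
print(f"Maximum speedup: {max(speedups):.2f}x")                  # 6.75x
```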
+ + Complete documentation on parallel chunking including + configuration, API reference, and best practices. +
++ Compare the different compiler implementations and choose the + best one for your needs. +
++ Explore the benchmark scripts on GitHub to understand the + implementation details. +
++ Run benchmarks on your actual hardware to get accurate performance + data for your specific use case. Results vary based on CPU cores, + memory, I/O speed, and network latency. +
+Measure and optimize compilation performance with benchmarking tools.
+