Closed
Changes from all commits
62 commits
c3e5f38
in-process-benching: Add Clojure fibonacci POC
PEZ Jan 14, 2025
5ee3f83
in-process-benching: Clojure: Extract benchmark tool
PEZ Jan 14, 2025
b0c280c
in-process-benching: Clojure: Fix mean time calc
PEZ Jan 14, 2025
9a80ecb
in-process-benching: WIP: add run and compile scripts
PEZ Jan 14, 2025
6b38539
in-process-benching: make run script append to a result file in /tmp
PEZ Jan 14, 2025
7ab1559
in-process-benching: Add Clojure Native
PEZ Jan 14, 2025
709632a
in-process-benching: Update Clojure Native compile-benchmark.sh to us…
PEZ Jan 15, 2025
a558c2f
in-process-benching: Update run-benchmark.sh to output more meta
PEZ Jan 15, 2025
4d4e254
in-process-benching: Update clojure-fibonacci to match output from ru…
PEZ Jan 15, 2025
f39b7fa
in-process-benching: Add check-output.sh for loops
PEZ Jan 15, 2025
5bf8c50
in-process-benching: Add Clojure and Clojure Native loops
PEZ Jan 15, 2025
7a54d6c
in-process-benching: Clean up run-benchmark.sh some
PEZ Jan 16, 2025
b62100e
in-process-benching: Add is_checked to result printout
PEZ Jan 16, 2025
ddab3be
in-process-benching: Add only_langs option
PEZ Jan 16, 2025
ca67fa6
in-process-benching: Update only_langs file name slug
PEZ Jan 16, 2025
e6c980b
in-process-benching: Make Clojure benchmark runner use sum of executi…
PEZ Jan 16, 2025
a2b6071
in-process-benching: Make Clojure benchmark runner return more statis…
PEZ Jan 16, 2025
c737013
in-process-benching: Refactor Clojure benchmark runner some
PEZ Jan 16, 2025
032ced7
in-process-benching: Update implementations for new stats output
PEZ Jan 16, 2025
8fab7b9
in-process-benching: Rename compile script
PEZ Jan 16, 2025
9c49a19
in-process-benching: Clojure, move formatting to benchmark util
PEZ Jan 16, 2025
19d5265
in-process-benching: Clojure add Levenshtein
PEZ Jan 16, 2025
cf90ee7
in-process-benching: Clojure add hello-world, introduce hyperfine-ben…
PEZ Jan 16, 2025
890616d
in-process-benching: Change to comma for csv
PEZ Jan 16, 2025
46a8667
in-process-benching: update gitignore
PEZ Jan 16, 2025
271c7cf
in-process-benching: Reorder the benchmark/run args
PEZ Jan 16, 2025
34434e5
in-process-benching: Update README
PEZ Jan 17, 2025
d15e53c
in-process-benching: Add Java Benchmark utility
PEZ Jan 17, 2025
ff3aaee
in-process-benching: Add Java Fibonacci
PEZ Jan 17, 2025
761826b
in-process-benching: Add compile and run for Java Native Image
PEZ Jan 18, 2025
91a8e6d
in-process-benching: Update clean.sh
PEZ Jan 18, 2025
e8f020f
in-process-benching: Add Java loops
PEZ Jan 18, 2025
76064d8
in-process-benching: Add Java levenshtein
PEZ Jan 18, 2025
f5e6771
in-process-benching: Add Java hello-world
PEZ Jan 18, 2025
699e59d
in-process-benching: Add some documentation to the Java Benchmark uti…
PEZ Jan 18, 2025
235ab7d
in-process-benching: Update README with Java reference
PEZ Jan 18, 2025
720f63e
in-process-benching: Babashka: Add loops
PEZ Jan 18, 2025
3c6682e
in-process-benching: Babashka: Add fibonacci
PEZ Jan 18, 2025
daf08ee
in-process-benching: Update README with Babashka results included
PEZ Jan 18, 2025
acbf530
in-process-benching: Babashka: Add levenshtein
PEZ Jan 18, 2025
6f6a3d2
in-process-benching: Babashka: Add hello-world
PEZ Jan 18, 2025
ab70c3f
in-process-benching: Add C benchmark utility
PEZ Jan 18, 2025
5f598db
in-process-benching: Add C compile and run commands
PEZ Jan 18, 2025
c36ab47
in-process-benching: Add `run` to gitignore
PEZ Jan 18, 2025
72a9c7e
in-process-benching: Add C levenshtein
PEZ Jan 18, 2025
b3830a4
in-process-benching: Clean C `run`
PEZ Jan 18, 2025
ee8d6cf
in-process-benching: Update README with C references
PEZ Jan 18, 2025
a746c8d
in-process-benching: Add C hello-world
PEZ Jan 18, 2025
7c7cab3
in-process-benching: Add C loops
PEZ Jan 18, 2025
baba47a
in-process-benching: Add C fibonacci
PEZ Jan 18, 2025
ef733e9
in-process-benching: Add timestamp and RAM to results CSV
PEZ Jan 18, 2025
a1bfbbf
in-process-benching: Clojure benchmark utility prints status
PEZ Jan 19, 2025
a07d426
in-process-benching: Java benchmark utility prints status
PEZ Jan 19, 2025
9c27cda
in-process-benching: Clojure benchmark utility prints status dot at s…
PEZ Jan 19, 2025
ae51787
in-process-benching: C benchmark utility prints status
PEZ Jan 19, 2025
99791db
in-process-benching: C benchmark utility responsible for formatting r…
PEZ Jan 19, 2025
667ff38
in-process-benching: Fibonacci benchmark now about `fib(n)` (skipping…
PEZ Jan 19, 2025
1b063b1
in-process-benching: Add missing update of output check for fibonacci
PEZ Jan 19, 2025
9e4739f
in-process-benching: Levenshtein update input to use one word per line
PEZ Jan 19, 2025
172f64a
in-process-benching: Update benchmark READMEs
PEZ Jan 19, 2025
6675214
in-process-benching: Update project README to reflect that the old ru…
PEZ Jan 19, 2025
dccf3aa
in-process-benching: Add notes to reference benchmark utilities about…
PEZ Jan 19, 2025
11 changes: 11 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -34,3 +34,14 @@ Cargo.lock
levenshtein.mod
*/racket/compiled/
*.zo
*/clojure-native-image/run
*/java-native-image/jvm.run
*/java-native-image/run
loops/*/code
loops/*/run
fibonacci/*/code
fibonacci/*/run
levenshtein/*/code
levenshtein/*/run
hello-world/*/code
hello-world/*/run
185 changes: 158 additions & 27 deletions README.md
@@ -5,7 +5,162 @@ A repo for collaboratively building small benchmarks to compare languages.
If you have a suggestion for improvement: PR!
If you want to add a language: PR!

## Running
You are also welcome to add new top-level benchmark dirs

## New runner

There's a new runner system that is supposed to replace the old one. The main goal is to eliminate start times from the benchmarks. The general strategy is that the programs being benchmarked do the benchmarking in-process, and only around the single piece of work that the benchmark is about. So for **fibonacci**, only the call to the function calculating the Fibonacci number should be measured. Additionally, each program (language) will be allowed the same amount of time to complete the benchmark work (as many times as it can).

For this, each language will have to have some minimal utility/tooling for running the function-under-benchmark as many times as a timeout allows, plus reporting the measurements and the result. Here are three implementations that we can regard as reference:

* [benchmark.clj](lib/clojure/src/languages/benchmark.clj)
* [benchmark.java](lib/java/languages/Benchmark.java)
* [benchmark.c](lib/c/benchmark.c) (This one may need some scrutiny from C experts before we fully label it as *reference*.)

You'll see that the `benchmark/run` function takes two arguments:

1. `f`: A function (a thunk)
1. `run-ms`: A total time in milliseconds within which the function should be run as many times as possible

To make the overhead of running and measuring as small as possible, the runner takes a delta time for each call to `f`. When the sum of these deltas, `total-elapsed-time`, exceeds the `run-ms` time, we stop calling `f`. So, for a `run-ms` of `1000` the total runtime will always be longer than a second, because we will almost always “overshoot” with the last run, and because the overhead of running and keeping tally, even if tiny, will always be _something_.

The `benchmark/run` function is responsible for reporting back the result/answer of the task being benchmarked, as well as some stats, like mean run time, standard deviation, min and max times, and how many runs were completed.

### Running a benchmark

The new run script is named [run-benchmark.sh](run-benchmark.sh). Let's say we run it in the **levenshtein** directory:

```sh
../run-benchmark.sh -u PEZ
```

The default run time is `10000` ms. `-u` sets the user name (preferably your GitHub handle). The output was this:

```csv
benchmark,timestamp,commit_sha,is_checked,user,model,ram,os,arch,language,run_ms,mean_ms,std-dev-ms,min_ms,max_ms,runs
levenshtein,2025-01-18T23:32:41Z,8e63938,true,PEZ,Apple M4 Max,64GB,darwin24,arm64,Babashka,10000,23376.012916,0.0,23376.012916,23376.012916,1
levenshtein,2025-01-18T23:32:41Z,8e63938,true,PEZ,Apple M4 Max,64GB,darwin24,arm64,C,10000,31.874277,0.448673,31.286000,35.599000,314
levenshtein,2025-01-18T23:32:41Z,8e63938,true,PEZ,Apple M4 Max,64GB,darwin24,arm64,Clojure,10000,57.27048066857143,2.210445845051782,55.554958,75.566792,175
levenshtein,2025-01-18T23:32:41Z,8e63938,true,PEZ,Apple M4 Max,64GB,darwin24,arm64,Clojure Native,10000,59.95592388622754,0.8493245545620596,58.963833,62.897834,167
levenshtein,2025-01-18T23:32:41Z,8e63938,true,PEZ,Apple M4 Max,64GB,darwin24,arm64,Java,10000,55.194704,1.624322,52.463125,63.390833,182
levenshtein,2025-01-18T23:32:41Z,8e63938,true,PEZ,Apple M4 Max,64GB,darwin24,arm64,Java Native,10000,60.704966,6.579482,51.807750,96.343541,165
```

It's a CSV file you can open in something Excel-ish, or consume with your favorite programming language.

![Example Result CSV in Numbers.app](docs/example-results-csv.png)

As you can see, it has some metadata about the run, in addition to the benchmark results. **Clojure** ran the benchmark 175 times, with a mean time of **57.3 ms**. This demonstrates the point of the new runner, considering that Clojure takes about **300 ms** (on the same machine) just to start.

See [run-benchmark.sh](run-benchmark.sh) for the additional command line options it accepts. One worth noting is `-l`, which takes a string of comma-separated language names; only those languages will be run. Good for when contributing a new language or updates to a language. E.g.:

```
~/Projects/languages/levenshtein ❯ ../run-benchmark.sh -u PEZ -l Clojure
Running levenshtein benchmark...
Results will be written to: /tmp/languages-benchmark/levenshtein_PEZ_10000_5bb1995_only_langs.csv

Checking levenshtein Clojure
Check passed
Benchmarking levenshtein Clojure
java -cp clojure/classes:src:/Users/pez/.m2/repository/org/clojure/clojure/1.12.0/clojure-1.12.0.jar:/Users/pez/.m2/repository/org/clojure/core.specs.alpha/0.4.74/core.specs.alpha-0.4.74.jar:/Users/pez/.m2/repository/org/clojure/spec.alpha/0.5.238/spec.alpha-0.5.238.jar run 10000 levenshtein-words.txt
levenshtein,5bb1995,true,PEZ,Apple M4 Max,darwin24,arm64,Clojure,10000,56.84122918181818,0.8759056030546785,55.214541,59.573,176

Done running levenshtein benchmark
Results were written to: /tmp/languages-benchmark/levenshtein_PEZ_10000_5bb1995_only_langs.csv
```

### Compiling a benchmark

This works as before, but since the new programs are named `run` instead of `code`, we need a new script. Meet: [compile-benchmark.sh](compile-benchmark.sh)

```sh
../compile-benchmark.sh
```

### Adding a language

To add a language for a benchmark to the new runner you'll need to add:

1. A benchmarking utility in `lib/<language>`
1. Code in `<benchmark>/<language>/run.<language-extension>` (plus whatever extra project files)
1. An entry in `compile-benchmark.sh`
1. An entry in `run-benchmark.sh`
1. Maybe some code in `clean.sh`

The `main` function of the program provided should take two arguments:

1. The run time in milliseconds
1. The input to the function
   - There is only one input argument, unlike before. How this input argument should be interpreted depends on the benchmark. For **levenshtein** it is the path to the file containing the words to use for the test.

As noted before, the program should run the function-under-benchmark as many times as it can, following the example of the reference implementations mentioned above. The program is allowed to use as much time as the run time for warmup, so that any JIT compilers will have had a chance to optimize.

The program should output a csv row with:

```csv
mean_ms,std-dev-ms,min_ms,max_ms,times,result
```
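As a hedged sketch, emitting such a row in C could look like this (`format_result_row` is a hypothetical name, not the repo's actual `benchmark_format_results`):

```c
#include <stdio.h>

// Format the per-program CSV row: mean_ms,std-dev-ms,min_ms,max_ms,times,result
int format_result_row(char *buf, size_t cap, double mean_ms, double std_dev_ms,
                      double min_ms, double max_ms, long times, long result) {
  return snprintf(buf, cap, "%.6f,%.6f,%.6f,%.6f,%ld,%ld",
                  mean_ms, std_dev_ms, min_ms, max_ms, times, result);
}
```

The run script prepends the metadata columns (benchmark, timestamp, machine, language, and so on) before appending the row to the results CSV.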

### Some changes to the benchmarks

* **fibonacci**:
* The program should return the result of `fib(n)`. This is to keep the benchmark focused on one thing.
  * Early exit for `n < 2` is now allowed, again to keep the benchmark focused.
* The input is now `37`, to allow slower languages to complete more runs.
* **loops**: The inner loop is now 10k, again to allow slower languages to complete more runs.
* **levenshtein**:
  * Smaller input (slower languages...)
  * We only calculate each word pairing distance once (A is as far from B as B is from A)
  * There is a single result: the sum of the distances.
* **hello-world**: No changes.
* It needs to accept and ignore the two arguments
* There is no benchmarking code in there, because it will be benchmarked out-of-process, using **hyperfine**
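The levenshtein pair-once rule above can be sketched in C as a standalone illustration (hypothetical names; see the actual implementations for the real benchmark wiring):

```c
#include <string.h>

// Classic DP Levenshtein distance; assumes words shorter than 256 chars.
static int lev(const char *a, const char *b) {
  size_t la = strlen(a), lb = strlen(b);
  int prev[256], curr[256];
  for (size_t j = 0; j <= lb; j++) prev[j] = (int)j;
  for (size_t i = 1; i <= la; i++) {
    curr[0] = (int)i;
    for (size_t j = 1; j <= lb; j++) {
      int cost = a[i - 1] == b[j - 1] ? 0 : 1;
      int del = prev[j] + 1;
      int ins = curr[j - 1] + 1;
      int sub = prev[j - 1] + cost;
      int best = del < ins ? del : ins;
      curr[j] = best < sub ? best : sub;
    }
    memcpy(prev, curr, (lb + 1) * sizeof(int));
  }
  return prev[lb];
}

// dist(a, b) == dist(b, a), so each unordered pair is computed once:
// the inner loop starts at i + 1, and the single result is the sum.
static long sum_pair_distances(const char *words[], size_t n) {
  long sum = 0;
  for (size_t i = 0; i < n; i++)
    for (size_t j = i + 1; j < n; j++)
      sum += lev(words[i], words[j]);
  return sum;
}
```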

Let's look at the `-main` function for the Clojure **levenshtein** contribution:

```clojure
(defn -main [& args]
  (let [run-ms (parse-long (first args))
        input-path (second args)
        strings (-> (slurp input-path)
                    (string/split #"\s+"))
        _warmup (benchmark/run #(levenshtein-distances strings) run-ms)
        results (benchmark/run #(levenshtein-distances strings) run-ms)]
    (-> results
        (update :result (partial reduce +))
        benchmark/format-results
        println)))
```

The `benchmark/run` function returns a map with the measurements and the result keyed on `:result`. *This result is a sequence of all the distances.* Outside the benchmarked function we sum the distances, and then format the output with this sum. It's done this way to minimize the impact of the benchmarking machinery on the benchmarked work. (See [levenshtein/jvm/run.java](levenshtein/jvm/run.java) or [levenshtein/c/run.c](levenshtein/c/run.c) if the Lisp is tricky for you to read.)

### You can help

Please consider helping us make a speedy transition by porting your favorite language(s) from the [old runner](#running-legacy) to this new one.

## Available Benchmarks

#### [hello-world](./hello-world/README.md)

#### [loops](./loops/README.md)

#### [fibonacci](./fibonacci/README.md)

#### [levenshtein](./levenshtein/README.md)

## Corresponding visuals

Several visuals have been published based on the work here.
More will likely be added in the future, as this repository improves:

- https://benjdd.com/languages
- https://benjdd.com/languages2
- https://benjdd.com/languages3
- https://pez.github.io/languages-visualizations/
- check https://github.com/PEZ/languages-visualizations/tags for tags, which correspond to snapshots of particular benchmark runs, e.g.:
- https://pez.github.io/languages-visualizations/v2024.12.31/

## Running (Legacy)

To run one of the benchmarks:

@@ -41,36 +196,12 @@ To run one of the benchmarks:

Hyperfine is used to warm, execute, and time the runs of the programs.

## Adding
## Adding (Legacy)

To add a language:

1. Select the benchmark directory you want to add to (EG `$ cd loops`)
2. Create a new subdirectory for the language (EG `$ mkdir rust`)
3. Implement the code in the appropriately named file (EG: `code.rs`)
4. If the language is compiled, add appropriate command to `../compile.sh` and `../clean.sh`
5. Add appropriate line to `../run.sh`

You are also welcome to add new top-level benchmarks dirs

# Available Benchmarks

### [hello-world](./hello-world/README.md)

### [loops](./loops/README.md)

### [fibonacci](./fibonacci/README.md)

### [levenshtein](./levenshtein/README.md)

# Corresponding visuals

Several visuals have been published based on the work here.
More will likely be added in the future, as this repository improves:

- https://benjdd.com/languages
- https://benjdd.com/languages2
- https://benjdd.com/languages3
- https://pez.github.io/languages-visualizations/
- check https://github.com/PEZ/languages-visualizations/tags for tags, which correspond to snapshots of particular benchmark runs, e.g.:
- https://pez.github.io/languages-visualizations/v2024.12.31/
5. Add appropriate line to `../run.sh`
12 changes: 5 additions & 7 deletions clean.sh
@@ -1,11 +1,9 @@
rm c3/code
rm c/code
rm c/{code,run}
rm cpp/code
rm go/code
rm jvm/*.class
rm java-native-image/code
rm java-native-image/jvm.code
rm java-native-image/default.iprof
rm -rf jvm/{*.class,*.iprof}
rm -rf java-native-image/{jvm.run,run,code,jvm.code,*.iprof}
rm scala/code scala/code-native
rm -r rust/target
rm -rf kotlin/code.jar
@@ -40,7 +38,7 @@ rm hare/code
rm v/code
rm emojicode/code emojicode/code.o
rm -f chez/code.so
rm -rf clojure/classes clojure/.cpcache
rm -rf clojure-native-image/classes clojure-native-image/.cpcache clojure-native-image/code
rm -rf clojure/{classes,.cpcache,*.class}
rm -rf clojure-native-image/{classes,code,run,*.iprof}
rm cobol/main
rm emacs-lisp/code.eln emacs-lisp/code.elc
21 changes: 21 additions & 0 deletions compile-benchmark.sh
@@ -0,0 +1,21 @@
function compile {
  if [ -d "${2}" ]; then
    echo ""
    echo "Compiling $1"
    eval "${3}"
    result=$?
    if [ $result -ne 0 ]; then
      echo "Failed to compile ${1} with command: ${3}"
    fi
  fi
}

# Please keep in language name alphabetic order
# compile "Language name" "Directory that should exist" "Command line"
####### BEGIN The languages
compile 'C' 'c' 'gcc -O3 -I../lib/c -c ../lib/c/benchmark.c -o c/benchmark.o && gcc -O3 -I../lib/c c/benchmark.o c/run.c -o c/run -lm'
compile 'Clojure' 'clojure' '(cd clojure && mkdir -p classes && clojure -M -e "(compile (quote run))")'
compile 'Clojure Native' 'clojure-native-image' "(cd clojure-native-image ; clojure -M:native-image-run --pgo-instrument -march=native) ; ./clojure-native-image/run -XX:ProfilesDumpFile=clojure-native-image/run.iprof 10000 $(./check-output.sh -i) && (cd clojure-native-image ; clojure -M:native-image-run --pgo=run.iprof -march=native)"
compile 'Java' 'jvm' 'javac -cp ../lib/java jvm/run.java'
compile 'Java Native' 'java-native-image' "(cd java-native-image ; native-image -cp ..:../../lib/java --no-fallback -O3 --pgo-instrument -march=native jvm.run) && ./java-native-image/jvm.run -XX:ProfilesDumpFile=java-native-image/run.iprof 10000 $(./check-output.sh -i) && (cd java-native-image ; native-image -cp ..:../../lib/java -O3 --pgo=run.iprof -march=native jvm.run -o run)"
####### END The languages
Binary file added docs/example-results-csv.png
15 changes: 15 additions & 0 deletions fibonacci/README.md
@@ -1,5 +1,20 @@
# Fibonacci

This program should benchmark a function computing `fibonacci(n)` using naïve recursion.
* The code is supposed to have early return for `n < 2` (the base cases).
* For the non-base cases the code should do two recursive calls.
* The code should be free of any hints to the compiler to memoize, use tail recursion,
iterative methods, or any avoidance of the naïve recursion.

If some compiler finds ways to avoid recursive calls without any hints, then that is a result. We are in some sense testing compilers here, after all.

Reference implementations:
* Clojure: [run.clj](clojure/run.clj)
* Java: [run.java](jvm/run.java)
* C: [run.c](c/run.c)

## Legacy

This program computes the sum of the first N fibonacci numbers.
Each fibonacci number is computed using a naive recursive solution.
Submissions using faster tail-recursion or iterative solutions will not be accepted.
Expand Down
1 change: 1 addition & 0 deletions fibonacci/bb/bb.edn
@@ -0,0 +1 @@
{:deps {languages/tooling {:local/root "../../lib/clojure"}}}
13 changes: 13 additions & 0 deletions fibonacci/bb/run.clj
@@ -0,0 +1,13 @@
(require '[languages.benchmark :as benchmark])

(defn- fibonacci [n]
  (if (< n 2)
    n
    (+ (fibonacci (- n 1))
       (fibonacci (- n 2)))))

(let [run-ms (parse-long (first *command-line-args*))
      u (parse-long (second *command-line-args*))]
  (-> (benchmark/run #(fibonacci u) run-ms)
      benchmark/format-results
      println))
34 changes: 34 additions & 0 deletions fibonacci/c/run.c
@@ -0,0 +1,34 @@
/**
* @file
* @brief This file uses Google style formatting.
*/

#include "benchmark.h"
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

int32_t fibonacci(int32_t n) {
  if (n < 2) return n;
  return fibonacci(n - 1) + fibonacci(n - 2);
}

// The work function that benchmark will time
static benchmark_result_t work(void* data) {
  int* n = (int*)data;
  int r = fibonacci(*n);
  benchmark_result_t result = {.value.number = r};
  return result;
}

int main(int argc, char** argv) {
  int run_ms = atoi(argv[1]);
  int u = atoi(argv[2]);
  // Warmup
  benchmark_run(work, &u, run_ms);
  // Actual benchmark
  benchmark_stats_t stats = benchmark_run(work, &u, run_ms);
  char buffer[1024];
  benchmark_format_results(stats, buffer, sizeof(buffer));
  printf("%s\n", buffer);
}
28 changes: 28 additions & 0 deletions fibonacci/check-output.sh
@@ -0,0 +1,28 @@
#!/bin/bash

input=37
expected_result="24157817"
echo_input=false

while getopts "i" opt; do
  case $opt in
    i) echo_input=true ;;
    *) ;;
  esac
done

if [ "$echo_input" = true ]; then
  echo "$input"
  exit 0
fi

result=$(echo "$1" | sed 's/\x1b\[[0-9;]*m//g' | awk -F ',' '{print $6}')

if [ "${result}" == "${expected_result}" ]; then
  echo "Check passed"
  exit 0
else
  echo "Incorrect result:"
  echo "${result}"
  exit 1
fi
18 changes: 12 additions & 6 deletions fibonacci/clojure-native-image/deps.edn
@@ -1,13 +1,19 @@
{:paths ["."]
:deps {code/clojure {:local/root "../clojure"}}
:deps {code/clojure {:local/root "../clojure"}
clj.native-image/clj.native-image
{:git/url "https://github.com/taylorwood/clj.native-image.git"
:sha "4604ae76855e09cdabc0a2ecc5a7de2cc5b775d6"}}
:aliases {:native-image
{:main-opts ["-m" "clj.native-image" "code"
"-O3"
"--initialize-at-build-time"
"-H:+UnlockExperimentalVMOptions"
"-H:Name=code"]
:jvm-opts ["-Dclojure.compiler.direct-linking=true"]
:extra-deps
{clj.native-image/clj.native-image
{:git/url "https://github.com/taylorwood/clj.native-image.git"
:sha "4604ae76855e09cdabc0a2ecc5a7de2cc5b775d6"}}}}}
:jvm-opts ["-Dclojure.compiler.direct-linking=true"]}
:native-image-run
{:main-opts ["-m" "clj.native-image" "run"
"-O3"
"--initialize-at-build-time"
"-H:+UnlockExperimentalVMOptions"
"-H:Name=run"]
:jvm-opts ["-Dclojure.compiler.direct-linking=true"]}}}