feat: use histogram samples for t-test analysis by jdmarshall · Pull Request #155 · RafaelGSS/bench-node

jdmarshall · 2026-01-05T02:39:53Z

Justification:

Each sample in the histogram represents a durationPerOp value, calculated by dividing the cumulative time of a batch of iterations of the function under test by the number of iterations in that batch. Each sample is therefore the average execution time of the calls in its batch. opsSec and opsSecPerRun are in turn averages of these samples, which are themselves averages.

Therefore, using opsSecPerRun as the t-test input inaccurately applies the calculation to an average of averages, when it is meant to be applied to a set of averages totalling a minimum of 30 samples, with 40 preferable.

In other words, it is the histogram entry, not the per-run average, that represents a valid t-test sample.

When repeatSuite > 1, the additional samples accumulate in the histogram.
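As a rough illustration of what feeding histogram entries to the t-test means, here is a minimal sketch of Welch's t-test over pooled per-batch samples. The helper names (`mean`, `variance`, `welchTTest`, `poolRepeats`) and the data are hypothetical and are not taken from bench-node's source; the tiny arrays are for readability only and fall far short of the 30 samples discussed above.

```javascript
// Hypothetical helpers; not bench-node's internal implementation.
function mean(xs) {
  return xs.reduce((sum, x) => sum + x, 0) / xs.length;
}

// Unbiased sample variance (divides by n - 1).
function variance(xs) {
  const m = mean(xs);
  return xs.reduce((sum, x) => sum + (x - m) ** 2, 0) / (xs.length - 1);
}

// Welch's t statistic and Welch-Satterthwaite degrees of freedom.
function welchTTest(a, b) {
  const va = variance(a) / a.length;
  const vb = variance(b) / b.length;
  const t = (mean(a) - mean(b)) / Math.sqrt(va + vb);
  const df =
    (va + vb) ** 2 / (va ** 2 / (a.length - 1) + vb ** 2 / (b.length - 1));
  return { t, df };
}

// Each inner array stands for the histogram samples of one repeat.
// With repeatSuite > 1 the repeats are pooled into a single sample set.
function poolRepeats(repeats) {
  return repeats.flat();
}

const baseline = poolRepeats([[10, 12, 11], [13, 9, 11]]);
const candidate = poolRepeats([[14, 15, 13], [16, 14, 12]]);
const { t, df } = welchTTest(baseline, candidate);
console.log(t.toFixed(3), df.toFixed(1)); // -3.674 10.0
```

The sample count that enters the test is the number of histogram entries across all repeats (6 per benchmark here), not the number of repeats (2).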

This change also converts the forced override of the input options to a warning if the sample size is too small. Let the users pick whether they want minSamples: 30 or repeatSuite: 3. The code already had support for omitting the significance data if the low bar is not met.
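The warn-instead-of-override behaviour described above could look roughly like the following sketch. `checkTTestSampleSize`, the default option values, and the exact threshold comparison are assumptions made for illustration; they are not bench-node's actual code.

```javascript
// Assumed threshold, taken from the "minimum of 30 samples" in the description.
const MIN_TTEST_SAMPLES = 30;

// Hypothetical helper: returns a warning message instead of rewriting
// the caller's options. Returns null when there is nothing to warn about.
// The defaults below are assumptions for the sketch.
function checkTTestSampleSize({ ttest = false, minSamples = 10, repeatSuite = 1 } = {}) {
  if (!ttest) return null;
  const expected = minSamples * repeatSuite;
  if (expected >= MIN_TTEST_SAMPLES) return null;
  return (
    `ttest: expected ${expected} samples ` +
    `(minSamples ${minSamples} x repeatSuite ${repeatSuite}), ` +
    `need at least ${MIN_TTEST_SAMPLES}`
  );
}

const warning = checkTTestSampleSize({ ttest: true, minSamples: 10, repeatSuite: 2 });
if (warning) console.warn(warning);
```

Under these assumptions both `minSamples: 30` and `repeatSuite: 3` (with ten samples per run) pass the check, so the user's own settings are left untouched.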

RafaelGSS · 2026-01-20T21:45:30Z

examples/.DS_Store

please remove it

RafaelGSS · 2026-01-20T21:55:12Z

examples/statistical-significance/README.md

 const suite = new Suite({
-  ttest: true,  // Automatically sets repeatSuite=30
+  ttest: true,
+  minSamples: 30, // minSamples x repeatSuite must be > 30


RafaelGSS · 2026-01-20T21:55:54Z

README.md


-Enable t-test mode with `ttest: true`. This automatically sets `repeatSuite=30` to collect enough
-independent samples for reliable statistical analysis (per the Central Limit Theorem):
+Enable t-test mode with `ttest: true`. Requires 30 independent samples for reliable statistical analysis (per the 


This shouldn't be the case. It should be 30 full suites (regardless of samples).

jdmarshall · 2026-01-20

You have asserted several times now that suites are samples and samples are not samples without expanding on why, aside from restating your assertion in essentially the same words.

Why are samples not samples?

I've been over that code and the way in which count and time are calculated is consistent with the notion of 'sample' as I've seen it described in the literature. Where is my error?

By taking a suite as a sample taken over several seconds, you're taking an average of an average, which makes the results of the t-test less accurate.
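To make the two readings of "sample" in this thread concrete, the sketch below builds both sample sets from the same made-up data. The `runs` array and the `average` helper are hypothetical; the code only shows what differs between the two views and does not settle which one is statistically appropriate.

```javascript
// Hypothetical data: 3 suite runs, each holding per-batch duration samples.
const runs = [
  [10, 12, 11, 13],
  [9, 11, 10, 12],
  [11, 13, 12, 14],
];

const average = (xs) => xs.reduce((sum, x) => sum + x, 0) / xs.length;

// View 1: one sample per suite run (the mean of that run's histogram).
const perRunSamples = runs.map(average);

// View 2: every histogram entry is a sample.
const perBatchSamples = runs.flat();

console.log(perRunSamples.length, perBatchSamples.length); // 3 12
console.log(average(perRunSamples), average(perBatchSamples)); // 11.5 11.5
```

With equal-sized runs the point estimate is identical in both views. What changes is the sample count and the variance that reach the t-test, and whether entries within a single run are independent enough to count individually is the question under dispute.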

jdmarshall mentioned this pull request Jan 5, 2026

t-test is too slow to be enabled by default cobblers-children/faceoff#19

Open

jdmarshall force-pushed the t-test branch from 7a61f90 to 069cb55 on January 5, 2026 03:04

jdmarshall force-pushed the t-test branch from 069cb55 to d3f4f7b on January 5, 2026 03:31

feat: report significant: false when t-test == true but sample size is too small

a0b4160

This will help me sort out inconclusive tests without missing misconfigured ones. This is necessitated by the changes in the previous commit that allow for failure instead of forcing success.

RafaelGSS reviewed Jan 20, 2026

jdmarshall wants to merge 2 commits into RafaelGSS:main from jdmarshall:t-test
