Allow users to access actual outcome of tests

Hey, 

this is an issue I have encountered using datajudge within some projects and I'd like to discuss whether this is a general use case.

### Status Quo
Currently, the `test()` method of a `Constraint` returns a `TestResult` object. 
Example: [`NumericMean`](https://github.com/Quantco/datajudge/blob/7b992562f10ab72a3e10a27a6c5b02b9ab508ce9/src/datajudge/constraints/numeric.py#L137)
```py
def test(self, engine: sa.engine.Engine) -> TestResult:
    # retrieve values ...
    result = deviation <= self.max_absolute_deviation
    return TestResult(result, assertion_text)
```
For our [pytest integration](https://github.com/Quantco/datajudge/blob/7b992562f10ab72a3e10a27a6c5b02b9ab508ce9/src/datajudge/pytest_integration.py#L21), we assert the outcome of this `TestResult`:
```py
@pytest.mark.parametrize(
    "constraint", all_constraints, ids=Constraint.get_description
)
def test_constraint(constraint, datajudge_engine):
    # ...
    test_result = constraint.test(datajudge_engine)
    assert test_result.outcome, test_result.failure_message
```
This is completely fine since the goal of a test is to check whether something meets a certain requirement.
In the case of a constraint _failing_ the user gets feedback on what went wrong in the form of the `TestResult.failure_message` field.

### Missing feature
In general, though, `datajudge` might be used not only with `pytest` to check whether something meets a requirement, but also to inspect the measured values themselves, e.g. by plotting, comparing, or otherwise evaluating them.

One concrete example is statistical constraints, such as the [KolmogorovSmirnov2Sample](https://github.com/Quantco/datajudge/blob/7b992562f10ab72a3e10a27a6c5b02b9ab508ce9/src/datajudge/constraints/stats.py#L12) constraint. From a user perspective, it might be important not only to check whether the test result is significant but also to report back _how_ significant it is.
A user might display these values in a dashboard tracking database changes over time, or use them to initially set some values for constraint specification.

_Example_: Measure how many values are currently missing in a data dump and then specify the constraint for the future based on that value.

Additionally, this would allow the user to inspect the outcomes of tests that were successful and answer the question "Okay, but how close are we to the threshold?"

### Needed changes
Since the project architecture is pretty clean (🚀), implementing this feature would just require adding a new field to the `TestResult` object, e.g. `values`, which allows the user to access the internally measured values for a given constraint.

The change would be fully backward-compatible and could initially be integrated only for the constraints where it makes the most sense, e.g. statistical constraints.
We could then leave a note in the documentation that the results are available by accessing the field with a certain name.

For statistical constraints, this field would contain the test statistic and/or the p-value. For numeric constraints, it would contain the actually measured mean, number of rows, missing columns, etc.
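
To make the proposal a bit more concrete, here is a minimal, self-contained sketch of what I have in mind. `SketchedTestResult` and `check_mean` are hypothetical stand-ins written for this issue, not the actual `datajudge` classes, and the key names inside `values` are just assumptions.

```py
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass(frozen=True)
class SketchedTestResult:
    """Simplified stand-in for datajudge's TestResult (not the real class)."""

    outcome: bool
    failure_message: Optional[str] = None
    # Proposed addition: measured quantities, keyed by name.
    # Defaults to an empty dict, so existing call sites keep working.
    values: Dict[str, Any] = field(default_factory=dict)


def check_mean(
    actual_mean: float, target_mean: float, max_absolute_deviation: float
) -> SketchedTestResult:
    """Hypothetical numeric check that also reports what it measured."""
    deviation = abs(actual_mean - target_mean)
    outcome = deviation <= max_absolute_deviation
    message = (
        None
        if outcome
        else f"Deviation {deviation} exceeds {max_absolute_deviation}."
    )
    return SketchedTestResult(
        outcome, message, values={"mean": actual_mean, "deviation": deviation}
    )


result = check_mean(actual_mean=10.5, target_mean=10.0, max_absolute_deviation=1.0)
# A passing test can now also answer "how close are we to the threshold?"
headroom = 1.0 - result.values["deviation"]
```

Since `values` has a default, code that only looks at `outcome` and `failure_message` would not need to change.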

### Example Usage
This could result in the following code on the user side.
```py
results = []
for year in range(2019, 2023):
    req = BetweenRequirement.from_tables(year, year + 1)  # simplified
    req.add_statistical_test_constraint(...)
    req.add_n_rows_max_gain_constraint(0.1)
    stat_result = req[0].test(engine)
    gain_result = req[1].test(engine)

    # Users can track the change in data
    results.append((year, stat_result.values["p_value"], gain_result.values["n_rows_factual"]))
```
---
It might seem like a niche use case at first, but I think it's an important part of a framework geared toward data validation. :)
Happy to hear what you think!
