Conversation

@ghinks ghinks commented Jan 12, 2026

Summary

Implements statistical outlier detection to identify unusual PR reviews using z-scores calculated per repository on both raw metrics and engineered features.

This PR adds comprehensive outlier detection capabilities to analyze PR review quality and identify potential issues such as:

  • Rushed reviews (short review duration)
  • Oversized PRs (excessive code changes)
  • Under-reviewed code (low comment density)
  • Statistical anomalies across multiple dimensions

Key Features

1. Feature Engineering

  • Review duration: Time from PR creation to merge (detects rushed reviews)
  • Code churn: Total lines changed (additions + deletions)
  • Comment density: Comments per file and per line changed
  • Automatic computation and caching for all PRs
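As an illustration, the engineered features above can be sketched as a small pure function. The field names (created_at, merged_at, additions, and so on) are assumptions for this sketch, not the actual PR model used in the repo:

```python
from datetime import datetime

def compute_features(created_at: datetime, merged_at: datetime,
                     additions: int, deletions: int,
                     changed_files: int, comments: int) -> dict[str, float]:
    """Compute the engineered review-quality features for one merged PR."""
    lines_changed = additions + deletions
    return {
        # Hours from PR creation to merge; short values suggest rushed reviews.
        "review_duration": (merged_at - created_at).total_seconds() / 3600,
        # Total lines touched (additions + deletions).
        "code_churn": float(lines_changed),
        # Comment density per file and per line changed; guard against
        # division by zero for empty PRs.
        "comment_density_per_file": comments / changed_files if changed_files else 0.0,
        "comment_density_per_line": comments / lines_changed if lines_changed else 0.0,
    }
```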

2. Statistical Analysis

  • Per-repository mean and standard deviation calculation
  • Z-score computation for 9 features (5 raw + 4 engineered)
  • Configurable threshold (default: |z-score| > 2, roughly the two-sided 95% level under a normal distribution)
  • Minimum sample size requirement (default: 30 merged PRs)

3. Outlier Detection

  • Identifies PRs that are statistical outliers
  • Tracks which specific features triggered outlier status
  • Stores z-scores for all features in database
  • Provides maximum absolute z-score for prioritization
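Given a PR's per-feature z-scores, the outlier decision itself is simple; this sketch (function name and return shape are illustrative, not the project's API) flags the PR, records which features triggered, and reports the maximum |z| for prioritization:

```python
def flag_outlier(scores: dict[str, float],
                 threshold: float = 2.0) -> tuple[bool, list[str], float]:
    """Return (is_outlier, triggering features, max |z|) for one PR."""
    # A PR is an outlier when any feature's |z| exceeds the threshold.
    triggered = [name for name, z in scores.items() if abs(z) > threshold]
    max_abs = max((abs(z) for z in scores.values()), default=0.0)
    return bool(triggered), triggered, max_abs
```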

4. New CLI Command

review-classify detect-outliers REPOSITORY [OPTIONS]

Options:
  --threshold, -t FLOAT    Z-score threshold (default: 2.0)
  --min-samples INT        Minimum PRs required (default: 30)
  --format, -f TEXT        Output: table, json, csv (default: table)
  --verbose, -v            Enable verbose output
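The option surface can be mirrored with any argument parser; the project's CLI may well be built on click or typer, but an argparse sketch of the same options and defaults looks like this:

```python
import argparse

# Illustrative stand-in for the detect-outliers CLI; flags and defaults
# follow the documented options, the parser library is an assumption.
parser = argparse.ArgumentParser(prog="review-classify detect-outliers")
parser.add_argument("repository", help="owner/name, e.g. expressjs/express")
parser.add_argument("--threshold", "-t", type=float, default=2.0)
parser.add_argument("--min-samples", type=int, default=30)
parser.add_argument("--format", "-f", choices=["table", "json", "csv"], default="table")
parser.add_argument("--verbose", "-v", action="store_true")

args = parser.parse_args(["expressjs/express", "-t", "2.5", "-f", "json"])
```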

5. Multiple Output Formats

  • Table: Human-readable ASCII table with PR numbers and outlier features
  • JSON: Structured data with all z-scores for programmatic analysis
  • CSV: Simple export format for spreadsheets
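The JSON and CSV formats are straightforward serializations of the result rows. A sketch, assuming a row shape of (pr_number, max_abs_z, outlier_features), which is an illustration rather than the project's actual schema:

```python
import csv
import io
import json

FIELDS = ["pr_number", "max_abs_z", "outlier_features"]

def to_json(rows: list[dict]) -> str:
    # Structured output with all fields, for programmatic analysis.
    return json.dumps(rows, indent=2)

def to_csv(rows: list[dict]) -> str:
    # Flat export suitable for spreadsheets.
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```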

Database Schema

New Tables

  • prfeatures: Stores computed features for each PR
  • proutlierscore: Stores z-scores and outlier flags
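In DDL terms, the two tables could look roughly like the following. Column names and types here are hypothetical; the project presumably defines these through its ORM models, and only the table names come from the PR:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE prfeatures (
    pr_id INTEGER PRIMARY KEY,            -- one row per PR
    review_duration REAL,                 -- hours from creation to merge
    code_churn REAL,                      -- additions + deletions
    comment_density_per_file REAL,
    comment_density_per_line REAL
);
CREATE TABLE proutlierscore (
    pr_id INTEGER PRIMARY KEY,
    max_abs_z_score REAL,                 -- used to rank outliers
    is_outlier INTEGER,                   -- boolean flag
    z_scores_json TEXT                    -- z-scores for all features
);
""")
```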

Testing

  • ✅ 33 unit and integration tests, all passing
  • ✅ Type checking (mypy --strict): passing
  • ✅ Linting (ruff): passing
  • ✅ Tested with real expressjs/express data

Example Usage

# Detect outliers with default settings
review-classify detect-outliers expressjs/express --min-samples 5 --verbose

# Output:
# Analyzing outliers for expressjs/express...
# Computing features...
# Computed features for 44 PRs
# Detecting outliers...
# Found 2 outliers out of 7 PRs (28.6%)
#
# Outlier Pull Requests
# ====================================================================================================
# PR #       Max |Z|      Outlier Features
# ----------------------------------------------------------------------------------------------------
# #6236      2.27         additions, deletions, changed_files, code_churn
# #6211      2.10         comments, comment_density_per_file
# ----------------------------------------------------------------------------------------------------
# Total outliers: 2 out of 7 PRs (28.6%)

Technical Implementation

Modules Added

  • src/review_classification/features/engineering.py: Feature computation
  • src/review_classification/analysis/statistics.py: Statistical functions
  • src/review_classification/analysis/outlier_detector.py: Detection logic
  • src/review_classification/cli/output.py: Result formatting

Database Functions

  • save_pr_features(): Upsert computed features
  • get_pr_features(): Retrieve features for a PR
  • get_outlier_scores(): Query outlier results
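The upsert behavior of save_pr_features() can be sketched with raw SQL; the real function likely goes through the project's ORM, and the single-column table here is only for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prfeatures (pr_id INTEGER PRIMARY KEY, code_churn REAL)")

def save_pr_features(pr_id: int, code_churn: float) -> None:
    # INSERT ... ON CONFLICT keeps exactly one row per PR,
    # updated in place when features are recomputed.
    conn.execute(
        "INSERT INTO prfeatures (pr_id, code_churn) VALUES (?, ?) "
        "ON CONFLICT(pr_id) DO UPDATE SET code_churn = excluded.code_churn",
        (pr_id, code_churn),
    )

save_pr_features(6236, 150.0)
save_pr_features(6236, 175.0)  # recompute overwrites, no duplicate row
```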

Verification

After merging, users can:

  1. Run outlier detection on any repository with sufficient data
  2. Query results via MCP JDBC tools for analysis
  3. Export results in JSON/CSV for further processing
  4. Adjust threshold for different sensitivity levels

🤖 Generated with Claude Code

ghinks and others added 3 commits January 12, 2026 07:49
Implement statistical outlier detection to identify unusual PR reviews using
z-scores calculated per repository on both raw metrics and engineered features.

Key components:
- Feature engineering: review_duration, code_churn, comment_density
- Statistical analysis: per-repository mean/std calculation, z-score computation
- Outlier detection: flags PRs when |z-score| > 2 (configurable threshold)
- CLI command: detect-outliers with table/json/csv output formats
- Comprehensive test suite: 33 unit and integration tests, all passing

This enables data-driven identification of rushed reviews, oversized PRs,
under-reviewed code, and other review quality issues.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Added type ignore comment for max_abs_z_score ordering and improved
type annotations in test fixtures for better type safety.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
@ghinks ghinks left a comment

initial review good

@ghinks ghinks merged commit b0608a7 into main Jan 13, 2026
1 check passed
@ghinks ghinks deleted the claud/z-score-calculator branch January 13, 2026 11:54