Merged
6 changes: 3 additions & 3 deletions .gitignore
@@ -182,9 +182,9 @@ cython_debug/
.abstra/

# Visual Studio Code
# Visual Studio Code specific template is maintained in a separate VisualStudioCode.gitignore
# Visual Studio Code specific template is maintained in a separate VisualStudioCode.gitignore
# that can be found at https://github.com/github/gitignore/blob/main/Global/VisualStudioCode.gitignore
# and can be added to the global gitignore or merged into this file. However, if you prefer,
# and can be added to the global gitignore or merged into this file. However, if you prefer,
# you could uncomment the following to ignore the entire vscode folder
# .vscode/

@@ -208,4 +208,4 @@ __marimo__/

claude.local.md
.idea
review_classification.db
review_classification.db
22 changes: 15 additions & 7 deletions .pre-commit-config.yaml
@@ -1,12 +1,20 @@
repos:
- repo: https://github.com/astral-sh/ruff-pre-commit
rev: v0.14.10
- repo: local
hooks:
- id: ruff
args: [ --fix ]
name: ruff
entry: uv run ruff check --fix --force-exclude
language: system
types: [python]
- id: ruff-format
- repo: https://github.com/pre-commit/mirrors-mypy
rev: v1.19.1
name: ruff format
entry: uv run ruff format --force-exclude
language: system
types: [python]
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v4.6.0
hooks:
- id: mypy
additional_dependencies: [typer, PyGithub, sqlmodel, tenacity, pytest]
- id: trailing-whitespace
- id: end-of-file-fixer
- id: check-yaml
- id: check-added-large-files
2 changes: 1 addition & 1 deletion README.md
@@ -2,4 +2,4 @@

A tool to look for PR outliers merged within a date range and identify which ones are outliers in terms of PR reviews review time and qualitative reviews.

This tool is an exercise in the use of [Antigravity](https://antigravity.google/) with the [gemini agent](https://gemini.google.com/app).
This tool is an exercise in the use of [Antigravity](https://antigravity.google/) with the [gemini agent](https://gemini.google.com/app).
12 changes: 6 additions & 6 deletions claude.md
@@ -1,4 +1,4 @@
# PR outliers
# PR outliers

I want to look at a github repository and take a PR merged within a date range and identify which ones are outliers in terms of PR reviews review time and qualitative reviews.

@@ -8,27 +8,27 @@ I want to look at a github repository and take a PR merged within a date range a
- I want to use mypy as the static type checker
- I want to use github actions as my CI/CD pipeline
- I want to run ruff and mypy as part of the CI/CD pipeline as a pre-commit hook and for each PR raised
I want to be able to
I want to be able to
- classify PRs that were merged within a certain date range
- classify as outlier reviews
- cache the PR data to a local sqlite DB
- handle github rate limiting via a backoff and wait mechanism
- request multiple PR data concurrently
- I want to use claud code as my AI assistant.
- I want to use claud code as my AI assistant.
- I want to use MCP agents for my local github
- I want to use MCP agent for my locals sqlite repo
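
The "backoff and wait mechanism" requirement in the list above could be sketched roughly as follows. This is only an illustration under stated assumptions: `RateLimitedError` and `call_with_backoff` are hypothetical names, not part of this repository or of PyGithub, and the delays are arbitrary defaults.

```python
import random
import time
from collections.abc import Callable
from typing import TypeVar

T = TypeVar("T")


class RateLimitedError(Exception):
    """Hypothetical error standing in for a GitHub rate-limit response."""


def call_with_backoff(
    func: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, waiting exponentially longer after each rate-limit error."""
    for attempt in range(max_attempts):
        try:
            return func()
        except RateLimitedError:
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the error to the caller
            # Delay doubles on each attempt (1s, 2s, 4s, ...) up to max_delay.
            delay = min(max_delay, base_delay * 2**attempt)
            # Up to 10% jitter so concurrent workers do not retry in lockstep.
            sleep(delay + random.uniform(0, delay * 0.1))
    raise RuntimeError("max_attempts must be at least 1")
```

Passing `sleep` as a parameter keeps the helper testable without real waiting; a library such as tenacity offers the same pattern ready-made.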


## Outlier Definition

- An outlier review is something that may need more attention.
- An outlier review is something that may need more attention.
- It could be a PR that is reviewed too quickly.
- It could be a PR that has a high number of changes.
- It could be a PR that has a high number of complexity.
- It could be a PR that has no comments.
- It could be a PR that has code changes but no unit tests.
- I would like to automatically identify outliers base on the criteria available from the PR review data in github.

## Things I need to do
## Things I need to do
- identify what features the PR has that I want to use for classification
- identify what classification modelling tools I would want to use
- identify what classification modelling tools I would want to use
Empty file.
119 changes: 119 additions & 0 deletions tests/queries/test_github_client.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,119 @@
import unittest
from datetime import UTC, datetime
from unittest.mock import MagicMock, patch

from review_classification.queries.github_client import fetch_prs
from review_classification.sqlite.models import PullRequest


class TestGithubClient(unittest.TestCase):
    @patch("review_classification.queries.github_client.Github")
    @patch("review_classification.queries.github_client.fetch_repo")
    def test_fetch_prs_success(self, mock_fetch_repo: MagicMock, _: MagicMock) -> None:
        # Setup
        mock_repo = MagicMock()
        mock_fetch_repo.return_value = mock_repo

        # Mock PRs
        pr1 = MagicMock()
        pr1.number = 1
        pr1.title = "PR 1"
        pr1.user.login = "user1"
        pr1.created_at = datetime(2023, 1, 10, 12, 0, 0)
        pr1.merged_at = datetime(2023, 1, 11, 12, 0, 0)
        pr1.closed_at = datetime(2023, 1, 11, 12, 0, 0)
        pr1.additions = 10
        pr1.deletions = 5
        pr1.changed_files = 2
        pr1.comments = 1
        pr1.review_comments = 0
        pr1.state = "closed"
        pr1.html_url = "http://github.com/owner/repo/pull/1"

        # Setup the generator behavior
        # fetch_prs calls fetch_prs_generator twice (one ignored, one used)
        # We just ensure get_pulls returns our list
        mock_repo.get_pulls.return_value = [pr1]

        # Execute
        results = fetch_prs("owner/repo", token="dummy")

        # Verify
        self.assertEqual(len(results), 1)
        self.assertIsInstance(results[0], PullRequest)
        self.assertEqual(results[0].number, 1)
        self.assertEqual(results[0].title, "PR 1")
        self.assertEqual(results[0].author, "user1")

    @patch("review_classification.queries.github_client.Github")
    @patch("review_classification.queries.github_client.fetch_repo")
    def test_fetch_prs_date_filtering(
        self, mock_fetch_repo: MagicMock, _: MagicMock
    ) -> None:
        mock_repo = MagicMock()
        mock_fetch_repo.return_value = mock_repo

        # Dates (UTC)
        date_target = datetime(2023, 6, 15, tzinfo=UTC)
        date_before = datetime(2023, 1, 1, tzinfo=UTC)
        date_after = datetime(2023, 12, 31, tzinfo=UTC)

        # PRs
        # 1. New (After end_date) - Should be skipped
        pr_new = MagicMock()
        pr_new.created_at = date_after
        pr_new.number = 3

        # 2. Target (In range) - Should be included
        pr_mid = MagicMock()
        pr_mid.created_at = date_target
        pr_mid.number = 2
        pr_mid.user.login = "user"
        pr_mid.title = "Target"
        pr_mid.merged_at = None
        pr_mid.closed_at = None
        pr_mid.additions = 0
        pr_mid.deletions = 0
        pr_mid.changed_files = 0
        pr_mid.comments = 0
        pr_mid.review_comments = 0
        pr_mid.state = "open"
        pr_mid.html_url = "url"

        # 3. Old (Before start_date) - Should trigger break
        pr_old = MagicMock()
        pr_old.created_at = date_before
        pr_old.number = 1

        # The list is returned in desc order as requested
        mock_repo.get_pulls.return_value = [pr_new, pr_mid, pr_old]

        # Execute with date range
        start_str = "2023-02-01"
        end_str = "2023-10-01"

        results = fetch_prs(
            "owner/repo", start_date=start_str, end_date=end_str, token="dummy"
        )

        # Verify
        self.assertEqual(len(results), 1)
        self.assertEqual(results[0].number, 2)

    @patch("review_classification.queries.github_client.Github")
    @patch("review_classification.queries.github_client.fetch_repo")
    def test_fetch_prs_no_token(self, _: MagicMock, mock_github: MagicMock) -> None:
        # Test that it tries to grab env var if no token passed
        with patch.dict("os.environ", {"GITHUB_TOKEN": "env_token"}):
            fetch_prs("owner/repo")
            mock_github.assert_called_with("env_token")

    @patch("review_classification.queries.github_client.Github")
    @patch("review_classification.queries.github_client.fetch_repo")
    def test_fetch_prs_empty(self, mock_fetch_repo: MagicMock, _: MagicMock) -> None:
        mock_repo = MagicMock()
        mock_fetch_repo.return_value = mock_repo
        mock_repo.get_pulls.return_value = []

        results = fetch_prs("owner/repo", token="dummy")
        self.assertEqual(len(results), 0)