feat: add repository analysis with automated PR creation #33

naaa760 · 2025-12-23T02:58:00Z

Summary

Implement repository analysis agent with one-click PR creation for Watchflow rules.

Key Changes

Rule Engine Integration: Generate valid YAML rules using correct parameters structure (removed invalid validators/actions sections)
Dynamic Analysis: Language-specific patterns and PR history analysis for intelligent rule recommendations
PR Creation Flow: Robust error handling and logging throughout branch creation, file commits, and PR submission
Frontend: Repository URL input and analysis flow integrated into landing page

- Add _analyze_pr_bad_habits method for PR pattern detection
- Improve _get_language_specific_patterns for dynamic rule generation
- Document rule engine limitations for combined restrictions
- Update rule confidence based on repository analysis

gemini-code-assist · 2025-12-23T02:58:16Z

Summary of Changes

Hello @naaa760, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a significant new feature: an automated repository analysis agent capable of generating and proposing Watchflow rules via one-click PR creation. The core enhancements include a more intelligent rule recommendation system that leverages repository language features, historical PR patterns, and contributing guidelines to suggest highly relevant rules. Furthermore, the PR creation workflow has been made more robust with detailed error handling and comprehensive logging for GitHub API interactions, ensuring a smoother and more transparent user experience when automating rule deployment.

Highlights

Intelligent Rule Recommendations: The system now analyzes PR history for "bad habits" (e.g., missing tests, short descriptions) and uses this, along with language-specific patterns and contributing guidelines, to generate more relevant and confident Watchflow rule recommendations.
Robust PR Creation Flow: Enhanced error handling and logging have been implemented across the branch creation, file commit, and pull request submission processes, providing clearer feedback and debugging information for potential failures.
YAML Rule Structure Correction: The rule engine now correctly generates YAML rules with the "parameters" structure, removing invalid "validators" and "actions" sections, ensuring valid rule configurations.

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

gemini-code-assist

Code Review

This pull request introduces a new feature for repository analysis and automated PR creation with recommended rules. The changes include new agent logic, an API endpoint for the PR flow, and significant improvements to logging and error handling, which enhance debuggability.

My review has identified several critical issues, primarily related to incorrect indentation in the Python code that will cause runtime errors. There are also issues with malformed f-strings that will generate invalid YAML for the rules. Additionally, the core analysis function for PR history is currently a stub, which means the logic for dynamically adjusting rule confidence is not yet functional. I've provided suggestions to fix these issues and also a recommendation to make part of the new logic more robust.

gemini-code-assist · 2025-12-23T02:59:56Z

src/agents/repository_analysis_agent/nodes.py

    workflows = await github_client.list_directory_any_auth(
        repo_full_name=repo, path=".github/workflows", installation_id=installation_id
-    )
+        )


This line has an extra level of indentation, which will cause a Python IndentationError at runtime. The closing parenthesis should be aligned with the start of the line containing the opening parenthesis.

)

gemini-code-assist · 2025-12-23T02:59:57Z

src/agents/repository_analysis_agent/nodes.py

    installation_id = state.installation_id

-    content = await github_client.get_file_content(
+        content = await github_client.get_file_content(


This line has an extra level of indentation, which will cause a Python IndentationError at runtime.

content = await github_client.get_file_content(
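
A quick way to catch this class of error before it reaches a review is to compile the source and look for IndentationError. The sketch below is illustrative only: the two snippets are made-up stand-ins for the lines in the diff, not the actual contents of nodes.py, and `compiles` is a hypothetical helper name.

```python
def compiles(source: str) -> bool:
    """Return True if the source compiles, False if Python rejects its indentation."""
    try:
        compile(source, "<snippet>", "exec")
        return True
    except IndentationError:
        return False


# Two statements at the same level: accepted.
aligned = "installation_id = 1\ncontent = 2\n"

# Second statement indented without an enclosing block: rejected.
over_indented = "installation_id = 1\n    content = 2\n"

print(compiles(aligned))
print(compiles(over_indented))
```

Because the error is raised when the module is compiled, it surfaces as soon as the file is imported, which is why a formatter or a simple import check in CI is usually enough to flag it.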

gemini-code-assist · 2025-12-23T02:59:57Z

src/agents/repository_analysis_agent/nodes.py

                f"""
                description: "Require tests when code changes"
-                enabled: true
+enabled: true
                severity: medium
-                event_types:
-                  - pull_request
-                parameters:
+event_types:
+  - pull_request
+    parameters:
                  source_patterns:
 {chr(10).join(f'                    - "{pattern}"' for pattern in source_patterns)}
                  test_patterns:
 {chr(10).join(f'                    - "{pattern}"' for pattern in test_patterns)}
                """


The indentation within this f-string for generating the YAML rule is incorrect. Keys like enabled, event_types, and parameters are not properly indented, which will result in invalid YAML and cause parsing errors.

                f"""
                description: "Require tests when code changes"
                enabled: true
                severity: medium
                event_types:
                  - pull_request
                parameters:
                  source_patterns:
{chr(10).join(f'                    - "{pattern}"' for pattern in source_patterns)}
                  test_patterns:
{chr(10).join(f'                    - "{pattern}"' for pattern in test_patterns)}
                """
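
One way to make this kind of template less sensitive to editor indentation is to dedent the static part and build the list items with an explicit indent width. This is only a sketch under assumptions: `build_test_rule` and the example glob patterns are hypothetical, and the rule fields are copied from the snippet above rather than from the Watchflow schema.

```python
import textwrap


def build_test_rule(source_patterns: list[str], test_patterns: list[str]) -> str:
    """Build the rule text with top-level keys at column 0 and list items at 4 spaces."""

    def as_items(patterns: list[str], indent: int) -> str:
        # Each pattern becomes one quoted YAML list item at a fixed indent.
        return "\n".join(f'{" " * indent}- "{p}"' for p in patterns)

    # dedent() strips the common leading whitespace, so the Python
    # indentation of this literal does not leak into the YAML.
    header = textwrap.dedent(
        """\
        description: "Require tests when code changes"
        enabled: true
        severity: medium
        event_types:
          - pull_request
        parameters:
        """
    )
    return (
        header
        + "  source_patterns:\n"
        + as_items(source_patterns, 4) + "\n"
        + "  test_patterns:\n"
        + as_items(test_patterns, 4) + "\n"
    )


# Example patterns are assumptions for illustration.
rule = build_test_rule(["src/**/*.py"], ["tests/**/*.py"])
print(rule)
```

Keeping the indent width as a number rather than as literal spaces inside the f-string means a reformatting pass over the Python file cannot silently change the generated YAML.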

gemini-code-assist · 2025-12-23T02:59:57Z

src/agents/repository_analysis_agent/nodes.py

                """
                description: "Ensure PRs include context"
-                enabled: true
+enabled: true
                severity: low
-                event_types:
-                  - pull_request
-                parameters:
+event_types:
+  - pull_request
+    parameters:
                  min_description_length: 50
                """


Similar to the previous rule, the indentation within this f-string is incorrect and will produce invalid YAML.

                """
                description: "Ensure PRs include context"
                enabled: true
                severity: low
                event_types:
                  - pull_request
                parameters:
                  min_description_length: 50
                """

gemini-code-assist · 2025-12-23T02:59:57Z

src/agents/repository_analysis_agent/nodes.py

 def validate_recommendations(state: RepositoryAnalysisState) -> None:
    """Ensure generated YAML is valid."""
-    for rec in state.recommendations:
+        for rec in state.recommendations:


This line has an extra level of indentation, which will cause a Python IndentationError at runtime.

Suggested change

-        for rec in state.recommendations:
+    for rec in state.recommendations:

gemini-code-assist · 2025-12-23T02:59:57Z

src/agents/repository_analysis_agent/nodes.py

+def _analyze_pr_bad_habits(state: RepositoryAnalysisState) -> dict[str, Any]:
+    """
+    Analyze PR history to detect bad habits and patterns.
+
+    Returns a dict with detected issues like:
+    - missing_tests: PRs without test files
+    - short_descriptions: PRs with very short descriptions
+    - no_reviews: PRs merged without reviews
+    """
+    if not state.pr_samples:
+        return {}
+
+    issues: dict[str, Any] = {
+        "missing_tests": 0,
+        "short_descriptions": 0,
+        "no_reviews": 0,
+        "total_analyzed": len(state.pr_samples),
+    }
+
+    # Note: We can't analyze PR diffs/descriptions from the basic PR list API
+    # This would require fetching individual PR details which is expensive.
+    # For now, we return basic stats that can inform recommendations.
+
+    return issues


The new function _analyze_pr_bad_habits is currently a stub that returns zero for all issue counts. This means the new logic in _default_recommendations that adjusts confidence scores based on these counts will never be triggered, rendering it dead code for now.

Furthermore, the docstring mentions detecting short_descriptions, but the PullRequestSample model does not contain a description field, only a title. This should be clarified or corrected.

Consider implementing at least a basic analysis (e.g., checking for short PR titles) to make this feature functional.
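
A basic title-based check along the lines suggested here might look like the sketch below. The `PullRequestSample` dataclass is a hypothetical stand-in (the thread only confirms that the real model has a title), and both the `analyze_titles` name and the 10-character default threshold are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class PullRequestSample:
    # Hypothetical stand-in: only the `title` field is confirmed by the review.
    title: str


def analyze_titles(
    samples: list[PullRequestSample], min_title_length: int = 10
) -> dict[str, int]:
    """Count PR samples whose stripped title is shorter than min_title_length."""
    if not samples:
        return {}
    short = sum(1 for s in samples if len(s.title.strip()) < min_title_length)
    return {"short_titles": short, "total_analyzed": len(samples)}


samples = [
    PullRequestSample("fix"),
    PullRequestSample("Add retry logic to webhook handler"),
    PullRequestSample("  wip  "),
]
stats = analyze_titles(samples)
print(stats)
```

Returning non-zero counts is what would let the confidence adjustment in _default_recommendations actually take effect.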

gemini-code-assist · 2025-12-23T02:59:57Z

src/agents/repository_analysis_agent/nodes.py

+        for rec in recommendations:
+            if "tests" in rec.yaml_rule.lower():
+                rec.confidence = min(0.95, rec.confidence + 0.1)
+                rec.reasoning += " Contributing guidelines explicitly require tests."


The current logic for finding the test-related rule recommendation relies on a simple substring search ("tests" in rec.yaml_rule.lower()). This is brittle and could lead to incorrectly modifying another rule if its content happens to include the word "tests". A more robust approach would be to identify the rule by a more specific characteristic, such as its description, and to stop searching once the rule is found.

        for rec in recommendations:
            if 'description: "Require tests when code changes"' in rec.yaml_rule:
                rec.confidence = min(0.95, rec.confidence + 0.1)
                rec.reasoning += " Contributing guidelines explicitly require tests."
                break
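
To illustrate why matching on the description is safer than matching on the bare word "tests", here is a small self-contained sketch. The `Recommendation` dataclass, the `boost_test_rule` helper, and the first example rule are hypothetical; only the description string and the 0.1 / 0.95 values come from the suggestion above.

```python
from dataclasses import dataclass

# Exact description line of the rule we want to adjust (from the suggestion above).
TEST_RULE_MARKER = 'description: "Require tests when code changes"'


@dataclass
class Recommendation:
    # Hypothetical stand-in for the agent's recommendation model.
    yaml_rule: str
    confidence: float
    reasoning: str = ""


def boost_test_rule(recommendations: list[Recommendation]) -> bool:
    """Raise confidence on the first rule carrying the marker; report whether one was found."""
    for rec in recommendations:
        if TEST_RULE_MARKER in rec.yaml_rule:
            rec.confidence = min(0.95, rec.confidence + 0.1)
            rec.reasoning += " Contributing guidelines explicitly require tests."
            return True  # stop after the first match
    return False


recs = [
    # Mentions "tests" but is a different rule; a substring check would hit this one.
    Recommendation('description: "Block PRs that skip integration tests"\n', 0.6),
    Recommendation('description: "Require tests when code changes"\n', 0.7),
]
boosted = boost_test_rule(recs)
print(boosted, [round(r.confidence, 2) for r in recs])
```

With the substring check from the original diff, both rules in this example would have been boosted; with the marker, only the second one is.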

codecov-commenter · 2025-12-23T18:47:43Z

⚠️ Please install the Codecov GitHub app to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

❌ Patch coverage is 6.52174% with 43 lines in your changes missing coverage. Please review.

❌ Your patch status has failed because the patch coverage (6.5%) is below the target coverage (80.0%). You can increase the patch coverage or adjust the target coverage.
❌ Your project status has failed because the head coverage (32.9%) is below the target coverage (80.0%). You can increase the head coverage or adjust the target coverage.
❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

@@           Coverage Diff           @@
##            main     #33     +/-   ##
=======================================
- Coverage   33.2%   32.9%   -0.3%     
=======================================
  Files         85      85             
  Lines       5082    5122     +40     
=======================================
+ Hits        1688    1689      +1     
- Misses      3394    3433     +39

Powered by Codecov. Last update c37641e...513b5bd.

- Fix YAML indentation in rule generation (proper 4-space indentation for list items)
- Implement _analyze_pr_bad_habits to actually analyze PR samples:
  * Detect short titles (< 10 chars) as indicator of missing context
  * Estimate missing tests based on changed_files and title analysis
- Update docstring to mention 'title' instead of 'description'
- Update references from 'short_descriptions' to 'short_titles'

dkargatzis · 2025-12-24T12:15:51Z

/gemini review

gemini-code-assist

Code Review

This pull request introduces a valuable new feature for repository analysis and automated PR creation with recommended rules. The implementation is solid, with intelligent rule recommendations based on repository language, PR history, and contributing guidelines. I'm particularly impressed with the significant improvements to logging and error handling throughout the PR creation flow, which will greatly enhance debuggability and provide clearer feedback to users. The code is well-structured and the changes are easy to follow. I have one minor suggestion to improve type hinting for better code clarity and maintainability. Overall, this is an excellent contribution.

gemini-code-assist · 2025-12-24T12:16:58Z

src/agents/repository_analysis_agent/nodes.py

+def _analyze_pr_bad_habits(state: RepositoryAnalysisState) -> dict[str, Any]:
+    """
+    Analyze PR history to detect bad habits and patterns.
+
+    Returns a dict with detected issues like:
+    - missing_tests: PRs without test files (estimated based on changed_files)
+    - short_titles: PRs with very short titles (< 10 characters)
+    - no_reviews: PRs merged without reviews (always 0, as we can't determine this from list API)
+
+    Note: We can't analyze PR diffs/descriptions from the basic PR list API.
+    This would require fetching individual PR details which is expensive.
+    We analyze what we can from the PR list metadata.
+    """
+    if not state.pr_samples:
+        return {}
+
+    issues: dict[str, Any] = {
+        "missing_tests": 0,
+        "short_titles": 0,
+        "no_reviews": 0,
+        "total_analyzed": len(state.pr_samples),
+    }


The type hint dict[str, Any] is used for both the function's return value and the issues variable. Since all values in the dictionary are integers, it would be more precise to use dict[str, int]. This improves type safety and makes the code easier to understand for future maintainers.

def _analyze_pr_bad_habits(state: RepositoryAnalysisState) -> dict[str, int]:
    """
    Analyze PR history to detect bad habits and patterns.

    Returns a dict with detected issues like:
    - missing_tests: PRs without test files (estimated based on changed_files)
    - short_titles: PRs with very short titles (< 10 characters)
    - no_reviews: PRs merged without reviews (always 0, as we can't determine this from list API)

    Note: We can't analyze PR diffs/descriptions from the basic PR list API.
    This would require fetching individual PR details which is expensive.
    We analyze what we can from the PR list metadata.
    """
    if not state.pr_samples:
        return {}

    issues: dict[str, int] = {
        "missing_tests": 0,
        "short_titles": 0,
        "no_reviews": 0,
        "total_analyzed": len(state.pr_samples),
    }

naaa760 added 2 commits December 20, 2025 09:48

Fix rule YAML format and improve PR creation error handling

abb38ae

naaa760 requested a review from dkargatzis as a code owner December 23, 2025 02:58

gemini-code-assist bot reviewed Dec 23, 2025

View reviewed changes

fix: correct indentation issues in nodes.py and test mock return type

b06500b

naaa760 added 5 commits December 24, 2025 00:53

fix: format nodes.py with black

24c9b3a

fix: format nodes.py with ruff

4a3bdd4

fix: format api.py with ruff

8932233

fix: use content check instead of has_contributing attribute

513b5bd

gemini-code-assist bot reviewed Dec 24, 2025

View reviewed changes

dkargatzis approved these changes Dec 24, 2025

View reviewed changes

dkargatzis merged commit ed6e38f into warestack:main Dec 24, 2025
2 checks passed
