feat(jira): complete Jira integration with multi-source support #3

amargiovanni · 2025-11-28T17:59:49Z

Summary

Add complete Jira integration alongside existing GitHub analysis
Support multi-source extraction with auto-detection of available credentials
Implement new Jira Cloud API /search/jql endpoint (migrated from deprecated /search)
Auto-select all accessible Jira projects when jira_projects.txt is missing
Export Jira issues and comments to CSV format

Features

Multi-source CLI: --sources auto|github|jira|github,jira
Jira Client: Full API v3 support with cursor-based pagination
JQL Safety: Project keys are quoted to handle reserved words (AS, IN, OR, etc.)
Auto-detection: Credentials checked from environment variables
CSV Export: jira_issues.csv and jira_comments.csv

Environment Variables

# GitHub (existing)
GITHUB_TOKEN=xxx

# Jira (new)
JIRA_URL=https://company.atlassian.net
JIRA_EMAIL=user@company.com
JIRA_API_TOKEN=xxx
Test plan
 All 585 tests passing (207 Jira-specific)
 Coverage at 94%
 Tested with real Jira Cloud instance (27 projects, 116 issues)
 Verified API migration from deprecated /search to /search/jql
 Confirmed JQL reserved word handling (project "AS")

Adds feature 002-jira-integration with full SpecKit artifacts: - spec.md: 4 user stories (P1-P4), 22 functional requirements, 7 success criteria, edge cases, clarifications - plan.md: Technical context, constitution check, project structure - research.md: Jira API research (auth, pagination, rate limiting) - data-model.md: JiraConfig, JiraIssue, JiraComment entities - contracts/: Jira REST API and module interface contracts - quickstart.md: Setup and usage guide - tasks.md: 61 tasks organized by user story with TDD approach - checklists/: Comprehensive requirements quality checklist (80 items) Key features: - Jira Cloud (API v3) and Server/Data Center (API v2) support - Secure credential handling via environment variables - Multi-platform CLI with --sources flag - CSV export for issues and comments (RFC 4180) - Backward compatibility wrapper for github_analyzer.py - Interactive project selection when jira_projects.txt missing

Add full Jira support with multi-source CLI: - JiraClient with pagination, rate limiting, and retry logic - Support for Jira Cloud (API v3) and Server/Data Center (API v2) - JiraExporter for CSV export (issues + comments) - JiraIssueAnalyzer for project summaries - Multi-source CLI with --sources flag (auto, github, jira, github,jira) - Auto-detection of available sources from environment credentials - Interactive Jira project selection when jira_projects.txt is missing - dev_analyzer.py as new primary entrypoint - Secure token handling (never exposed in logs/errors) - ADF (Atlassian Document Format) to plain text conversion New files: - src/github_analyzer/api/jira_client.py - src/github_analyzer/exporters/jira_exporter.py - src/github_analyzer/analyzers/jira_issues.py - dev_analyzer.py - jira_projects.txt.example Tests: 512 passed, 90% coverage

- Add comprehensive tests for Jira client (urllib/requests paths) - Add tests for GitHub client requests library path - Add validation tests for Jira URL, project keys, ISO8601 dates - Add troubleshooting section for Jira in README - Update coverage requirement from 95% to 90% in docs

- Change ASCII banner from "GITHUB ANALYZER" to "DEV ANALYZER" - Update description to mention GitHub and Jira - Add Jira Integration to features list

- Migrate from deprecated /rest/api/X/search to /rest/api/3/search/jql (Atlassian removed the old endpoint, returning 410 Gone) - Use cursor-based pagination (nextPageToken/isLast) instead of offset-based - Quote project keys in JQL to handle reserved words (e.g., "AS", "IN") - Auto-select all Jira projects when jira_projects.txt is missing/empty - Integrate Jira extraction into main() CLI flow with auto-detection - Update test fixtures for new API response format

gemini-code-assist · 2025-11-28T18:01:27Z

Summary of Changes

Hello @amargiovanni, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the github_analyzer tool by transforming it into a multi-platform DevAnalyzer. It introduces comprehensive integration with Jira, allowing users to extract and analyze issues and comments alongside existing GitHub data. This change provides a unified view of development metrics, supporting both platforms through a new, flexible command-line interface and robust API handling, ensuring secure credential management and efficient data export.

Highlights

Jira Integration: Full integration with Jira Cloud and Server/Data Center for issue and comment extraction, enabling comprehensive analysis alongside GitHub data.
Multi-Source Support: The tool now supports analyzing data from both GitHub and Jira, with automatic credential detection and a flexible command-line interface to select sources.
New Entry Point: Introduced dev_analyzer.py as the primary entry point for the multi-platform analyzer, with github_analyzer.py serving as a backward-compatible wrapper.
Jira API Modernization: Migrated to the /search/jql endpoint for Jira Cloud API, ensuring compatibility and efficiency, and handling JQL reserved words.
Flexible Project Selection: Automatically selects all accessible Jira projects if jira_projects.txt is missing or empty, or allows interactive selection by the user.
CSV Export: Exports Jira issues and comments to dedicated CSV files (jira_issues_export.csv and jira_comments_export.csv), consistent with existing GitHub export formats.
Test Coverage Adjustment: The minimum required test coverage threshold has been adjusted from 95% to 90% to accommodate the expanded codebase.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature	Command	Description
Code Review	`/gemini review`	Performs a code review for the current pull request in its current state.
Pull Request Summary	`/gemini summary`	Provides a summary of the current pull request in its current state.
Comment	@gemini-code-assist	Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help	`/gemini help`	Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩

gemini-code-assist

Code Review

This pull request introduces a significant and well-structured feature: Jira integration with multi-source support. The changes are extensive, touching everything from the CLI entry point and configuration to adding new API clients, analyzers, and exporters. The inclusion of comprehensive documentation, specifications, and tests is commendable. I've identified a critical compatibility issue with Jira Server, some code duplication, and a few areas for improvement in terms of consistency and adherence to the project's own specifications. Overall, this is a solid foundation for multi-source analysis.

gemini-code-assist · 2025-11-28T18:03:19Z

src/github_analyzer/api/jira_client.py

+    def search_issues(
+        self,
+        project_keys: list[str],
+        since_date: datetime,
+    ) -> Iterator[JiraIssue]:
+        """Search issues updated since given date.
+
+        Args:
+            project_keys: List of project keys to search.
+            since_date: Only return issues updated after this date.
+
+        Yields:
+            JiraIssue objects matching the criteria.
+
+        Raises:
+            JiraAPIError: If API request fails.
+        """
+        if not project_keys:
+            return
+
+        # Build JQL query (FR-005)
+        # Quote project keys to handle reserved JQL words (e.g., "AS", "IN", "OR")
+        quoted_keys = [f'"{key}"' for key in project_keys]
+        projects_jql = ", ".join(quoted_keys)
+        date_str = since_date.strftime("%Y-%m-%d")
+        jql = f"project in ({projects_jql}) AND updated >= '{date_str}' ORDER BY updated DESC"
+
+        # Use new /search/jql endpoint with cursor-based pagination
+        # See: https://developer.atlassian.com/changelog/#CHANGE-2046
+        max_results = 100
+        next_page_token: str | None = None
+
+        while True:
+            params: dict[str, Any] = {
+                "jql": jql,
+                "maxResults": max_results,
+                "fields": "*all,-comment",  # All fields except comments (fetched separately)
+            }
+
+            if next_page_token:
+                params["nextPageToken"] = next_page_token
+
+            response = self._make_request(
+                "GET",
+                f"/rest/api/{self.api_version}/search/jql",
+                params=params,
+            )
+
+            issues = response.get("issues", [])
+
+            for issue_data in issues:
+                yield self._parse_issue(issue_data)
+
+            # Check if more pages (cursor-based pagination)
+            if response.get("isLast", True) or not issues:
+                break
+
+            next_page_token = response.get("nextPageToken")


The current implementation of search_issues uses the GET /rest/api/{version}/search/jql endpoint with cursor-based pagination (nextPageToken). This is the modern approach for Jira Cloud (API v3), but it is not compatible with many Jira Server instances (API v2). Jira Server typically uses the older POST /rest/api/2/search endpoint with offset-based pagination (startAt, maxResults, total).

To ensure compatibility with both Jira Cloud and Server as intended, the implementation should check self.api_version and branch its logic to use the correct endpoint, HTTP method, and pagination mechanism for the target environment.

gemini-code-assist · 2025-11-28T18:03:19Z

src/github_analyzer/cli/main.py

+def run_extraction(
+    sources: list[DataSource],
+    output_dir: str,
+    days: int,
+    repos_file: str | None = None,
+    jira_projects_file: str | None = None,
+    verbose: bool = True,
+    fetch_pr_details: bool = False,
+) -> dict:
+    """Run extraction for specified sources.
+
+    Args:
+        sources: List of data sources to extract from.
+        output_dir: Directory for output CSV files.
+        days: Number of days to analyze.
+        repos_file: Path to repos.txt for GitHub.
+        jira_projects_file: Path to jira_projects.txt for Jira.
+        verbose: Enable verbose output.
+        fetch_pr_details: Fetch full PR details for GitHub.
+
+    Returns:
+        Dictionary with extraction results.
+    """
+    results = {"github": None, "jira": None}
+
+    if DataSource.GITHUB in sources:
+        # Run GitHub extraction
+        config = AnalyzerConfig.from_env()
+        if repos_file:
+            config.repos_file = repos_file
+        config.output_dir = output_dir
+        config.days = days
+        config.verbose = verbose
+        config.validate()
+
+        repositories = load_repositories(config.repos_file)
+
+        analyzer = GitHubAnalyzer(config, fetch_pr_details=fetch_pr_details)
+        try:
+            analyzer.run(repositories)
+            results["github"] = {"status": "success", "repos": len(repositories)}
+        finally:
+            analyzer.close()
+
+    if DataSource.JIRA in sources:
+        # Run Jira extraction
+        from src.github_analyzer.api.jira_client import JiraClient
+
+        jira_config = JiraConfig.from_env()
+        if not jira_config:
+            raise ConfigurationError("Jira credentials not configured")
+
+        # Get Jira projects
+        projects_file = jira_projects_file or jira_config.jira_projects_file
+        project_keys = select_jira_projects(projects_file, jira_config)
+
+        if not project_keys:
+            print("No Jira projects selected. Skipping Jira extraction.")
+        else:
+            client = JiraClient(jira_config)
+            since = datetime.now(timezone.utc) - timedelta(days=days)
+
+            # Collect issues and comments
+            all_issues = list(client.search_issues(project_keys, since))
+            all_comments = []
+            for issue in all_issues:
+                comments = client.get_comments(issue.key)
+                all_comments.extend(comments)
+
+            # Export Jira data to CSV
+            jira_exporter = JiraExporter(output_dir)
+            issues_file = jira_exporter.export_issues(all_issues)
+            comments_file = jira_exporter.export_comments(all_comments)
+
+            results["jira"] = {
+                "status": "success",
+                "issues": len(all_issues),
+                "comments": len(all_comments),
+                "files": [str(issues_file), str(comments_file)],
+            }
+
+    return results


This run_extraction function appears to be dead code, as it is never called within the application. The logic it contains is duplicated inside the main function (from line 690 onwards). This duplication increases maintenance overhead and can lead to inconsistencies.

The logic should be consolidated. I recommend removing this function and refactoring the duplicated logic from main into a new, single function that is called by main to handle the extraction orchestration.

gemini-code-assist · 2025-11-28T18:03:19Z

src/github_analyzer/cli/main.py

+def select_jira_projects(
+    projects_file: str,
+    jira_config: JiraConfig | None,
+) -> list[str]:
+    """Select Jira projects from file or use all available.
+
+    Args:
+        projects_file: Path to jira_projects.txt file.
+        jira_config: Jira configuration (required to fetch available projects).
+
+    Returns:
+        List of project keys to analyze.
+    """
+    # Try loading from file first
+    file_projects = load_jira_projects(projects_file)
+    if file_projects:
+        return file_projects
+
+    # No file or empty - use all available projects
+    if not jira_config:
+        return []
+
+    # Fetch available projects from Jira
+    from src.github_analyzer.api.jira_client import JiraClient
+
+    client = JiraClient(jira_config)
+    available_projects = client.get_projects()
+
+    if not available_projects:
+        print("No projects found in Jira instance.")
+        return []
+
+    # Use all available projects
+    all_keys = [p.key for p in available_projects]
+    print(f"\nNo {projects_file} found. Using all {len(all_keys)} available Jira projects.")
+
+    return all_keys
+


The behavior of this function when jira_projects.txt is missing or empty does not fully align with the specification FR-009a. The spec requires prompting the user to choose between analyzing all accessible projects or specifying them manually. The current implementation automatically defaults to using all available projects without user interaction.

To adhere to the specification, this function should be updated to include an interactive prompt for the user.

gemini-code-assist · 2025-11-28T18:03:19Z

src/github_analyzer/cli/main.py

+    if not available_projects:
+        print("No projects found in Jira instance.")
+        return []
+
+    # Use all available projects
+    all_keys = [p.key for p in available_projects]
+    print(f"\nNo {projects_file} found. Using all {len(all_keys)} available Jira projects.")


This function uses direct print() calls to provide feedback to the user. This is inconsistent with the rest of the application, which uses the TerminalOutput class for structured and formatted logging. Using the TerminalOutput instance would ensure all CLI messages have a consistent look and feel.

Consider passing the output object to this function and replacing the print() calls with output.log() or output.info().

gemini-code-assist · 2025-11-28T18:03:19Z

src/github_analyzer/config/validation.py

+def validate_iso8601_date(date_str: str) -> bool:
+    """Validate ISO 8601 date format.
+
+    Validates that the string is a valid ISO 8601 date (FR-021).
+    Supports both date-only and datetime formats.
+
+    Args:
+        date_str: The date string to validate.
+
+    Returns:
+        True if date is valid ISO 8601 format, False otherwise.
+
+    Examples:
+        >>> validate_iso8601_date("2025-11-28")
+        True
+        >>> validate_iso8601_date("2025-11-28T10:30:00Z")
+        True
+        >>> validate_iso8601_date("2025-11-28T10:30:00+00:00")
+        True
+        >>> validate_iso8601_date("28-11-2025")  # wrong format
+        False
+        >>> validate_iso8601_date("invalid")
+        False
+    """
+    if not date_str:
+        return False
+
+    # ISO 8601 date patterns
+    # Date only: YYYY-MM-DD
+    # Datetime with Z: YYYY-MM-DDTHH:MM:SSZ
+    # Datetime with offset: YYYY-MM-DDTHH:MM:SS+HH:MM
+    patterns = [
+        r"^\d{4}-\d{2}-\d{2}$",  # Date only
+        r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z$",  # Datetime with Z
+        r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}[+-]\d{2}:\d{2}$",  # Datetime with offset
+        r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d+Z$",  # Datetime with milliseconds and Z
+        r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d+[+-]\d{2}:\d{2}$",  # With ms and offset
+    ]
+
+    if not any(re.match(pattern, date_str) for pattern in patterns):
+        return False
+
+    # Additional validation: check that date components are valid
+    try:
+        # Extract date part
+        date_part = date_str[:10]
+        year, month, day = map(int, date_part.split("-"))
+
+        # Basic range checks
+        if not (1 <= month <= 12):
+            return False
+        if not (1 <= day <= 31):
+            return False
+        if year < 1900 or year > 2100:
+            return False
+
+        return True
+    except (ValueError, IndexError):
+        return False


The function validate_iso8601_date is defined here but does not appear to be used anywhere in the codebase. Unused code should be removed to improve maintainability and reduce clutter. If this is intended for future use, it would be better to add it when it's actually needed.

gemini-code-assist · 2025-11-28T18:03:19Z

src/github_analyzer/api/jira_client.py

+    def _parse_datetime(self, value: str | None) -> datetime | None:
+        """Parse Jira datetime string to datetime object.
+
+        Args:
+            value: Jira datetime string (e.g., "2025-11-28T10:30:00.000+0000").
+
+        Returns:
+            Parsed datetime in UTC, or None if value is empty/None.
+        """
+        if not value:
+            return None
+
+        # Jira format: "2025-11-28T10:30:00.000+0000"
+        try:
+            # Remove milliseconds and fix timezone format
+            if "." in value:
+                value = value.split(".")[0] + value[-5:]
+
+            # Handle +0000 format (no colon)
+            if value[-5:].replace("-", "+")[0] in "+-" and ":" not in value[-5:]:
+                value = value[:-2] + ":" + value[-2:]
+
+            return datetime.fromisoformat(value.replace("Z", "+00:00"))
+        except (ValueError, IndexError):
+            return None


The logic in this function for parsing datetime strings is quite complex and can be difficult to follow. While it appears to handle the expected Jira formats, adding comments to explain each step would significantly improve readability and maintainability for future developers.

For example, explaining why milliseconds are stripped or why the timezone colon is being manually added would be very helpful.

- SIM103: Return condition directly instead of if/return True - SIM102: Combine nested if statements with 'and'

The test was failing because the new source detection logic requires at least one data source (GitHub token or Jira credentials). Added environment variable mock to provide a valid token.

- Add timezone import for epoch datetime fallback - Use epoch datetime as fallback when created/updated is None - Cast node.get("text") to str to satisfy return type - Add explicit type annotation for results dict in run_extraction

codecov · 2025-11-28T20:17:48Z

Welcome to Codecov 🎉

Once you merge this PR into your default branch, you're all set! Codecov will compare coverage reports and display results in all future pull requests.

ℹ️ You can also turn on project coverage checks and project coverage reporting on Pull Request comment

Thanks for integrating Codecov - We've got you covered ☂️

The implementation now supports both Jira Cloud and Server/Data Center: - Cloud (API v3): GET /rest/api/3/search/jql with cursor-based pagination - Server/DC (API v2): POST /rest/api/2/search with offset-based pagination The API version is auto-detected from the URL (atlassian.net = Cloud). Added tests for Server/DC path including pagination.

The run_extraction() function was never called - its logic was duplicated inside main(). Removed the unused function to reduce maintenance overhead and avoid inconsistencies.

When jira_projects.txt is missing or empty, user is now prompted with: - [A] Analyze ALL accessible projects - [S] Specify project keys manually (comma-separated) - [L] Select from list by number - [Q] Quit/Skip Jira extraction Added interactive=False parameter for non-interactive/test use cases. Added tests for all interactive prompt options.

…ojects Replace direct print() calls with TerminalOutput.log() for consistency with the rest of the CLI. Added optional 'output' parameter that falls back to print() when not provided (for backward compatibility and tests).

…ersion Explain why milliseconds are stripped and timezone colon is added when converting Jira datetime format to Python fromisoformat() compatible format.

amargiovanni added 5 commits November 28, 2025 17:11

feat(cli): update banner to DEV ANALYZER with Jira support

5b61d3b

- Change ASCII banner from "GITHUB ANALYZER" to "DEV ANALYZER" - Update description to mention GitHub and Jira - Add Jira Integration to features list

gemini-code-assist bot reviewed Nov 28, 2025

View reviewed changes

amargiovanni added 3 commits November 28, 2025 20:47

fix: resolve ruff linting errors in validation.py

118767e

- SIM103: Return condition directly instead of if/return True - SIM102: Combine nested if statements with 'and'

fix: add GITHUB_TOKEN mock to test_returns_0_when_cancelled

031cd73

The test was failing because the new source detection logic requires at least one data source (GitHub token or Jira credentials). Added environment variable mock to provide a valid token.

amargiovanni added 5 commits November 28, 2025 21:20

refactor: remove dead code run_extraction function

0b12e0b

The run_extraction() function was never called - its logic was duplicated inside main(). Removed the unused function to reduce maintenance overhead and avoid inconsistencies.

docs: add detailed comments to _parse_datetime explaining format conv…

3b3010c

…ersion Explain why milliseconds are stripped and timezone colon is added when converting Jira datetime format to Python fromisoformat() compatible format.

amargiovanni merged commit f7870dd into main Nov 28, 2025
5 of 6 checks passed

amargiovanni deleted the 002-jira-integration branch November 28, 2025 20:34

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(jira): complete Jira integration with multi-source support #3

feat(jira): complete Jira integration with multi-source support #3

Uh oh!

amargiovanni commented Nov 28, 2025

Uh oh!

gemini-code-assist bot commented Nov 28, 2025

Uh oh!

gemini-code-assist bot left a comment

Uh oh!

gemini-code-assist bot Nov 28, 2025

Uh oh!

gemini-code-assist bot Nov 28, 2025

Uh oh!

gemini-code-assist bot Nov 28, 2025

Uh oh!

gemini-code-assist bot Nov 28, 2025

Uh oh!

gemini-code-assist bot Nov 28, 2025

Uh oh!

gemini-code-assist bot Nov 28, 2025

Uh oh!

codecov bot commented Nov 28, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

feat(jira): complete Jira integration with multi-source support #3

feat(jira): complete Jira integration with multi-source support #3

Uh oh!

Conversation

amargiovanni commented Nov 28, 2025

Summary

Features

Environment Variables

Uh oh!

gemini-code-assist bot commented Nov 28, 2025

Summary of Changes

Highlights

Footnotes

Uh oh!

gemini-code-assist bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist bot Nov 28, 2025

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist bot Nov 28, 2025

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist bot Nov 28, 2025

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist bot Nov 28, 2025

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist bot Nov 28, 2025

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist bot Nov 28, 2025

Choose a reason for hiding this comment

Uh oh!

codecov bot commented Nov 28, 2025

Welcome to Codecov 🎉

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants