Skip to content

Conversation

@sosiouxme
Copy link
Member

@sosiouxme sosiouxme commented Jan 13, 2026

During the fetchdata CronJob, read the labels applied in BQ and include them in the job run "labels" array entry. This imports job labels with sippy job runs so they will at least show up on the job runs page.

The first commit is a minor refactor that should change nothing but style. The second is the commit to focus on.

Summary by CodeRabbit

  • New Features

    • Added BigQuery-backed label gathering for prow job runs and integrated labels into job run records.
    • Introduced a consolidated pathway for processing job runs from storage, including test extraction and label population.
  • Refactor

    • Centralized prow job creation/update and coordinated caching/locking to avoid duplicate processing and simplify DB writes.
    • Streamlined job run insertion to include labels and preserve batched test writes.

✏️ Tip: You can customize this high-level summary in your review settings.

@openshift-ci-robot
Copy link

Pipeline controller notification
This repo is configured to use the pipeline controller. Second-stage tests will be triggered either automatically or after lgtm label is added, depending on the repository configuration. The pipeline controller will automatically detect which contexts are required and will utilize /test Prow commands to trigger the second stage.

For optional jobs, comment /test ? to see a list of all defined jobs. To trigger manually all jobs from second stage use /pipeline required command.

This repository is configured in: automatic mode

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Jan 13, 2026
@openshift-ci-robot
Copy link

openshift-ci-robot commented Jan 13, 2026

@sosiouxme: This pull request references TRT-2469 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set.

Details

In response to this:

During the fetchdata CronJob, read the labels applied in BQ and include them in the job run "labels" array entry.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Jan 13, 2026

Walkthrough

Reworked prow job run processing: removed direct GCS-driven cluster data unmarshalling, added BigQuery-backed label gathering (GatherLabelsFromBQ), centralized ProwJob create/update (createOrUpdateProwJob), and introduced processGCSBucketJobRun to handle test extraction, label retrieval, and DB writes for ProwJobRun records.

Changes

Cohort / File(s) Summary
Prow loader core
pkg/dataloader/prowloader/prow.go
Removed in-file GCS-based GetClusterData usage. Added exported GatherLabelsFromBQ(ctx, bqClient, buildID, startTime) and constants LabelsDatasetEnv and LabelsTableName. Introduced createOrUpdateProwJob() and processGCSBucketJobRun(); refactored prowJobToJobRun() to use caching, centralize ProwJob creation/update, early-return for cached runs, invoke BigQuery labels, and include Labels in ProwJobRun DB insertion. Added imports for os and github.com/lib/pq.

Sequence Diagram

sequenceDiagram
    participant Caller
    participant ProwLoader as Prow Loader
    participant GCS as GCS Bucket
    participant BQ as BigQuery
    participant DB as Database

    Caller->>ProwLoader: prowJobToJobRun(pj, release)
    alt Job Run Already Processed (cache)
        ProwLoader-->>Caller: return
    else New Run
        ProwLoader->>ProwLoader: createOrUpdateProwJob(pj)
        ProwLoader->>DB: Create/Update ProwJob
        DB-->>ProwLoader: dbProwJob

        ProwLoader->>GCS: fetch junit / artifacts
        GCS-->>ProwLoader: test data, buildID, startTime

        ProwLoader->>BQ: GatherLabelsFromBQ(buildID, startTime)
        BQ-->>ProwLoader: labels array

        ProwLoader->>DB: Create ProwJobRun (with Labels)
        ProwLoader->>DB: Batch insert test rows
        DB-->>ProwLoader: insert confirmations

        ProwLoader-->>Caller: done
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes


Important

Pre-merge checks failed

Please resolve all errors before merging. Addressing warnings is optional.

❌ Failed checks (1 error, 3 warnings)
Check name Status Explanation Resolution
Sql Injection Prevention ❌ Error The GatherLabelsFromBQ function constructs the table name by concatenating the dataset name from an environment variable directly into the SQL query without proper parameterization, enabling SQL injection attacks. Replace table name string concatenation with proper parameter binding or validate the dataset name against a whitelist of allowed values before query construction.
Docstring Coverage ⚠️ Warning Docstring coverage is 33.33% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Go Error Handling ⚠️ Warning Code has two Go error handling violations: fmt.Errorf missing %w wrapper at line 314 and panic at line 1166 outside init() for recoverable condition. Replace fmt.Errorf with %w wrapper at line 314 and replace panic with proper error handling at line 1166 to allow graceful error propagation.
Single Responsibility And Clear Naming ⚠️ Warning The processGCSBucketJobRun method violates single responsibility by bundling test extraction, label gathering, DB insertion, and test batching into one method with 6 parameters and uses the generic verb 'process' instead of specific action-oriented names. Refactor processGCSBucketJobRun into smaller focused methods with specific names like extractTestsFromGCS, insertProwJobRunWithLabels, and batchInsertTests to separate concerns and reduce parameters.
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'TRT-2469: Read/write job labels during fetchdata' accurately describes the main change: integrating BigQuery-based label reading/writing into the fetchdata flow, which is the primary purpose of the pull request.
Excessive Css In React Should Use Styles ✅ Passed This custom check targets React components with excessive inline CSS styling. The pull request modifies a Go backend file with no React, JavaScript, or CSS code, making this check irrelevant.
✨ Finishing touches
  • 📝 Generate docstrings

🧹 Recent nitpick comments
pkg/dataloader/prowloader/prow.go (1)

927-927: Remove .Debug() call before merging.

The .Debug() method will log all SQL statements for this batch insert, which is likely leftover from development. This could produce excessive log output in production.

Proposed fix
-	err = pl.dbc.DB.WithContext(ctx).Debug().CreateInBatches(tests, 1000).Error
+	err = pl.dbc.DB.WithContext(ctx).CreateInBatches(tests, 1000).Error
📜 Recent review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to data retention organization setting

📥 Commits

Reviewing files that changed from the base of the PR and between 8c08b21 and c17f208.

📒 Files selected for processing (1)
  • pkg/dataloader/prowloader/prow.go
🧰 Additional context used
🧬 Code graph analysis (1)
pkg/dataloader/prowloader/prow.go (1)
pkg/db/models/prow.go (2)
  • ProwJob (15-26)
  • ProwJobRun (36-66)
🔇 Additional comments (7)
pkg/dataloader/prowloader/prow.go (7)

10-10: LGTM!

The new imports for os and pq are correctly added to support the label gathering functionality.

Also applies to: 23-23


597-606: LGTM!

The refactored pending comments handling is clean. Error handling correctly ignores gorm.ErrRecordNotFound while logging other errors.


822-842: LGTM!

Good refactoring that improves code organization:

  • Lock handling is correct with immediate unlock after cache operations
  • Read lock (RLock) appropriately used for cache lookup
  • Early return for already-processed job runs is efficient

845-882: LGTM!

Clean extraction of ProwJob creation/update logic. The upsert pattern with clause.OnConflict{UpdateAll: true} correctly handles concurrent creation attempts, and the cache is properly maintained after DB operations.


892-895: Consider whether label fetch failure should be non-fatal.

If BigQuery is temporarily unavailable, GatherLabelsFromBQ will return an error causing the entire job run import to fail. Since labels are metadata enrichment rather than core data, you may want to log a warning and continue with empty labels instead.

If this strict error handling is intentional (e.g., labels are critical for downstream processing), please disregard. Otherwise, consider:

Proposed resilient approach
 	labels, err := GatherLabelsFromBQ(ctx, pl.bigQueryClient, pj.Status.BuildID, pj.Status.StartTime)
 	if err != nil {
-		return err
+		log.WithField("buildID", pj.Status.BuildID).WithError(err).Warning("failed to gather labels from BQ, continuing without labels")
+		labels = nil
 	}

1040-1087: LGTM!

The GatherLabelsFromBQ function is well-structured:

  • Graceful nil handling when bqClient is nil
  • Proper use of parameterized queries for user-supplied values (BuildID, StartTime)
  • Correct handling of empty results via iterator.Done
  • The table identifier uses backticks for proper BigQuery escaping

The environment variable fallback for the dataset is a good pattern for configuration flexibility.


902-918: LGTM!

The ProwJobRun creation correctly integrates the new Labels field. The data flow from BigQuery through to database storage is clean and consistent with the model definition.

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 golangci-lint (2.5.0)

Error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions
The command is terminated due to an error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions


Comment @coderabbitai help to get the list of available commands and usage tips.

@openshift-ci openshift-ci bot requested review from deads2k and neisw January 13, 2026 22:46
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Jan 13, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: sosiouxme

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 13, 2026
@openshift-ci-robot
Copy link

openshift-ci-robot commented Jan 13, 2026

@sosiouxme: This pull request references TRT-2469 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set.

Details

In response to this:

During the fetchdata CronJob, read the labels applied in BQ and include them in the job run "labels" array entry.

Summary by CodeRabbit

  • Refactor
  • Improved label retrieval for prow job runs by integrating BigQuery-based data gathering, enabling more reliable label handling in job run processing.

✏️ Tip: You can customize this high-level summary in your review settings.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In @pkg/dataloader/prowloader/prow.go:
- Around line 892-895: The call to gatherLabelsFromBQ currently aborts
processing by returning the error; instead, catch the error, log it via the
prowloader's logger on pl (e.g., pl.logger or pl.log) with context that label
gathering failed, then set labels = []string{} and continue processing so a
transient/missing BQ issue won't fail the entire import; update the block around
gatherLabelsFromBQ(pj) to log the error and continue rather than return it.
🧹 Nitpick comments (1)
pkg/dataloader/prowloader/prow.go (1)

927-927: Remove Debug() call before merging.

The .Debug() call on GORM will log all SQL queries, which can be noisy in production. Consider removing it or gating it behind a debug flag.

Proposed fix
-	err = pl.dbc.DB.WithContext(ctx).Debug().CreateInBatches(tests, 1000).Error
+	err = pl.dbc.DB.WithContext(ctx).CreateInBatches(tests, 1000).Error
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to data retention organization setting

📥 Commits

Reviewing files that changed from the base of the PR and between 06f520b and 8c08b21.

📒 Files selected for processing (1)
  • pkg/dataloader/prowloader/prow.go
🧰 Additional context used
🧬 Code graph analysis (1)
pkg/dataloader/prowloader/prow.go (3)
pkg/db/models/prow.go (2)
  • ProwJob (15-26)
  • ProwJobRun (36-66)
pkg/db/db.go (1)
  • DB (27-33)
pkg/apis/sippyprocessing/v1/types.go (1)
  • JobSucceeded (72-72)
🔇 Additional comments (6)
pkg/dataloader/prowloader/prow.go (6)

10-10: LGTM!

The new imports for os (environment variable access) and github.com/lib/pq (PostgreSQL array types) are correctly added and necessary for the new label-gathering functionality.

Also applies to: 23-23


597-606: LGTM!

The refactored pending comments handling is cleaner with proper error handling for gorm.ErrRecordNotFound.


822-842: LGTM!

The refactored flow correctly separates concerns:

  1. ProwJob creation/update with proper lock handling
  2. Cache check with read lock
  3. GCS bucket processing delegation

The lock release before error checking is the correct pattern here.


845-882: LGTM!

Clean extraction of the ProwJob creation/update logic into a reusable helper. The return signature (*models.ProwJob, error) properly enables the caller to get the created/updated job for further use.


1040-1089: LGTM!

The label gathering implementation is well-structured:

  • Graceful handling when BigQuery client is unavailable (returns nil, nil)
  • Safe query construction - dataset and table name are server-controlled, not user input
  • Parameterized queries for user-facing values (BuildID, StartTime)
  • Correct BigQuery iterator pattern with iterator.Done check

The DATE(prowjob_start) filter is a good practice for BigQuery partition pruning.


1052-1055: Verify the fallback dataset behavior is intentional.

When JOB_LABELS_DATASET is not set, the code falls back to pl.bigQueryClient.Dataset. Ensure this is the intended behavior and that the job_labels table exists in the default dataset, or document that the env var must be set for label gathering to work.

Comment on lines 892 to 895
labels, err := pl.gatherLabelsFromBQ(pj)
if err != nil {
return err
}
Copy link
Contributor

@coderabbitai coderabbitai bot Jan 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

# First, let's look at the actual code around lines 892-895
cat -n pkg/dataloader/prowloader/prow.go | sed -n '884,932p'

Repository: openshift/sippy

Length of output: 1855


🏁 Script executed:

# Let's also find and examine the gatherLabelsFromBQ function
ast-grep --pattern 'func ($_ *$_) gatherLabelsFromBQ($_) {
  $$$
}'

Repository: openshift/sippy

Length of output: 41


🏁 Script executed:

# Search for how labels are used and handled throughout the codebase
rg "gatherLabelsFromBQ|job_labels" -A 3 -B 3 --no-heading

Repository: openshift/sippy

Length of output: 5589


🏁 Script executed:

# Find the full implementation of gatherLabelsFromBQ
rg "func.*gatherLabelsFromBQ" -A 30 --no-heading

Repository: openshift/sippy

Length of output: 1968


🏁 Script executed:

# Check how labels are defined in the ProwJobRun model
rg "type ProwJobRun struct" -A 50 | grep -A 50 "type ProwJobRun"

Repository: openshift/sippy

Length of output: 3243


🏁 Script executed:

# Search for other places where gatherLabelsFromBQ might be called or tested
rg "gatherLabelsFromBQ" -B 2 -A 2

Repository: openshift/sippy

Length of output: 927


🏁 Script executed:

# Get the complete gatherLabelsFromBQ function implementation
rg "func.*gatherLabelsFromBQ" -A 50 | head -70

Repository: openshift/sippy

Length of output: 3218


🏁 Script executed:

# Check if there are any tests that exercise error cases for gatherLabelsFromBQ
rg "gatherLabelsFromBQ|TestGatherLabels" -B 2 -A 5

Repository: openshift/sippy

Length of output: 1222


🏁 Script executed:

# Look for documentation or comments about label criticality
rg "Labels|job_labels" -B 1 -A 1 | grep -A 2 -B 2 "optional\|required\|critical\|essential"

Repository: openshift/sippy

Length of output: 50373


Consider whether label gathering failures should halt job run processing.

Currently, if gatherLabelsFromBQ returns an error, the entire job run import fails. Since labels are populated from optional symptom detection or manual annotation and the function already gracefully handles missing BigQuery clients, you should consider logging the error and continuing with an empty labels array instead. This approach would make the import more resilient to transient BigQuery issues or schema differences across environments.

🤖 Prompt for AI Agents
In @pkg/dataloader/prowloader/prow.go around lines 892 - 895, The call to
gatherLabelsFromBQ currently aborts processing by returning the error; instead,
catch the error, log it via the prowloader's logger on pl (e.g., pl.logger or
pl.log) with context that label gathering failed, then set labels = []string{}
and continue processing so a transient/missing BQ issue won't fail the entire
import; update the block around gatherLabelsFromBQ(pj) to log the error and
continue rather than return it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that's what I want. If something is transiently wrong with querying BQ, I would like to fail the run and try again the next time the cron job runs; otherwise the job import will be considered complete and will never get the labels. But this is a useful comment for a reviewer to consider whether I have it right here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sosiouxme That makes sense! Your reasoning is correct: given the caching mechanism in prowJobRunCache, once a job run ID is recorded as processed, it won't be re-attempted on subsequent cron executions.

The current implementation aligns with your goal:

  1. If gatherLabelsFromBQ fails → error is returned
  2. Job run is NOT inserted into the database
  3. Job run is NOT added to prowJobRunCache
  4. Next cron run will retry the same job run, including label gathering

This prioritizes data completeness over resilience to transient errors, which is a valid design choice for this use case where labels are an integral part of the job run record. The trade-off is that a prolonged BigQuery outage would block job run imports, but that seems acceptable given labels are meant to be part of the permanent record.

@openshift-ci-robot
Copy link

Scheduling required tests:
/test e2e

prepare for expanding the loader to handle job labels.

from too-long method prowJobToJobRun break out methods
createOrUpdateProwJob and processGCSBucketJobRun.

remove GetClusterData which has no evident caller in sippy or cloud
functions.

other minor simplifications.
@sosiouxme sosiouxme force-pushed the 20260112-TRT-2469-fetchdata-labels branch from 8c08b21 to c17f208 Compare January 13, 2026 23:50
@openshift-ci-robot
Copy link

openshift-ci-robot commented Jan 13, 2026

@sosiouxme: This pull request references TRT-2469 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set.

Details

In response to this:

During the fetchdata CronJob, read the labels applied in BQ and include them in the job run "labels" array entry. The first commit is a minor refactor that should change nothing but style. The second is the commit to focus on.

Summary by CodeRabbit

  • Refactor
  • Improved label retrieval for prow job runs by integrating BigQuery-based data gathering, enabling more reliable label handling in job run processing.

✏️ Tip: You can customize this high-level summary in your review settings.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci-robot
Copy link

openshift-ci-robot commented Jan 13, 2026

@sosiouxme: This pull request references TRT-2469 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set.

Details

In response to this:

During the fetchdata CronJob, read the labels applied in BQ and include them in the job run "labels" array entry. The first commit is a minor refactor that should change nothing but style. The second is the commit to focus on.

Summary by CodeRabbit

  • New Features

  • Added BigQuery-backed label gathering for prow job runs and integrated labels into job run records.

  • Introduced a consolidated pathway for processing job runs from storage, including test extraction and label population.

  • Refactor

  • Centralized prow job creation/update and coordinated caching/locking to avoid duplicate processing and simplify DB writes.

  • Streamlined job run insertion to include labels and preserve batched test writes.

✏️ Tip: You can customize this high-level summary in your review settings.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci-robot
Copy link

openshift-ci-robot commented Jan 13, 2026

@sosiouxme: This pull request references TRT-2469 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set.

Details

In response to this:

During the fetchdata CronJob, read the labels applied in BQ and include them in the job run "labels" array entry. This imports job labels with sippy job runs so they will at least show up on the job runs page.

The first commit is a minor refactor that should change nothing but style. The second is the commit to focus on.

Summary by CodeRabbit

  • New Features

  • Added BigQuery-backed label gathering for prow job runs and integrated labels into job run records.

  • Introduced a consolidated pathway for processing job runs from storage, including test extraction and label population.

  • Refactor

  • Centralized prow job creation/update and coordinated caching/locking to avoid duplicate processing and simplify DB writes.

  • Streamlined job run insertion to include labels and preserve batched test writes.

✏️ Tip: You can customize this high-level summary in your review settings.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci-robot
Copy link

Scheduling required tests:
/test e2e

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Jan 14, 2026

@sosiouxme: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants