
Conversation

@dsariel dsariel commented May 2, 2025

Test operator reports scraper

- Connects to the downstream Zuul server
- Gets tempest reports for failed jobs in the openstack-uni-jobs-periodic-integration-rhoso18.0-rhel9
  pipeline, with a default cutoff date of 14 days
- Creates a point in the Qdrant DB for every failed test/traceback (see the sketch below)

TODO (will be implemented as a separate PR):
 - Tobiko tests parser
 - Add an argument to the scraper that skips the stage of connecting to the Zuul server; it will be
   used for storing test/traceback points directly from the Zuul jobs

Co-authored-by: lpiwowar <lpiwowar@users.noreply.github.com>
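
For reference, a minimal sketch of the scraping flow described above, assuming the Zuul REST builds endpoint and a local Qdrant instance; the server URL, tenant, collection name, and embedding function are illustrative placeholders, not the PR's actual code:

```python
# Sketch only: URL, tenant, collection name, and embedding are assumptions.
from datetime import datetime, timedelta, timezone
import uuid

import requests
from qdrant_client import QdrantClient
from qdrant_client.models import PointStruct

ZUUL_API = "https://zuul.example.com/api/tenant/example-tenant"  # assumed server/tenant
PIPELINE = "openstack-uni-jobs-periodic-integration-rhoso18.0-rhel9"
CUTOFF_DAYS = 14


def failed_builds(cutoff_days: int = CUTOFF_DAYS):
    """Yield failed builds in the pipeline that finished within the cutoff window."""
    cutoff = datetime.now(timezone.utc) - timedelta(days=cutoff_days)
    resp = requests.get(
        f"{ZUUL_API}/builds",
        params={"pipeline": PIPELINE, "result": "FAILURE", "limit": 100},
        timeout=30,
    )
    resp.raise_for_status()
    for build in resp.json():
        end_time = datetime.fromisoformat(build["end_time"]).replace(tzinfo=timezone.utc)
        if end_time >= cutoff:
            yield build


def store_failures(failures, embed):
    """Create one Qdrant point per failed test/traceback pair.

    `failures` is an iterable of (test_name, traceback) tuples extracted from the
    tempest report of a failed build; `embed` turns a traceback into a vector.
    """
    client = QdrantClient(url="http://localhost:6333")  # assumed local instance
    points = [
        PointStruct(
            id=str(uuid.uuid4()),
            vector=embed(traceback),
            payload={"test": test_name, "traceback": traceback},
        )
        for test_name, traceback in failures
    ]
    client.upsert(collection_name="tempest_failures", points=points)
```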

@dsariel dsariel requested a review from lpiwowar May 2, 2025 10:39
@dsariel dsariel self-assigned this May 2, 2025
@dsariel dsariel force-pushed the tempest_scraper branch 2 times, most recently from 3de8b89 to 92cb93e on May 6, 2025 11:38
@dsariel dsariel changed the title from "[DNM] tempest scraper" to "Test operator reports scraper" May 6, 2025
dsariel added a commit that referenced this pull request May 6, 2025
Completes the part of scraping test operator logs by storing points into the Qdrant DB.
Started in #46
dsariel added a commit that referenced this pull request May 6, 2025
Completes the part of scraping test operator logs by storing points into the Qdrant DB.
Started in #46
dsariel added a commit that referenced this pull request May 6, 2025
Completes the part of scraping test operator logs by storing points into the Qdrant DB.
Started in #46
@dsariel dsariel force-pushed the tempest_scraper branch 3 times, most recently from be41c32 to 7ab5eab on May 7, 2025 11:54
@dsariel dsariel force-pushed the tempest_scraper branch 2 times, most recently from a8f8780 to 9037eca on May 7, 2025 12:03
@lpiwowar lpiwowar left a comment

Apologies, I did not manage to review this entirely.

I was able to run it locally and populate the vector database, though 🎉 (I limited the scraping to only 5 jobs), and the generated data looked reasonable! :)

I would need a bit more time to merge this with a clear conscience (to investigate the parsing and the rest of the code a bit more carefully). But if somebody wants to merge it, especially since it is a bit urgent, I won't object; we can always refactor and polish later.

@dsariel dsariel force-pushed the tempest_scraper branch from 9037eca to cebd81f on May 8, 2025 09:33
@dsariel dsariel commented May 8, 2025

> Apologies, I did not manage to review this entirely.
>
> I was able to run it locally and populate the vector database, though 🎉 (I limited the scraping to only 5 jobs), and the generated data looked reasonable! :)
>
> I would need a bit more time to merge this with a clear conscience (to investigate the parsing and the rest of the code a bit more carefully). But if somebody wants to merge it, especially since it is a bit urgent, I won't object; we can always refactor and polish later.

No problem. Meanwhile, I have added two additional pipelines and an additional argument so that this script can be used from Zuul pipelines.
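
For context, a hypothetical sketch of what such a switch could look like; the flag name `--skip-zuul` and its wiring are assumptions, not the actual argument added in this PR:

```python
# Hypothetical illustration only; the real argument name in the PR may differ.
import argparse

parser = argparse.ArgumentParser(description="test-operator reports scraper")
parser.add_argument(
    "--skip-zuul",
    action="store_true",
    help="Skip connecting to the Zuul server and store test/traceback points "
         "provided directly by the running Zuul job.",
)
args = parser.parse_args()

if args.skip_zuul:
    # Only the Qdrant-storage stage runs; the build-discovery stage is skipped.
    pass
```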

@dsariel dsariel force-pushed the tempest_scraper branch from 48b8cd1 to 6988a42 on May 8, 2025 10:41
@dsariel dsariel requested review from EmilienM, jpodivin, rabi and sbekkerm May 8, 2025 10:45
dsariel and others added 2 commits May 8, 2025 14:12
- Connects to the downstream Zuul server
- Gets tempest reports for failed jobs in the openstack-uni-jobs-periodic-integration-rhoso18.0-rhel9
  pipeline, with a default cutoff date of 14 days
- Creates a point in the Qdrant DB for every failed test/traceback

TODO (will be implemented as a separate PR):
 - Tobiko tests parser
 - Add an argument to the scraper that skips the stage of connecting to the Zuul server; it will be
   used for storing test/traceback points directly from the Zuul jobs

Co-authored-by: lpiwowar <lpiwowar@users.noreply.github.com>
… recreated

Qdrant snapshots are stored separately from the collections and are therefore preserved
when the collection is deleted, enabling recovery if needed.
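
A minimal sketch of the snapshot-before-recreate flow the commit message describes, assuming the qdrant-client Python API; the collection name and vector parameters are placeholders:

```python
# Sketch: snapshot a collection before deleting/recreating it, so data can be recovered.
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, VectorParams

client = QdrantClient(url="http://localhost:6333")  # assumed local instance
collection = "tempest_failures"                     # assumed collection name

# Snapshots are stored outside the collection, so they survive its deletion.
snapshot = client.create_snapshot(collection_name=collection)
print(f"Snapshot created: {snapshot.name}")

client.delete_collection(collection_name=collection)
client.create_collection(
    collection_name=collection,
    vectors_config=VectorParams(size=384, distance=Distance.COSINE),  # assumed params
)
# If the rebuilt collection needs to be rolled back, it can be restored from
# the snapshot taken above (e.g. with client.recover_snapshot()).
```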
@dsariel dsariel force-pushed the tempest_scraper branch from 6988a42 to 44dbd6b on May 8, 2025 11:13
return tempest_failures


# # pylint: disable=too-few-public-methods

Why is this code commented out?

@EmilienM EmilienM left a comment

LGTM for now, one minor comment but we can iterate later.

@EmilienM EmilienM commented May 8, 2025

> Apologies, I did not manage to review this entirely.
>
> I was able to run it locally and populate the vector database, though 🎉 (I limited the scraping to only 5 jobs), and the generated data looked reasonable! :)
>
> I would need a bit more time to merge this with a clear conscience (to investigate the parsing and the rest of the code a bit more carefully). But if somebody wants to merge it, especially since it is a bit urgent, I won't object; we can always refactor and polish later.

I agree. It's not perfect now, but we'll iterate, as we have other stuff to merge.

@EmilienM EmilienM merged commit 908ff34 into main May 8, 2025
3 checks passed
@EmilienM EmilienM deleted the tempest_scraper branch May 8, 2025 12:43