Skip to content

Conversation

@tristan-f-r
Copy link
Collaborator

@tristan-f-r tristan-f-r commented Oct 10, 2025

We also make graph statistics lazy. Laziness isn't used in summary.py, but I assume that we'll have more computationally expensive graph statistics as SPRAS develops, especially when it can take long to compute for our larger graphs, so this also splits up statistic generation into different rules.

Most importantly, this allows us to re-use statistics by consuming specific statistics as input files.

  • Depends on feat!: SPRAS revision #320 for the summary statistics test (integration testing over unit testing is now required since the heavy workflow lifting is done by Snakemake).

@tristan-f-r tristan-f-r added tuning Workflow-spanning algorithm tuning refactor Changes that don't actually improve anything except for code quality. labels Oct 10, 2025
@read-the-docs-community
Copy link

read-the-docs-community bot commented Oct 10, 2025

Documentation build overview

📚 spras | 🛠️ Build #31218776 | 📁 Comparing 339d915 against latest (18f2cf8)


🔍 Preview build

Show files changed (4 files in total): 📝 4 modified | ➕ 0 added | ➖ 0 deleted
File Status
genindex.html 📝 modified
fordevs/spras.analysis.html 📝 modified
fordevs/spras.config.html 📝 modified
fordevs/spras.html 📝 modified

@agitter
Copy link
Collaborator

agitter commented Nov 7, 2025

Before I can review the implementation of the change, I need to better understand what problem we are tying to solve with the change. Where will laziness be needed in the future?

we can reuse the code for graph heuristic pruning

Do we envision calling graph statistic computation twice per graph? After we compute these statistics on a graph once, shouldn't that be sufficient for an entire pass of a workflow?

@tristan-f-r
Copy link
Collaborator Author

I was going to ask @ntalluri about this, since I wasn't quite sure if we will have expensive graph heuristics or not.

Do we envision calling graph statistic computation twice per graph? After we compute these statistics on a graph once, shouldn't that be sufficient for an entire pass of a workflow?

I did decouple this from analysis: summary: enabled: true, and I imagined it like this. I didn't think about that, though: would it make sense to have graph summary statistics always enabled the moment any heuristics are enabled?

@agitter
Copy link
Collaborator

agitter commented Nov 8, 2025

There could be more than one way to design this sensibly. One would be that if heuristics are enabled in the config file, that automatically generates the graph summary table. The produces more output than requested, which is slightly undesirable.

Another could be to move the heuristic calculations inside each --parameters> subdirectory, which may be where you are headed. If that is written as a file for that one pathway, it could be consumed for heuristics (or used for heuristics and then written to disk). Later, if the graph summary table is requested, it would grab the precomputed statistics from those files in the subdirectories.

@tristan-f-r
Copy link
Collaborator Author

I'll mark this as a draft for now and design something in line with your second proposal.

@tristan-f-r tristan-f-r marked this pull request as draft November 8, 2025 08:06
@ntalluri ntalluri removed the P-medium medium prirotity; this is needed for some external service or another PR label Nov 19, 2025
@tristan-f-r tristan-f-r added the P-high This is a blocker for many PRs/issues/features label Jan 13, 2026
@tristan-f-r tristan-f-r marked this pull request as ready for review January 13, 2026 20:13
@tristan-f-r tristan-f-r added blocked-by-other-pr P-medium medium prirotity; this is needed for some external service or another PR and removed P-high This is a blocker for many PRs/issues/features labels Jan 13, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

blocked-by-other-pr P-medium medium prirotity; this is needed for some external service or another PR refactor Changes that don't actually improve anything except for code quality. tuning Workflow-spanning algorithm tuning

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants