Pull Request Overview
This PR updates the project for October 2025, including dependency version updates, new functionality for saving heatmap data, and improvements to the neighborhood profile visualization workflow.
- Updated parent template version and various package dependencies in the conda environment
- Added capability to save heatmap CSV data alongside visualizations
- Enhanced neighborhood profile functionality with CSV export options and improved cluster comparison handling
Reviewed Changes
Copilot reviewed 12 out of 13 changed files in this pull request and generated no comments.
Summary per file:
| File | Description |
|---|---|
| templateConfig.json | Updated parent template version from 0.231.0 to 0.243.0 |
| .envs/maestro/meta.yaml | Comprehensive update of conda environment dependencies with specific version pinning |
| utils.py | Added filtering to exclude CSV files from heatmaps directory listing |
| time_cell_interaction_lib.py | Added functionality to save heatmap data as CSV files and character replacement in species names |
| pages2/Tool_parameter_selection.py | Added UI control for enabling heatmap data saving |
| pages2/Run_workflow.py | Integrated new save_heatmap_data parameter into workflow execution |
| pages2/Display_average_heatmaps.py | Added UI for creating zip file of saved heatmap CSV files |
| pages2/Neighborhood_Profiles.py | Enhanced neighborhood profile with CSV export capability and removed commented code |
| basic_phenotyper_lib.py | Modified draw_neigh_profile_fig to return density dataframe and added cluster labels |
| pages2/feature_creation.py | Changed dataframe display to show sampled data with resample button |
| pages2/multiaxial_gating.py | Converted columns index to list for selectbox options |
| .publish-dashboards.py | Added support for resourcesV2 configuration |
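The CSV filtering in utils.py and the species-name character replacement in time_cell_interaction_lib.py described in the table might look roughly like the following sketch. The function names and the exact set of replaced characters are illustrative assumptions, not the project's actual code:

```python
# Hypothetical sketch: list heatmap images while skipping the CSV data files
# now saved alongside them, and sanitize species names for use in filenames.
import os

def list_heatmap_images(directory):
    """Return heatmap files in `directory`, excluding the new .csv data files."""
    return sorted(f for f in os.listdir(directory) if not f.endswith('.csv'))

def safe_species_name(species):
    """Replace characters that are awkward in filenames (assumed character set)."""
    for ch in (' ', '/', '+'):
        species = species.replace(ch, '_')
    return species
```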
…e already addressed statements.
… adding the app_a_app_db database to local Docker postgresql.
…reamlit creations.
…. Untested but should come close to working if not working out of the box.
…ropriate per the Lucidchart diagram.
…ce; add fields to tables for consistency and correctness.
…hing of workers with arbitrary compute resources.
…needed when pressing the shutdown button; complete ideal, RBAC, and dynamic snowflake_orchestrator.py; start comprehensive README for Snowflake deployment.
…nd operate on them. Start minimally modifying the launcher.
…code for the new organization scheme.
…data manager service.
…eedback for large datasets; cache the two value counts functionalities in phenotype.py.
…ms to work locally with a small dataset, but not on Snowflake with a large one. That might be because I didn't call .collect(), which I'm doing now.
…en BBBB and CCCC when operating on large images nearing 1M cells each.
… (numpy is ~17% faster for 400k cells, 75s to 62s). Confirmed that changing to cKDTree doesn't make a difference, but parametrized that anyway.
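For context on the timing comparison in this commit, here is a purely illustrative numpy-vectorized neighbor count. This is not the project's implementation, which handles 400k+ cells and would need chunking or a spatial tree rather than an O(n²) brute force:

```python
# Illustrative only: count, for each point, the other points within `radius`
# using a fully vectorized numpy distance matrix (brute force, O(n^2) memory).
import numpy as np

def count_neighbors_within(coords, radius):
    """coords: (n, 2) array; returns an (n,) array of neighbor counts."""
    diffs = coords[:, None, :] - coords[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    within = dists <= radius
    np.fill_diagonal(within, False)  # a point is not its own neighbor
    return within.sum(axis=1)
```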
…idental leave-in of old code that populated df_counts_holder twice.
…ure population slows down, resulting in a break-even. So, for now, not using neighbors chunking.
… results, even when some phenotypes are missing. Didn't port to the version of fast counts in fnp_main, but about to. So we still get around a 17% speedup with no change in accuracy.
…we can pull into full-stack branch.
…borhood_profiles workflow (changed five files total) since test showed for one of those statements that this prevented a container crash.
…t actually reduce size of final figure but that's somewhat expected.
…y page prior to realizing that species phenotyping causes plotly redraws and investigating that.
…e importantly, write the range index to the in-memory cells dataframe prior to writing it so that we can later scan it using streaming and be able to trust that the added index sumap_cell_index is always consistent. Pick up with going through assign_neighborhood_types.py and rest of HP workflow to assess usage of indices therein.
…et files to disk (two instances total) so that subsequent reads of the intermediate files don't need .with_row_index() after scanning, which could lead to different row indices each time a scan is done if the streaming engine is used. Also, clean up all files, particularly names and usages of indices in order to make things more clear.
…that will change streaming to in-memory.
…edrawing due to streaming; sort by an index so plotly sees the same figure between reruns.
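A minimal sketch of the sort-before-plot fix, assuming a pandas frame and an illustrative index column name: rows streamed from disk may arrive in a different order on each rerun, so sorting by a stable index means plotly receives an identical figure spec and does not redraw.

```python
# Illustrative: normalize row order before handing the frame to plotly,
# so reruns of the Streamlit page produce byte-identical figure inputs.
import pandas as pd

def stable_for_plotting(df: pd.DataFrame,
                        index_col: str = "cell_index") -> pd.DataFrame:
    return df.sort_values(index_col, kind="stable").reset_index(drop=True)
```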
…ithful_columns) a frame that only has one unique input_index per row: the unified lazyframe, as opposed to lf_phenotyped which in the marker case could have duplicated input_indices. Just in case I forget, add a guard for uniqueness on that RHS in the plotting function itself right at the join. Also add a note that the guard isn't in place for a pandas df lookup table, but hopefully we'd only ever use polars anyway.
…s when loading previously calculated results.
…lement more robust clearing of session state keys when jobs (via buttons) are run.
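The session-state clearing could be sketched as below, written against a plain dict so the example stands alone; in the app, the dict would be st.session_state and the prefixes would name each job's keys (both are assumptions here):

```python
# Illustrative: when a job button is pressed, drop every session-state key
# belonging to that job so stale results cannot leak into the new run.
def clear_keys_with_prefixes(state: dict, prefixes: tuple) -> None:
    for key in [k for k in state if k.startswith(prefixes)]:
        del state[key]
```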
…ession state rather than in the cache.
…s of the workflow (load_unified_input_file.py and phenotype.py), and set the exported figure name to a format similar to what Robert wanted.
…back in the app more apparent (loading of an archive and submission of a job).
…6e5d1015534684691159b2c of git@github.com:CBIIT/snowflake-knowledge-hub.git that allows Snowflake connections the same way whether from SPCS or from a local machine.
…t downstream results will be overwritten.
This pull request introduces several enhancements and improvements across the environment configuration and multiple Streamlit application pages, focusing on environment reproducibility, data export functionality, and usability for cluster/heatmap analysis. The most important changes are summarized below.
Environment and Dependency Management:
- Overhauled .envs/maestro/meta.yaml to update, pin, and reorganize dependencies for improved reproducibility and compatibility, including explicit versioning for key packages and consolidating requirements between conda and pip.

Neighborhood Profiles Export and Usability:
- Enhanced the neighborhood profiles workflow in pages2/Neighborhood_Profiles.py and basic_phenotyper_lib.py to allow users to export cluster comparison results as CSV files directly to the output folder, both for individual and batch (subplot) analyses. This includes capturing the computed dataframes, managing file suffixes, and providing UI buttons for saving results. [1] [2] [3] [4] [5] [6] [7] [8] [9]
- Improved cluster list management to dynamically include "Average Left" and "Average Right" when cluster difference toggles are enabled, ensuring the correct options are available throughout the workflow. [1] [2]
Average Heatmap Data Export:
- Added a UI control in pages2/Tool_parameter_selection.py for users to enable saving of slide-averaged heatmap data, and propagated this setting through the workflow. The pages2/Display_average_heatmaps.py page now provides a button to zip and export all relevant heatmap data CSVs, improving downstream data accessibility. [1] [2] [3] [4] [5]

Miscellaneous Improvements:
- Added support for the resourcesV2 configuration in .publish-dashboards.py.
- Changed the dataframe display in pages2/feature_creation.py to show sampled data with a resample button.
- Removed commented-out code from pages2/Neighborhood_Profiles.py.

These changes collectively improve the reproducibility, usability, and data export capabilities of the application, making it easier for users to manage, analyze, and extract results from their workflows.
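The zip-and-export button for heatmap CSVs mentioned above could be backed by something like the following sketch; the function name and paths are illustrative, not the project's actual code:

```python
# Hypothetical sketch: bundle the CSV data files saved alongside the heatmaps
# into a single archive that the Streamlit page can then offer for download.
import os
import zipfile

def zip_heatmap_csvs(heatmaps_dir: str, zip_path: str) -> int:
    """Write every .csv in heatmaps_dir into zip_path; return the file count."""
    csv_files = sorted(f for f in os.listdir(heatmaps_dir) if f.endswith('.csv'))
    with zipfile.ZipFile(zip_path, 'w', zipfile.ZIP_DEFLATED) as zf:
        for name in csv_files:
            zf.write(os.path.join(heatmaps_dir, name), arcname=name)
    return len(csv_files)
```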