Develop update for Oct 2025 #169

Open
djsmith17 wants to merge 343 commits into main from develop

@djsmith17 djsmith17 commented Oct 21, 2025

This pull request updates the environment configuration and several Streamlit application pages, focusing on environment reproducibility, data export functionality, and usability for cluster/heatmap analysis. The most important changes are summarized below.

Environment and Dependency Management:

  • Refactored .envs/maestro/meta.yaml to update, pin, and reorganize dependencies for improved reproducibility and compatibility, including explicit versioning for key packages and consolidating requirements between conda and pip.

Neighborhood Profiles Export and Usability:

  • Enhanced the neighborhood profiles workflow in pages2/Neighborhood_Profiles.py and basic_phenotyper_lib.py to allow users to export cluster comparison results as CSV files directly to the output folder, both for individual and batch (subplot) analyses. This includes capturing the computed dataframes, managing file suffixes, and providing UI buttons for saving results.

  • Improved cluster list management to dynamically include "Average Left" and "Average Right" when cluster difference toggles are enabled, ensuring correct options are available throughout the workflow.
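
For reviewers who want a concrete picture of the CSV export described above, here is a minimal sketch of suffix handling and writing to the output folder. It is not the code in this PR: `build_export_path`, `export_dataframe`, the `output` directory, and the `neighborhood_profiles` base name are all hypothetical names chosen for illustration, and the suffix sanitization rule is an assumption.

```python
from pathlib import Path
import re


def build_export_path(output_dir, base_name, suffix=""):
    """Return output_dir/base_name[_suffix].csv (hypothetical helper)."""
    # Assumed rule: collapse anything other than letters, digits, '-' and '_'
    # into a single underscore so labels like "Average Left" are filename-safe.
    clean_suffix = re.sub(r"[^A-Za-z0-9_-]+", "_", suffix).strip("_")
    stem = f"{base_name}_{clean_suffix}" if clean_suffix else base_name
    return Path(output_dir) / f"{stem}.csv"


def export_dataframe(df, output_dir, base_name, suffix=""):
    """Write df to CSV and return the path (hypothetical helper).

    df is any object with a pandas-style to_csv(path, index=...) method.
    """
    path = build_export_path(output_dir, base_name, suffix)
    path.parent.mkdir(parents=True, exist_ok=True)
    df.to_csv(path, index=False)
    return path
```

In a Streamlit page, a call such as `export_dataframe(df, "output", "neighborhood_profiles", suffix)` would typically run only when an `st.button` returns True, so the file is written on an explicit user action rather than on every rerun.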

Average Heatmap Data Export:

  • Added an option in pages2/Tool_parameter_selection.py for users to enable saving of slide-averaged heatmap data, and propagated this setting through the workflow. The pages2/Display_average_heatmaps.py page now provides a button to zip and export all relevant heatmap data CSVs, improving downstream data accessibility.
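
The zip-and-export step could look roughly like the following sketch, which uses only the standard library. The function name `zip_heatmap_csvs` and its parameters are hypothetical, and the assumption that every `.csv` under the heatmap directory belongs in the archive may not match the actual selection logic in pages2/Display_average_heatmaps.py.

```python
from pathlib import Path
import zipfile


def zip_heatmap_csvs(heatmap_dir, zip_path):
    """Bundle every .csv under heatmap_dir into zip_path (hypothetical helper).

    Returns the archive member names in the order they were written.
    """
    heatmap_dir = Path(heatmap_dir)
    # Collect the files first so the listing is fixed before the archive is created.
    csv_files = sorted(heatmap_dir.rglob("*.csv"))
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for csv_file in csv_files:
            # Store paths relative to heatmap_dir so no absolute paths leak into the zip.
            archive.write(csv_file, arcname=csv_file.relative_to(heatmap_dir).as_posix())
    return [f.relative_to(heatmap_dir).as_posix() for f in csv_files]
```

Non-CSV files such as the heatmap images are left out of the archive, which keeps the download limited to the tabular data.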

Miscellaneous Improvements:

  • Updated dashboard publishing logic to support and enforce the use of resourcesV2 in .publish-dashboards.py.
  • Minor code cleanup and import fixes, such as adding a missing import in pages2/feature_creation.py.
  • Removed obsolete or commented-out UI code to streamline the interface in pages2/Neighborhood_Profiles.py.

These changes collectively improve the reproducibility, usability, and data export capabilities of the application, making it easier for users to manage, analyze, and extract results from their workflows.

@djsmith17 djsmith17 requested a review from Copilot October 21, 2025 19:40
@djsmith17 djsmith17 self-assigned this Oct 21, 2025

Copilot AI left a comment

Pull Request Overview

This PR updates the project for October 2025, including dependency version updates, new functionality for saving heatmap data, and improvements to the neighborhood profile visualization workflow.

  • Updated parent template version and various package dependencies in the conda environment
  • Added capability to save heatmap CSV data alongside visualizations
  • Enhanced neighborhood profile functionality with CSV export options and improved cluster comparison handling

Reviewed Changes

Copilot reviewed 12 out of 13 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| templateConfig.json | Updated parent template version from 0.231.0 to 0.243.0 |
| .envs/maestro/meta.yaml | Comprehensive update of conda environment dependencies with specific version pinning |
| utils.py | Added filtering to exclude CSV files from heatmaps directory listing |
| time_cell_interaction_lib.py | Added functionality to save heatmap data as CSV files and character replacement in species names |
| pages2/Tool_parameter_selection.py | Added UI control for enabling heatmap data saving |
| pages2/Run_workflow.py | Integrated new save_heatmap_data parameter into workflow execution |
| pages2/Display_average_heatmaps.py | Added UI for creating zip file of saved heatmap CSV files |
| pages2/Neighborhood_Profiles.py | Enhanced neighborhood profile with CSV export capability and removed commented code |
| basic_phenotyper_lib.py | Modified draw_neigh_profile_fig to return density dataframe and added cluster labels |
| pages2/feature_creation.py | Changed dataframe display to show sampled data with resample button |
| pages2/multiaxial_gating.py | Converted columns index to list for selectbox options |
| .publish-dashboards.py | Added support for resourcesV2 configuration |
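
As a rough illustration of the utils.py change summarized above (excluding CSV files from the heatmaps directory listing), a filter along these lines would keep the newly saved data files from showing up alongside the images. The function name `list_heatmap_files` is hypothetical, and the case-insensitive extension check is an assumption rather than something the PR states.

```python
from pathlib import Path


def list_heatmap_files(heatmap_dir):
    """List file names in heatmap_dir, skipping saved CSV data (hypothetical helper)."""
    return sorted(
        p.name
        for p in Path(heatmap_dir).iterdir()
        # Assumed rule: anything ending in .csv (any letter case) is data, not a figure.
        if p.is_file() and p.suffix.lower() != ".csv"
    )
```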

… adding the app_a_app_db database to local Docker postgresql.
…. Untested but should come close to working if not working out of the box.
…ce; add fields to tables for consistency and correctness.
…hing of workers with arbitrary compute resources.
…needed when pressing the shutdown button; complete ideal, RBAC, and dynamic snowflake_orchestrator.py; start comprehensive README for Snowflake deployment.
…nd operate on them. Start minimally modifying the launcher.
…eedback for large datasets; cache the two value counts functionalities in phenotype.py.
…ms to work locally with small dataset, but not on Snowflake with a large one though. That might be because I didn't call .collect(), which I'm doing now.
…en BBBB and CCCC when operating on large images nearing 1M cells each.
… (numpy leads to 17% faster for 400k cells, 75s to 62s). Confirmed that changing to ckdtree doesn't make a difference, but parametrized that anyway.
…idental leave-in of old code that populated df_counts_holder twice.
…ure population slows down, resulting in a break even. So, for now, not using neighbors chunking.
… results, even when some phenotypes are missing. Didn't port to version of fast counts in fnp_main, but about to. So we still get a round 17% speedup with no change in accuracy.
…borhood_profiles workflow (changed five files total) since test showed for one of those statements that this prevented a container crash.
…t actually reduce size of final figure but that's somewhat expected.
…y page prior to realizing that species phenotyping causes plotly redraws and investigating that.
…e importantly, write the range index to the in-memory cells dataframe prior to writing it so that we can later scan it using streaming and be able to trust that the added index sumap_cell_index is always consistent. Pick up with going through assign_neighborhood_types.py and rest of HP workflow to assess usage of indices therein.
…et files to disk (two instances total) so that subsequent reads of the intermediate files don't need .with_row_index() after scanning, which could lead to different row indices each time a scan is done if the streaming engine is used. Also, clean up all files, particularly names and usages of indices in order to make things more clear.
…edrawing due to streaming; sort by an index so plotly sees the same figure between reruns.
…ithful_columns) a frame that only has one unique input_index per row: the unified lazyframe, as opposed to lf_phenotyped which in the marker case could have duplicated input_indices. Just in case I forget, add a guard for uniqueness on that RHS in the plotting function itself right at the join. Also add a note that the guard isn't in place for a pandas df lookup table, but hopefully we'd only ever use polars anyway.
…s when loading previously calculated results.
…lement more robust clearing of session state keys when jobs (via buttons) are run.
…s of the workflow (load_unified_input_file.py and phenotype.py) and set as the exported figure name the format similar to what Robert wanted.
…back in the app more apparent (loading of an archive and submission of a job).
…6e5d1015534684691159b2c of git@github.com:CBIIT/snowflake-knowledge-hub.git that allows Snowflake connections the same way whether from SPCS or from a local machine.

2 participants