Summary of Changes
Hello @julenmendieta, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request significantly overhauls the project's technical foundation by modernizing its build pipeline and dependency management, leading to a more robust and maintainable codebase. Concurrently, it delivers substantial performance gains in data processing for dimensionality reduction tasks and introduces a key feature for biological analysis, all while enhancing the reproducibility of results.
Code Review
This pull request introduces significant performance and memory efficiency improvements, primarily by refactoring the Python data processing scripts to use polars and a more efficient method for creating sparse matrices. A new feature for selecting Highly Variable Genes (HVG) has been added across the stack (backend, model, UI), which further enhances performance and analytical focus. The build system and dependencies have also been substantially updated for better maintainability. My review focuses on ensuring the new logic is robust and consistent. I've identified a couple of minor issues: a misleading log message and an opportunity for further memory optimization by using a consistent data type. I also noted a potential UI improvement for a more robust user experience. Overall, this is a high-quality contribution that significantly improves the block.
software/src/batch_correction.py
Outdated
The sparse matrix is being created with dtype=np.float64. While this provides higher precision, scanpy and many single-cell analysis tools typically operate on np.float32 to conserve memory, which is particularly important for large datasets. The accompanying batch_correction.py script correctly uses np.float32. For consistency and to align with the performance goals of this PR, it would be better to use np.float32 here as well.
Suggested change:
- dtype=np.float64
+ dtype=np.float32
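To make the trade-off behind this suggestion concrete, here is a minimal sketch of what halving the value width buys and costs. It is not code from the PR: the values are hypothetical, and it uses Python's standard-library `array` module (type codes `"d"` and `"f"`, the IEEE 754 double and single formats that `np.float64` and `np.float32` also use) so that it runs without NumPy or SciPy installed.

```python
from array import array

# Hypothetical non-zero entries of a count matrix (illustrative values only).
nonzero_counts = [1.0, 3.0, 250.0, 16_777_216.0]  # 16_777_216 == 2**24

as_double = array("d", nonzero_counts)  # 8 bytes per value, like np.float64
as_single = array("f", nonzero_counts)  # 4 bytes per value, like np.float32

bytes_double = as_double.itemsize * len(as_double)
bytes_single = as_single.itemsize * len(as_single)
print(bytes_double, bytes_single)  # 32 16

# Whole numbers up to 2**24 survive the round trip through single precision.
round_trip_exact = list(as_single) == nonzero_counts

# Above 2**24, single precision can no longer represent every integer.
loses_precision = array("f", [16_777_217.0])[0] != 16_777_217.0
print(round_trip_exact, loses_precision)  # True True
```

The stored values of a sparse matrix shrink by half, while the index arrays are unaffected. Whether the reduced precision matters depends on what the matrix holds: raw counts below 2**24 are represented exactly, whereas normalized or log-transformed values are rounded to roughly seven significant digits.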
<PlNumberField
  v-if="app.model.args.hvgEnabled"
  v-model="app.model.args.hvgCount"
  label="Number of HVG"
  :min-value="app.model.args.nPCs + 1"
  :step="100"
/>
The min-value for the "Number of HVG" field is correctly bound to app.model.args.nPCs + 1. However, this only enforces the minimum at the UI level. If a user sets hvgCount and then increases nPCs to a value equal to or greater than hvgCount, the stored hvgCount no longer satisfies hvgCount > nPCs, and nothing adjusts it automatically. This could lead to downstream errors.
For a more robust user experience, consider adding a watcher on app.model.args.nPCs to automatically adjust hvgCount if it becomes invalid.
import { watch } from 'vue';

// Keep hvgCount above nPCs whenever nPCs changes (mirrors the field's min-value of nPCs + 1).
watch(() => app.model.args.nPCs, (newNPCs) => {
  if (app.model.args.hvgEnabled && app.model.args.hvgCount <= newNPCs) {
    app.model.args.hvgCount = newNPCs + 1;
  }
});