
Fixes for out of memory error #66

Merged: poornimaramesh merged 4 commits into main from vw/explore on Feb 19, 2026

@vivwqy (Collaborator) commented Feb 19, 2026

Encountered an out-of-memory error when running the demo_pipeline notebook.

Made 2 main changes to featurizer.core:

  • In get_caller_counts_per_region, the pivot became heavy, so I added a line that gets the distinct regions before running the pivot.
  • In featurize_cdr_data, the left joins became heavy, so I call persist() before the left joins.
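
The two changes above can be sketched roughly as follows. This is a minimal illustration and not the code from featurizer.core: the function names, the column names `caller_id` and `region`, and the shape of the inputs are all assumptions made for the example. It relies only on the standard PySpark DataFrame methods `select`, `distinct`, `collect`, `groupBy`, `pivot`, `persist`, and `join`.

```python
from typing import TYPE_CHECKING, List

if TYPE_CHECKING:  # pyspark is only needed for the type hints in this sketch
    from pyspark.sql import DataFrame


def count_calls_per_region(calls: "DataFrame") -> "DataFrame":
    """Pivot call counts per caller using an explicit list of regions.

    Hypothetical schema: columns `caller_id` and `region` (assumed non-null).
    """
    # Collect the distinct pivot values once, up front. When the values are
    # passed to pivot() explicitly, Spark does not have to work them out
    # itself as part of the pivot.
    regions = sorted(
        row["region"] for row in calls.select("region").distinct().collect()
    )
    return calls.groupBy("caller_id").pivot("region", regions).count()


def join_features(base: "DataFrame", features: "List[DataFrame]") -> "DataFrame":
    """Left-join feature tables onto `base` on the assumed key `caller_id`."""
    # Mark each feature table for caching so that, once computed, it can be
    # reused instead of being recomputed from its full lineage.
    persisted = [df.persist() for df in features]
    result = base
    for df in persisted:
        result = result.join(df, on="caller_id", how="left")
    return result
```

Note that persist() is lazy: nothing is cached until an action runs on the joined result. Once the output has been written or collected, calling unpersist() on the feature tables frees the cached data again.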

Note: I still get lots of "WARN DAGScheduler: Broadcasting large task binary" warnings even after these changes, but for now the function runs and produces the desired feature outputs.


Copilot AI left a comment


Pull request overview

This PR addresses out-of-memory errors encountered when running the demo_pipeline notebook by making two memory optimizations to the featurizer. It also fixes a syntax error in plotting.py.

Changes:

  • Fixed syntax error in plotting.py by changing nested double quotes to single quotes in f-strings
  • Optimized pivot operation in get_caller_counts_per_region by pre-computing distinct regions list
  • Added persistence to feature DataFrames before joining in featurize_cdr_data to avoid recomputation during multiple joins
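
The quoting fix in the first bullet can be illustrated with a small standalone example. The dictionary and its keys below are hypothetical and are not taken from plotting.py. Before Python 3.12, reusing the enclosing quote character inside an f-string replacement field is a syntax error; Python 3.12 and later accept it.

```python
# Hypothetical data, used only to illustrate the quoting rule.
record = {"metric": "rmse", "value": 0.42}

# On Python versions before 3.12 this line is a SyntaxError, because the
# inner double quotes end the f-string early:
#     label = f"{record["metric"]}: {record["value"]}"

# Using single quotes for the dictionary keys works on every Python 3 version
# that supports f-strings.
label = f"{record['metric']}: {record['value']}"
```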

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

Files changed:

  • src/cider/validation_metrics/plotting.py: Fixed a syntax error where double quotes were nested inside f-strings; changed to single quotes for dictionary key access.
  • src/cider/featurizer/core.py: Added memory optimizations: a pre-computed regions list for the pivot, and persisted feature DataFrames before joins to prevent recomputation.


@poornimaramesh (Collaborator) left a comment


Looks good, and works. Approving. Thanks for the quick fix!

@poornimaramesh poornimaramesh merged commit c990559 into main Feb 19, 2026
@poornimaramesh poornimaramesh deleted the vw/explore branch February 19, 2026 13:22
@poornimaramesh poornimaramesh restored the vw/explore branch February 19, 2026 13:22
2 participants

Comments