update dependencies; debug #62

Merged
poornimaramesh merged 5 commits into main from move_date_creation_toa_separate_preprocessing_step
Feb 16, 2026

@poornimaramesh
Collaborator

Changes

  • Pin pandas and pyspark dependencies; debug the package after these changes
  • Move the creation of a "day" column to a separate dependency function in featurizer; debug after this change
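
To illustrate the second change, here is a minimal sketch of deriving a "day" column in one dedicated preprocessing step, so that downstream functions can simply expect the column to exist. It is a pandas analogue only. The function name `add_day_column_sketch`, the `timestamp` and `caller_id` column names, and the use of pandas (the featurizer may operate on Spark DataFrames) are assumptions, not the code merged in this PR.

```python
import pandas as pd


def add_day_column_sketch(df: pd.DataFrame, timestamp_col: str = "timestamp") -> pd.DataFrame:
    """Return a copy of df with a "day" column (timestamp truncated to midnight).

    Hypothetical helper for illustration; not the PR's add_day_column().
    """
    out = df.copy()  # leave the caller's frame untouched
    out["day"] = pd.to_datetime(out[timestamp_col]).dt.normalize()
    return out


# Made-up call records; column names are assumptions.
calls = pd.DataFrame(
    {
        "caller_id": ["a", "a", "b"],
        "timestamp": [
            "2026-02-15 23:59:00",
            "2026-02-16 00:01:00",
            "2026-02-16 12:30:00",
        ],
    }
)
calls_with_day = add_day_column_sketch(calls)
```

Running the step once up front means later feature functions can validate for the presence of "day" instead of each recomputing it.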

How has this been tested?

  • Run tests and check that they all pass
  • Run featurizer.ipynb

Checklist

Fill with x for completed.

  • I have run pre-commit hooks locally
  • I have resolved merge conflicts
  • I have updated the automated tests (if applicable)
  • I have updated the requirements (if applicable)
  • I have updated affected documentation (if applicable)

Copilot AI review requested due to automatic review settings February 16, 2026 15:38
Copilot AI left a comment

Pull request overview

This PR pins the pandas and pyspark dependencies to specific versions (pandas 2.1.4, pyspark 3.5.8) and moves creation of the "day" column into a separate helper function to improve maintainability and consistency. It also includes compatibility updates for the older pandas version and lowers the Python requirement from 3.13 to 3.10-3.11.

Changes:

  • Pin pandas to 2.1.4 and pyspark to 3.5.8, downgrade Python requirement to 3.10-3.11, and update Java requirement to version 11
  • Refactor day column creation into a new add_day_column() helper function and introduce CallDataRecordDataWithDay schema
  • Update pandas groupby operations to use group_keys=False instead of include_groups=False for compatibility with pandas 2.1.4
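
Some context on the last bullet: `include_groups` is a keyword of `DataFrameGroupBy.apply` that was added in pandas 2.2, so pandas 2.1.4 does not accept it, while `group_keys` is a long-standing argument of `groupby` itself. The two are not interchangeable in general: `group_keys=False` only keeps the group labels out of the result index. The sketch below shows that effect on made-up data. The names `caller_id`, `duration`, and `longest_call` are hypothetical and do not come from the PR.

```python
import pandas as pd

# Made-up call records; column names are assumptions.
calls = pd.DataFrame(
    {
        "caller_id": ["a", "a", "b"],
        "duration": [10, 30, 5],
    }
)


def longest_call(group: pd.DataFrame) -> pd.DataFrame:
    """Keep the single longest call within one group."""
    return group.nlargest(1, "duration")


# Selecting the value columns explicitly means the applied function never
# sees the grouping column, which sidesteps the need for include_groups.
flat = calls.groupby("caller_id", group_keys=False)[["duration"]].apply(longest_call)

# With group_keys=True the group label is prepended to the result index.
keyed = calls.groupby("caller_id", group_keys=True)[["duration"]].apply(longest_call)
```

Here `flat` keeps the original row labels as its index, whereas `keyed` carries a two-level index of group label and original row label.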

Reviewed changes

Copilot reviewed 11 out of 12 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| pyproject.toml | Pin pandas and pyspark versions, downgrade Python requirement to 3.10-3.11, remove some type stubs |
| Makefile | Update Java version requirement from 17 to 11 |
| notebooks/featurizer.ipynb | Update Java path and Python version, add calls to add_day_column() for data preprocessing |
| notebooks/demo_pipeline.ipynb | Update Python version to 3.10.19 |
| src/cider/featurizer/schemas.py | Add CallDataRecordDataWithDay schema and update CallDataRecordTagged to inherit from it |
| src/cider/featurizer/dependencies.py | Add add_day_column() function, update functions to expect data with day column, fix condition initialization to use lit(True), update pandas groupby operations |
| src/cider/featurizer/core.py | Update preprocess_data to call add_day_column() |
| src/cider/validation_metrics/dependencies.py | Update pandas groupby to use group_keys=False for compatibility |
| src/cider/validation_metrics/core.py | Update pandas groupby to use group_keys=False for compatibility |
| tests/test_featurizer.py | Add test for add_day_column(), update tests to call add_day_column() before processing, update error messages and validation |
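
The `lit(True)` fix listed for src/cider/featurizer/dependencies.py reads like the common pattern of seeding an accumulated filter with an always-true condition, so that further conditions can be AND-ed onto it and an empty set of filters keeps every row. The sketch below shows the same idea in pandas rather than Spark so that it runs without a Spark session. `build_mask`, the column names, and the data are all hypothetical, and this is not the code from the PR.

```python
import pandas as pd


def build_mask(df: pd.DataFrame, filters: dict) -> pd.Series:
    """AND together one membership test per filtered column.

    Hypothetical helper: the all-True seed plays the role that lit(True)
    plays when building a Spark filter expression.
    """
    mask = pd.Series(True, index=df.index)  # neutral element for AND
    for column, allowed in filters.items():
        mask &= df[column].isin(allowed)
    return mask


# Made-up records; column names are assumptions.
records = pd.DataFrame(
    {
        "direction": ["in", "out", "out"],
        "kind": ["call", "text", "call"],
    }
)
no_filters = build_mask(records, {})
outgoing_calls = build_mask(records, {"direction": ["out"], "kind": ["call"]})
```

Starting from a real boolean object rather than a placeholder such as `None` keeps the loop free of special cases for the first condition.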

poornimaramesh and others added 3 commits February 16, 2026 21:14
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@poornimaramesh merged commit 416f74a into main on Feb 16, 2026
1 check passed
@poornimaramesh deleted the move_date_creation_toa_separate_preprocessing_step branch on February 16, 2026 15:54