Support explicit callable column dependencies in dataset schemas#29
Support explicit callable column dependencies in dataset schemas#29
Conversation
Pull Request Review: Support Explicit Callable Column DependenciesSummaryThis PR adds support for explicit dependency declarations for plain callable columns, allowing users to specify dependencies when automatic detection isn't possible. The implementation is well-designed and includes comprehensive test coverage. ✅ Strengths1. Clean API DesignThe PR provides two intuitive ways to declare dependencies:
Both approaches are well-documented in the README with clear examples. 2. Backwards CompatibilityThe changes maintain full backwards compatibility:
3. Comprehensive Test CoverageThe PR includes excellent test coverage:
4. Proper Dependency MergingThe implementation correctly merges explicit and implicit dependencies ( 🔍 Code Quality ObservationsPositive Aspects
Minor Suggestions1. Type Hint for
|
| Category | Rating | Notes |
|---|---|---|
| Code Quality | ⭐⭐⭐⭐⭐ | Clean, well-structured, follows existing patterns |
| Test Coverage | ⭐⭐⭐⭐☆ | Comprehensive, could add edge case tests |
| Documentation | ⭐⭐⭐⭐☆ | Good examples, could expand docstrings |
| Performance | ⭐⭐⭐⭐⭐ | No concerns, minimal overhead |
| Security | ⭐⭐⭐⭐⭐ | No issues identified |
| API Design | ⭐⭐⭐⭐⭐ | Intuitive, consistent, backwards compatible |
✅ Recommendation
APPROVE with minor suggestions
This is a solid implementation that adds valuable functionality without breaking existing code. The suggested improvements are minor and mostly preventive. The core implementation is sound and ready to merge.
Before Merging
- Consider adding validation warning for non-existent dependencies (optional)
- Add test for mixed explicit/implicit dependencies (optional but recommended)
- Verify all tests pass with
uv run pytest tests/test_dataset_comprehensive.py tests/test_datset.py
Nice to Have (Future)
- More detailed error messages for circular dependencies showing the cycle
- Type stubs or runtime validation for tuple syntax
- Performance benchmarks for large dependency graphs
Great work! This feature will make the library more flexible for complex use cases. 🎉
Pull Request Review - PR #29SummaryThis PR adds explicit dependency support for plain callable columns, providing three convenient ways to declare dependencies: Code Quality ✅Strengths
Code Quality Observations
Potential Issues
|
Pull Request Review: Support explicit callable column dependenciesSummaryThis PR adds support for explicit dependency declaration for plain callables in dataset schemas, complementing the existing automatic dependency detection from generator prompt templates. The implementation is well-designed and thoroughly tested. Code Quality: ✅ ExcellentStrengths:
Best Practices: ✅ Strong
Potential Issues:
|
Pull Request Review: Support explicit callable column dependencies in dataset schemasSummaryThis PR adds explicit dependency declaration support for plain callable columns, allowing users to specify dependencies via ✅ Strengths1. Clean API DesignThe
2. Backward Compatibility
3. Smart Signature InferenceThe def file_content(file_path): # Automatically infers dependency on file_path column
return load_chunk(file_path)4. Comprehensive Test Coverage130+ lines of new tests covering:
5. DocumentationREADME examples clearly demonstrate all three dependency patterns with practical use cases. 🔍 Code Quality Issues1. Type Hints Missing (Critical for codebase with mypy enabled)The PR introduces several new functions/methods without complete type hints, violating the dataset.py:194-196: @staticmethod
def _extract_explicit_dependencies(func: Any) -> List[str]:
"""Extract explicit dependency declarations from schema value."""
deps = [] # Type not specifieddataset.py:377-382: def call(
func: Callable[..., Any],
*dependencies: str,
requires: Optional[List[str]] = None,
with_: Optional[List[str]] = None,
**kwargs, # kwargs type should be Any
) -> DependentCallable:Recommendation: Add complete type hints to all new functions to pass mypy checks. 2. Incomplete
|
Summary
depends_on(...)(callable_fn, ["dep_col"])depends_onfrom package root and document usage in READMETesting