Feat: semantic insights generator structure #138
5424e66 to c6c7663
    test_dataset_with_strategy,
)

__all__ = ['problem', 'real_llm_with_spec', 'test_dataset_with_strategy']
are we sure we want to share ['problem', 'real_llm_with_spec', 'test_dataset_with_strategy'] between all generators?
No, we'll definitely want separate, non-conversation tabular data. I just didn't want to decide on it in this PR, so this is a placeholder for now.
Then define a placeholder ['problem', 'real_llm_with_spec', 'test_dataset_with_strategy'] in 'semantic_insights_generator/test_e2e.py' and remove this conftest.py.
generation_model: LLMWithSpec
repair_model: LLMWithSpec
seed: int | None = None
validator: FeatureValidator = field(factory=LawAndOrderValidator)
should be a list
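One possible reading of this suggestion, sketched with the standard library's `dataclasses` as a stand-in for the `field(factory=...)` style in the diff. Everything here is an assumption made for illustration: the `validators` field name, the `validate` method and its return type, the `GeneratorConfig` class, and the placeholder body of `LawAndOrderValidator` are not taken from the PR.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional, Protocol


class FeatureValidator(Protocol):
    """Hypothetical interface; the project's real FeatureValidator may differ."""

    def validate(self, sql: str) -> Optional[str]:
        """Return an error message, or None when the query is accepted."""
        ...


class LawAndOrderValidator:
    """Placeholder sharing the name of the default in the diff; real checks are not shown there."""

    def validate(self, sql: str) -> Optional[str]:
        return None if sql.strip() else "empty query"


def default_validators() -> list[FeatureValidator]:
    # A factory, so every config instance gets its own fresh list.
    return [LawAndOrderValidator()]


@dataclass(frozen=True)
class GeneratorConfig:
    """Hypothetical config holding a list of validators instead of a single one."""

    seed: Optional[int] = None
    validators: list[FeatureValidator] = field(default_factory=default_validators)


def first_error(config: GeneratorConfig, sql: str) -> Optional[str]:
    """Run the validators in order and return the first error, if any."""
    for validator in config.validators:
        error = validator.validate(sql)
        if error is not None:
            return error
    return None
```

With a list, additional validators can be appended per generator without changing the field's type, and the default remains a single `LawAndOrderValidator`.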
Create minimal SemanticInsightsGenerator module with validation integration:
- SemanticInsightsGenerator: Main orchestrator implementing FeatureGenerator
- BasicFeatureGenerator: Core generation logic with validate_and_retry loop
- LLMSqlFeatureCorrector: LLM-based SQL repair stub (TODO implementation)
- GeneratedFeatureSpec: Minimal schema for LLM-generated features
- Configurable validator and retry budgets with module defaults
- Uses official validation infrastructure (LawAndOrderValidator, validate_and_retry)
- Streaming feature generation (yields immediately after validation)
- Seed parameter for reproducible generation
- Created nested test folder: tests/agentune/analyze/feature/gen/semantic_insights_generator/
- test_e2e.py: Validates generator instantiation and API contract
- Expects empty results until LLM generation is implemented
- refactor: Move InsightfulTextGenerator tests into a nested folder structure, matching the SemanticInsightsGenerator test structure
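The commit messages above mention a validate-and-retry loop with a retry budget and streaming generation that yields each feature as soon as it validates. A minimal sketch of that general pattern follows. It is not the project's `validate_and_retry` implementation: the function names, the callable signatures, and the use of plain strings for candidates are all assumptions made for illustration.

```python
from __future__ import annotations

from typing import Callable, Iterable, Iterator, Optional

# Hypothetical signatures: a validator returns an error message or None,
# and a repair step maps (candidate, error) to a new candidate.
Validate = Callable[[str], Optional[str]]
Repair = Callable[[str, str], str]


def validate_and_retry_sketch(
    candidate: str,
    validate: Validate,
    repair: Repair,
    max_retries: int = 2,
) -> Optional[str]:
    """Return a validated candidate, or None once the retry budget is spent."""
    for attempt in range(max_retries + 1):
        error = validate(candidate)
        if error is None:
            return candidate
        if attempt < max_retries:
            # Ask the (stubbed) repair step for a corrected candidate.
            candidate = repair(candidate, error)
    return None


def generate_features(
    candidates: Iterable[str],
    validate: Validate,
    repair: Repair,
    max_retries: int = 2,
) -> Iterator[str]:
    """Yield each candidate as soon as it passes validation (streaming)."""
    for candidate in candidates:
        accepted = validate_and_retry_sketch(candidate, validate, repair, max_retries)
        if accepted is not None:
            yield accepted
```

Because `generate_features` is a generator, a caller can consume the first validated feature before later candidates have been processed; candidates that exhaust their retry budget are silently dropped in this sketch.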
bfdc6ce to 98f91f7
What does this PR do?
Sets up the SemanticInsightsGenerator module structure with validation framework integration. LLM generation logic is stubbed out and ready for implementation.
Changes
Related Issues
Closes SparkBeyond/ao-core#113