-
-
Notifications
You must be signed in to change notification settings - Fork 0
Open
Labels
Description
Summary
Implement streaming execution backend for PSDL using Apache Flink (PyFlink) as proposed in RFC-0002.
RFC Reference
See rfcs/0002-streaming-execution.md for full specification.
Current Status: ✅ Phase 1 Complete
Phase 1: Core Streaming (v0.2.0) - COMPLETE
- Project structure (
src/psdl/runtimes/streaming/) - PSDL to Flink compiler infrastructure
- Operator compilation:
delta,slope,ema,last,min,max,count,sma - Configurable slide intervals
- KeyedStream by patient_id
- Logic join functions (CoProcessFunction)
- Basic streaming models and evaluator
- Unit tests (test_streaming.py - all passing)
Phase 2: Production Readiness (v0.3.x) - PLANNED
- Kafka source/sink connectors
- FHIR subscription source
- Trigger/CEP compilation
- State TTL management
- Checkpointing configuration
- Monitoring/metrics
- Kubernetes deployment
- Documentation
Phase 3: Advanced Features (v0.4.x) - FUTURE
- Savepoint management
- Schema evolution support
- Multi-patient logic (outbreak detection)
- Performance benchmarks
Key Design Decisions
- Runtime: PyFlink (consistency with Python codebase)
- Slide intervals: Configurable per operator
- Late data: Default
max_lateness: 5m - Architecture: Follows RFC-0003 modular structure
Related
- RFC-0003: Architecture Refactor (defines runtime structure)
- RFC-0005: PSDL v0.3 Architecture (current spec)