An Open Standard for Clinical Logic, Real-Time Monitoring & AI Integration
What SQL became for data queries, ONNX for ML models, and GraphQL for APIs —
PSDL is becoming the semantic layer for clinical AI.
📄 Read the Whitepaper: English · 简体中文 · Español · Français · 日本語
Clinical AI doesn't fail because models are weak.
It fails because decisions cannot be traced.
PSDL makes every clinical decision accountable:
| WHO wrote this logic? | WHY does it matter? | WHAT evidence supports it? |
|---|---|---|
| audit.intent | audit.rationale | audit.provenance |
The audit block is not optional: every scenario must declare it. This is what makes PSDL regulatory-ready (FDA, EU MDR).
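As a rough illustration of what "required" could mean in practice, the sketch below checks a parsed scenario for the three audit fields. The helper `missing_audit_fields` and the dict layout are assumptions made for this example, not part of the PSDL API; the reference implementation may validate differently.

```python
# Hypothetical helper: PSDL's real validator may differ.
REQUIRED_AUDIT_FIELDS = ("intent", "rationale", "provenance")


def missing_audit_fields(scenario: dict) -> list[str]:
    """Return the audit fields that are absent or blank in a parsed scenario."""
    audit = scenario.get("audit") or {}
    return [
        field
        for field in REQUIRED_AUDIT_FIELDS
        if not str(audit.get(field) or "").strip()
    ]


scenario = {
    "scenario": "AKI_Early_Detection",
    "audit": {
        "intent": "Detect early acute kidney injury using creatinine trends",
        "rationale": "Early AKI detection enables timely intervention",
    },
}
print(missing_audit_fields(scenario))  # ['provenance']
```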
Run PSDL in your browser with Google Colab - zero installation, real clinical data:
| Data | Description |
|---|---|
| Synthetic | Quick demo with generated patient data (2 min) |
| MIMIC-IV Demo | 100 real ICU patients, ICD diagnoses |
| PhysioNet Sepsis | 40,000+ patients with labeled sepsis |
Despite significant advances in clinical AI and machine learning, real-time decision support in healthcare remains fragmented, non-portable, non-reproducible, and exceptionally difficult to audit or regulate.
PSDL bridges the determination, audit, and portability gaps in clinical research
PSDL (Patient Scenario Definition Language) is a declarative, vendor-neutral language for expressing clinical scenarios. It provides a structured way to define:
| Component | Description |
|---|---|
| Signals | Time-series clinical data bindings (labs, vitals, etc.) |
| Trends | Temporal computations over signals (deltas, slopes, averages) |
| Logic | Boolean algebra combining trends into clinical states |
| Audit | Required traceability (intent, rationale, provenance) |
| Population | Criteria for which patients a scenario applies to |
| State | Optional state machine for tracking clinical progression |
Syntax vs Semantics vs Runtime - How PSDL Works
```yaml
# Detect early kidney injury (v0.3 syntax)
scenario: AKI_Early_Detection
version: "0.3.0"

audit:
  intent: "Detect early acute kidney injury using creatinine trends"
  rationale: "Early AKI detection enables timely intervention"
  provenance: "KDIGO Clinical Practice Guideline for AKI (2012)"

signals:
  Cr:
    ref: creatinine        # v0.3: 'ref' instead of 'source'
    concept_id: 3016723    # OMOP concept
    unit: mg/dL

trends:
  # v0.3: Trends produce numeric values only
  cr_delta_6h:
    expr: delta(Cr, 6h)
    description: "Creatinine change over 6 hours"
  cr_current:
    expr: last(Cr)
    description: "Current creatinine value"

logic:
  # v0.3: Comparisons belong in logic layer
  cr_rising:
    when: cr_delta_6h > 0.3
    description: "Rising creatinine (> 0.3 mg/dL in 6h)"
  cr_high:
    when: cr_current > 1.5
    description: "Elevated creatinine"
  aki_risk:
    when: cr_rising AND cr_high
    severity: high
    description: "Early AKI - rising and elevated creatinine"
```

| Challenge | Without PSDL | With PSDL |
|---|---|---|
| Portability | Logic tied to specific hospital systems | Same scenario runs anywhere with mapping |
| Auditability | Scattered across Python, SQL, configs | Single structured, version-controlled file |
| Reproducibility | Hidden state, implicit dependencies | Deterministic execution, explicit semantics |
| Regulatory Compliance | Manual documentation | Built-in audit primitives |
| Research Sharing | Cannot validate published scenarios | Portable, executable definitions |
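To make the trend/logic split in the AKI scenario concrete, here is a plain-Python sketch of what the scenario computes, written without the PSDL library. The functions `in_window`, `delta`, and `last` are illustrative stand-ins; in particular, treating `delta` as "last value minus first value in an inclusive window" is an assumption about the operator's semantics, and the timestamp is arbitrary.

```python
from datetime import datetime, timedelta

# (timestamp, value) creatinine observations in mg/dL, oldest first.
now = datetime(2024, 1, 1, 12, 0)
cr = [
    (now - timedelta(hours=6), 1.0),
    (now - timedelta(hours=3), 1.3),
    (now, 1.8),
]


def in_window(obs, window, at):
    """Observations with at - window <= timestamp <= at."""
    return [(t, v) for t, v in obs if at - window <= t <= at]


def delta(obs, window, at):
    """Assumed semantics: last value minus first value inside the window."""
    points = in_window(obs, window, at)
    return points[-1][1] - points[0][1] if len(points) >= 2 else None


def last(obs, at):
    """Most recent value at or before `at`."""
    values = [v for t, v in obs if t <= at]
    return values[-1] if values else None


# Trends produce numbers only.
cr_delta_6h = delta(cr, timedelta(hours=6), now)
cr_current = last(cr, now)

# Logic applies comparisons and boolean algebra to those numbers.
cr_rising = cr_delta_6h is not None and cr_delta_6h > 0.3
cr_high = cr_current is not None and cr_current > 1.5
aki_risk = cr_rising and cr_high

print(round(cr_delta_6h, 2), cr_current, aki_risk)  # 0.8 1.8 True
```

Keeping comparisons out of the trend layer means a trend value such as `cr_delta_6h` can be reported and audited on its own, independent of any threshold applied to it.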
```bash
# Install from PyPI
pip install psdl-lang

# With OMOP adapter support
pip install psdl-lang[omop]

# With FHIR adapter support
pip install psdl-lang[fhir]

# Full installation (all adapters)
pip install psdl-lang[full]
```

```python
from psdl.core import parse_scenario

# Use bundled scenarios (included with pip install)
from psdl.examples import get_scenario, list_scenarios

# List available scenarios
print(list_scenarios())  # ['aki_detection', 'sepsis_screening', ...]

# Load a bundled scenario
scenario = get_scenario("aki_detection")
print(f"Scenario: {scenario.name}")
print(f"Signals: {list(scenario.signals.keys())}")
print(f"Logic rules: {list(scenario.logic.keys())}")
```

```python
from psdl.examples import get_scenario
from psdl.runtimes.single import SinglePatientEvaluator, InMemoryBackend
from datetime import datetime, timedelta

# Load bundled scenario
scenario = get_scenario("aki_detection")

# Set up data backend
backend = InMemoryBackend()
now = datetime.now()

# Add patient data (using convenience method)
backend.add_observation(123, "Cr", 1.0, now - timedelta(hours=6))
backend.add_observation(123, "Cr", 1.3, now - timedelta(hours=3))
backend.add_observation(123, "Cr", 1.8, now)

# Evaluate
evaluator = SinglePatientEvaluator(scenario, backend)
result = evaluator.evaluate(patient_id=123, reference_time=now)

if result.is_triggered:
    print(f"Patient triggered: {result.triggered_logic}")
    print(f"Trend values: {result.trend_values}")
```

For production deployments requiring audit trails, use compile_scenario to get a compiled IR with cryptographic hashes:
```python
from psdl.core.compile import compile_scenario
from psdl.runtimes.single import SinglePatientEvaluator

# Compile scenario to IR with audit hashes
ir = compile_scenario("scenario.yaml")
print(f"Spec Hash: {ir.spec_hash}")        # Hash of input YAML
print(f"IR Hash: {ir.ir_hash}")            # Hash of compiled IR
print(f"Toolchain: {ir.toolchain_hash}")   # Hash of compiler version

# Create evaluator from compiled IR
# (`backend` and `now` are the objects created in the previous example)
evaluator = SinglePatientEvaluator.from_ir(ir, backend)
result = evaluator.evaluate(patient_id=123, reference_time=now)

# Results include compilation hashes for audit
print(f"Compilation Hashes: {result.compilation_hashes}")
```

The compiled IR includes:
- DAG-ordered evaluation: Dependencies computed once, evaluated in correct order
- Canonical hashing: Reproducible SHA-256 hashes per spec/hashing.yaml
- Compilation diagnostics: Warnings for unused signals/trends
- Audit trail: Full traceability from YAML to evaluation
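The two ideas behind DAG-ordered evaluation and canonical hashing can be sketched with the standard library alone. The dependency map below is taken from the AKI example; the canonicalization rules (JSON with sorted keys and fixed separators) are an assumption for illustration, since the actual rules live in spec/hashing.yaml and may differ.

```python
import hashlib
import json
from graphlib import TopologicalSorter

# Each node maps to the set of names it depends on (from the AKI example).
dependencies = {
    "cr_delta_6h": {"Cr"},
    "cr_current": {"Cr"},
    "cr_rising": {"cr_delta_6h"},
    "cr_high": {"cr_current"},
    "aki_risk": {"cr_rising", "cr_high"},
}

# static_order() yields every node after all of its dependencies.
order = list(TopologicalSorter(dependencies).static_order())


def canonical_hash(obj) -> str:
    """SHA-256 over JSON with sorted keys and fixed separators (assumed rules)."""
    payload = json.dumps(obj, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Key order in the input must not change the hash.
a = canonical_hash({"scenario": "AKI_Early_Detection", "version": "0.3.0"})
b = canonical_hash({"version": "0.3.0", "scenario": "AKI_Early_Detection"})
print(order[0], order[-1], a == b)  # Cr aki_risk True
```

Hashing a canonical serialization rather than the raw file means cosmetic differences such as key order do not produce a different hash for the same content.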
Dataset Specs enable portable scenarios by mapping semantic signal references to physical data locations:
```python
from psdl import load_dataset_spec

# Load institution-specific binding
spec = load_dataset_spec("dataset_specs/my_hospital_omop.yaml")

# Resolve a signal reference to physical binding
binding = spec.resolve("creatinine")
print(binding.table)        # "measurement"
print(binding.filter_expr)  # "concept_id = 3016723"
```

This separates clinical logic (portable scenarios) from local terminology (institution-specific mappings).
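The separation can be pictured as a lookup from a semantic reference to an institution-specific binding. The sketch below is a simplified model, not the library's implementation: the `Binding` class, the `resolve` function, the second hospital's table and code, and the `value` column are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Binding:
    """Hypothetical physical binding for one semantic signal reference."""
    table: str
    filter_expr: str


# Two institutions bind the same semantic reference differently.
# The second mapping (table name and local code) is invented for illustration.
HOSPITAL_A = {"creatinine": Binding("measurement", "concept_id = 3016723")}
HOSPITAL_B = {"creatinine": Binding("lab_results", "local_code = 'CREA'")}


def resolve(spec: dict, ref: str) -> Binding:
    """Look up a semantic reference, failing loudly if it is unmapped."""
    try:
        return spec[ref]
    except KeyError:
        raise KeyError(f"no binding for signal reference {ref!r}") from None


# The scenario only ever says "creatinine"; the query differs per site.
for spec in (HOSPITAL_A, HOSPITAL_B):
    binding = resolve(spec, "creatinine")
    print(f"SELECT value FROM {binding.table} WHERE {binding.filter_expr}")
```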
| Operator | Syntax | Description |
|---|---|---|
| delta | delta(signal, window) | Absolute change over window |
| slope | slope(signal, window) | Linear regression slope |
| ema | ema(signal, window) | Exponential moving average |
| sma | sma(signal, window) | Simple moving average |
| min | min(signal, window) | Minimum value in window |
| max | max(signal, window) | Maximum value in window |
| count | count(signal, window) | Observation count |
| last | last(signal) | Most recent value |
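For readers who want to see what two of these operators amount to numerically, here is a standalone sketch of `slope` and `sma` over observations that have already been restricted to a window. Expressing the slope per hour and returning `None` for degenerate input are assumptions of this example; the normative definitions are in spec/operators.yaml.

```python
from datetime import datetime, timedelta


def slope(points):
    """Least-squares slope of value against time, in units per hour (assumed)."""
    if len(points) < 2:
        return None
    t0 = points[0][0]
    xs = [(t - t0).total_seconds() / 3600 for t, _ in points]
    ys = [v for _, v in points]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxx = sum((x - x_mean) ** 2 for x in xs)
    if sxx == 0:
        return None  # all observations share one timestamp
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    return sxy / sxx


def sma(points):
    """Simple moving average: the mean of the values in the window."""
    return sum(v for _, v in points) / len(points) if points else None


now = datetime(2024, 1, 1, 12, 0)
window = [
    (now - timedelta(hours=6), 1.0),
    (now - timedelta(hours=3), 1.3),
    (now, 1.8),
]
print(round(slope(window), 4), round(sma(window), 4))
```

With the three creatinine values used earlier, the fitted slope is roughly 0.13 mg/dL per hour, which carries more information than the endpoint-only `delta`.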
- 30s: 30 seconds
- 5m: 5 minutes
- 6h: 6 hours
- 1d: 1 day
- 7d: 7 days
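A window literal of this shape can be converted to a duration in a few lines. The function `parse_window` below is a hypothetical helper, not the PSDL parser, and it accepts only the four unit suffixes that appear in the examples above.

```python
import re
from datetime import timedelta

# Units seen in the examples above; other units are not assumed to exist.
_UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days"}
_WINDOW_RE = re.compile(r"(\d+)([smhd])")


def parse_window(text: str) -> timedelta:
    """Convert a window literal such as '6h' into a timedelta."""
    match = _WINDOW_RE.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"invalid window literal: {text!r}")
    amount, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(amount)})


for literal in ("30s", "5m", "6h", "1d", "7d"):
    print(literal, "->", parse_window(literal))
```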
PSDL follows industry-standard patterns (like GraphQL, CQL, ONNX):
- Specification defines WHAT
- Reference Implementation shows HOW.
psdl/
├── README.md # This file
├── PRINCIPLES.md # Core laws defining PSDL scope
├── spec/ # SPECIFICATION (Source of Truth)
│ ├── schema.json # JSON Schema for scenarios
│ ├── operators.yaml # Operator definitions
│ └── grammar/ # Lark/EBNF grammars
├── src/psdl/ # REFERENCE IMPLEMENTATION (Python)
│ ├── __init__.py # Package entry point
│ ├── operators.py # Temporal operators (shared)
│ ├── core/ # Core module (parsing, IR) - v0.3 strict mode
│ │ ├── parser.py # YAML parser
│ │ └── ir.py # Intermediate representation
│ ├── runtimes/ # Execution runtimes
│ │ ├── single/ # Single patient evaluation
│ │ └── cohort/ # Cohort SQL compilation
│ ├── adapters/ # Data adapters (OMOP, FHIR, PhysioNet)
│ └── examples/ # Bundled scenarios (7 scenarios)
├── examples/ # Demo content (not in package)
│ ├── notebooks/ # Jupyter demos (5 notebooks, Colab-ready)
│ └── data/ # Sample data (compressed archives)
├── docs/ # Documentation + Whitepapers
├── rfcs/ # Design proposals (5 RFCs)
└── tests/ # 369 tests (all passing)
| Component | Description |
|---|---|
| Specification | PSDL language definition (see WHITEPAPER.md) |
| Reference Implementation | Python implementation demonstrating the spec |
| Core | Parser, IR types, expression parsing |
| Runtimes | Single patient, cohort SQL, streaming execution |
| Adapters | Data sources (OMOP, FHIR, PhysioNet) |
PSDL follows a spec-driven architecture where specification files are the single source of truth. Code is auto-generated from specs to eliminate redundancy and ensure consistency.
┌─────────────────────────────────────────────────────────────────┐
│ SPEC FILES (Source of Truth) │
├─────────────────────────────────────────────────────────────────┤
│ spec/schema.json → Scenario YAML structure │
│ spec/ast-nodes.yaml → Expression AST types + grammar mappings│
│ spec/operators.yaml → Operator metadata + SQL templates │
│ spec/grammar/*.lark → Expression grammar (Lark) │
└─────────────────────────────────────────────────────────────────┘
│
▼ python tools/codegen.py --all
┌─────────────────────────────────────────────────────────────────┐
│ AUTO-GENERATED CODE │
├─────────────────────────────────────────────────────────────────┤
│ _generated/schema_types.py ← schema.json (datamodel-codegen) │
│ _generated/ast_types.py ← ast-nodes.yaml (Jinja2) │
│ _generated/transformer.py ← ast-nodes.yaml grammar_mappings │
│ _generated/operators_meta.py ← operators.yaml (Jinja2) │
│ _generated/sql_templates.py ← operators.yaml (Jinja2) │
└─────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ MANUAL CODE (Algorithms Only) │
├─────────────────────────────────────────────────────────────────┤
│ operators.py → Temporal operator implementations │
│ runtimes/ → Execution engines (single, cohort) │
│ adapters/ → Data source adapters (OMOP, FHIR) │
└─────────────────────────────────────────────────────────────────┘
```bash
# Regenerate all code from specs
python tools/codegen.py --all

# Regenerate specific components
python tools/codegen.py --types        # Python types from schema.json
python tools/codegen.py --ast          # AST types from ast-nodes.yaml
python tools/codegen.py --transformer  # Lark transformer from grammar_mappings
python tools/codegen.py --operators    # Operator metadata
python tools/codegen.py --sql          # SQL templates

# Validate implementations against spec
python tools/codegen.py --validate
```

| Benefit | Description |
|---|---|
| Single Source of Truth | Specs define types once; code is generated |
| Consistency | No manual type mismatches or drift |
| Maintainability | Change spec → regenerate → done |
| Auditability | Clear traceability from spec to implementation |
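The spec-to-code flow can be illustrated in miniature. The sketch below renders a small Python module from an in-memory operator list; it uses the standard library's `string.Template` in place of Jinja2, and both `OPERATORS_SPEC` and `generate_operators_meta` are hypothetical stand-ins rather than the contents of spec/operators.yaml or tools/codegen.py.

```python
from string import Template

# Hypothetical stand-in for a fragment of spec/operators.yaml, already parsed.
OPERATORS_SPEC = [
    {"name": "delta", "arity": 2, "doc": "Absolute change over window"},
    {"name": "last", "arity": 1, "doc": "Most recent value"},
]

ENTRY = Template('    "$name": {"arity": $arity, "doc": "$doc"},')


def generate_operators_meta(spec) -> str:
    """Render a Python module defining OPERATORS from the spec entries."""
    lines = ["# AUTO-GENERATED: edit the spec, not this file.", "OPERATORS = {"]
    lines += [ENTRY.substitute(entry) for entry in spec]
    lines.append("}")
    return "\n".join(lines) + "\n"


source = generate_operators_meta(OPERATORS_SPEC)
namespace = {}
exec(source, namespace)  # load the generated module text for a quick check
print(namespace["OPERATORS"]["delta"]["arity"])  # 2
```

Because the generated text is derived only from the spec, adding an operator means editing one spec entry and regenerating, rather than updating several hand-written tables.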
```bash
# Run all tests (verbose)
pytest tests/ -v

# Run with verbose output and without capturing stdout (-s)
pytest tests/ -v -s
```

- Unit Tests: Parser, evaluator, operators, scenarios
- Integration Tests: FHIR adapter, OMOP backend, PhysioNet adapter
- Validation: SQL equivalence (100% match), KDIGO clinical guidelines
- Streaming Tests: Window functions, logic evaluation, Flink compiler
- Cohort Tests: SQL compilation, batch evaluation
- Compiler Tests: ScenarioIR, DAG ordering, canonical hashing
See tests/TEST_VALIDATION.md for detailed methodology.
| Scenario | Description | Clinical Use |
|---|---|---|
| ICU Deterioration | Monitors for early signs of clinical deterioration | Kidney function, lactate trends, hemodynamics |
| AKI Detection | KDIGO criteria for Acute Kidney Injury staging | Creatinine-based staging |
| Sepsis Screening | qSOFA + lactate-based sepsis screening | Early sepsis identification |
| PhysioNet Sepsis | Sepsis-3 criteria for PhysioNet Challenge 2019 | SIRS + organ dysfunction |
The Core Law: PSDL defines WHAT to detect, not HOW to collect or execute.
For the full set of laws governing PSDL's scope, see PRINCIPLES.md.
| Principle | Description |
|---|---|
| Declarative | Define what to detect, not how to compute it |
| Portable | Same scenario runs on any OMOP/FHIR backend with mapping |
| Auditable | Structured format enables static analysis and version control |
| Deterministic | Predictable execution with no hidden state |
| Open | Vendor-neutral, community-governed |
| Phase | Status | Focus |
|---|---|---|
| Phase 1: Semantic Foundation | ✅ Complete | Spec, parser, operators, OMOP/FHIR adapters |
| Phase 2: v0.3 Architecture | ✅ Complete | Signal/Trend/Logic/Output separation, PyPI publication |
| Phase 3: Production Readiness | 🚧 Current | Output profiles, streaming, performance |
| Phase 4: Adoption & Scale | 🔮 Future | Hospital pilots, standards engagement |
| Standard | Relationship |
|---|---|
| OMOP CDM | Data model for signals (concept_id references) |
| FHIR R4 | EHR integration (implemented adapter) |
| CQL | Similar domain, different scope (quality measures) |
| ONNX | Inspiration for portable format approach |
| Document | Description |
|---|---|
| API Reference | Developer API documentation |
| Principles | The laws defining PSDL's scope and boundaries |
| Whitepaper | Full project vision and specification (5 languages) |
| Getting Started | Quick start guide |
| Roadmap | Development phases and timeline |
| Changelog | Version history |
We welcome contributions! See CONTRIBUTING.md for guidelines.
- Specification: Propose language features, operators, semantics
- Implementation: Build runtimes, backends, tooling
- Documentation: Improve guides, tutorials, examples
- Testing: Add conformance tests, find edge cases
- Adoption: Share use cases, pilot experiences
Apache 2.0 - See LICENSE for details.
Clinical AI doesn't fail because models are weak.
It fails because there's no semantic layer to express clinical logic portably.
PSDL is the semantic layer for clinical AI — like SQL for databases.
An open standard built by the community, for the community.
