Home

Caution

This is a work in progress!

AC/DC Model

Principles

Principle Classification Framework

Principles are classified into 5 categories based on their primary concern:

Category	Focus	Priority Level
A: Architectural	Structure, layers, dependencies	CRITICAL
B: Data & Analysis Integrity	Reproducibility, traceability, quality	CRITICAL
C: Standards Compliance	CDISC, regulatory, ICH	CRITICAL
D: Design Philosophy	Guiding values and approaches	FOUNDATIONAL
E: Operational	Tools, implementation, usage	SUPPORTING

Category A: Architectural Principles

These principles define the fundamental structure and organization of the AC/DC model.

A1: Layered Architecture with Unidirectional Dependencies

Statement: The AC/DC model SHALL maintain strict separation between layers with unidirectional dependency flow.

Three-Layer Structure:

Concepts: Abstract definitions of biomedical, derivation, and analysis concepts
Structures: Concrete data organization
Implementation: Transformations and presentations

Dependency Flow:

Implementations → Structures → Concepts
           (depend_on)  (depend_on)

Rules:

Concepts SHALL NOT reference structures or implementations
Structures SHALL ONLY reference concepts, NOT implementations
Implementations MAY reference both structures and concepts
Elements SHALL belong to exactly one layer
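The rules above lend themselves to a mechanical check. The following is a minimal sketch, assuming each element carries one layer label and a list of references. The names `RANK`, `layer_violations`, and the example elements are illustrative and not part of the AC/DC model. The sketch also assumes that references within the same layer are permitted, which the rules do not state explicitly.

```python
# Hypothetical ranking: lower rank means more abstract.
RANK = {"concept": 0, "structure": 1, "implementation": 2}

def layer_violations(layers, references):
    """Return (source, target) pairs that point toward a less abstract layer.

    layers: element name -> layer name (exactly one layer per element)
    references: element name -> list of referenced element names
    """
    return [
        (source, target)
        for source, targets in references.items()
        for target in targets
        if RANK[layers[target]] > RANK[layers[source]]
    ]

layers = {
    "change_from_baseline": "concept",
    "adlb_cube": "structure",
    "derive_chg": "implementation",
}
references = {
    "adlb_cube": ["change_from_baseline"],     # structure -> concept: allowed
    "derive_chg": ["adlb_cube"],               # implementation -> structure: allowed
    "change_from_baseline": ["adlb_cube"],     # concept -> structure: violation
}
print(layer_violations(layers, references))
```

Because `layers` is a plain mapping, an element cannot be assigned to two layers at once, which mirrors the last rule.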

Rationale:

Clear separation of concerns
Independent verification of each layer
Reusability across analyses
No circular dependencies
Maintainability

Priority: CRITICAL - This is the foundational architectural principle

Applicability: Universal - applies to all model elements

A2: Concept Independence from Study Context

Statement: Concepts SHALL be defined independent of specific studies, data standards, and implementation details.

Requirements:

Concepts describe real-world entities abstractly
Not context-dependent
Human-readable
Independent of SDTM, ADaM, or other standards

Contrast with Structures and Implementation:

Structures and Implementation are context-dependent
Machine-readable and executable
Instantiated in specific data standards

Rationale:

Reusability across studies
Conceptual clarity
Standards-agnostic reasoning

Priority: CRITICAL - Core to the conceptual/implementation separation

Applicability: Applies to all concept definitions

A3: Implementation Linkage to Standards

Statement: Every implementation SHALL link to concrete data standards (SDTM, ADaM) through structures and be machine-executable.

Requirements:

Implementations are representations of concepts in code
Must be machine-readable
Require concrete mappings
Must be executable by computational systems

Rationale:

Automation capability
Reduced transcription errors
Validation automation
Reuse of code

Priority: CRITICAL - Enables automation

Applicability: Applies to all implementation elements

A4: Clear Separation Between Analysis and Derivation

Statement: The model SHALL distinguish between subject-level derivations and non-subject-level analyses.

Definitions:

Derivation: Subject-level data handling to generate new subject-level data
Analysis: Creation of non-subject-level aggregated data
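The distinction can be illustrated with a small sketch. The records, the variable names (borrowed from ADaM for familiarity), and both functions are hypothetical. The point is only that a derivation keeps one record per subject while an analysis collapses subjects into aggregates.

```python
from statistics import mean

# Hypothetical subject-level records.
records = [
    {"USUBJID": "S1", "TRT": "A", "BASE": 10.0, "AVAL": 12.0},
    {"USUBJID": "S2", "TRT": "A", "BASE": 11.0, "AVAL": 14.0},
    {"USUBJID": "S3", "TRT": "B", "BASE": 9.0, "AVAL": 10.0},
]

def derive_chg(rows):
    """Derivation: returns new subject-level records, one per input record."""
    return [{**row, "CHG": row["AVAL"] - row["BASE"]} for row in rows]

def mean_chg_by_treatment(rows):
    """Analysis: aggregates over subjects, so the result is not subject-level."""
    groups = {}
    for row in rows:
        groups.setdefault(row["TRT"], []).append(row["CHG"])
    return {treatment: mean(values) for treatment, values in groups.items()}

derived = derive_chg(records)
summary = mean_chg_by_treatment(derived)
print(len(derived), summary)
```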

Rationale:

Different purposes require different approaches
Subject-level vs aggregate operations
Traceability requirements differ

Priority: HIGH - Important for scope clarity

Applicability: Applies to all methods and operations

A5: Cube-Based Data Organization

Statement: Data SHALL be organized as multi-dimensional cubes with dimensions, measures, and attributes.

Structure:

Dimensions: Identify and organize observations (subject, treatment, visit)
Measures: Quantitative or qualitative values being analyzed
Attributes: Qualifying metadata
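As a rough illustration of this structure, the sketch below models a cube as a frozen dataclass. The class name `Cube`, its fields, and the example observations are assumptions made for illustration, not the model's actual schema.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Cube:
    """Illustrative cube: each observation maps component names to values."""
    dimensions: tuple    # names that identify an observation
    measures: tuple      # names of the values being analyzed
    attributes: tuple    # names of qualifying metadata
    observations: tuple  # tuple of dicts keyed by the names above

    def key(self, observation):
        """The dimension values that identify one observation."""
        return tuple(observation[name] for name in self.dimensions)

    def has_unique_keys(self):
        """True when no two observations share the same dimension values."""
        keys = [self.key(observation) for observation in self.observations]
        return len(keys) == len(set(keys))

cube = Cube(
    dimensions=("subject", "treatment", "visit"),
    measures=("value",),
    attributes=("unit",),
    observations=(
        {"subject": "S1", "treatment": "A", "visit": "Week 4", "value": 5.1, "unit": "mmol/L"},
        {"subject": "S2", "treatment": "B", "visit": "Week 4", "value": 4.8, "unit": "mmol/L"},
    ),
)
print(cube.has_unique_keys())
```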

Rationale:

Natural representation of clinical trial data
Supports OLAP-style operations
Aligns with ADaM BDS structure
Enables flexible slicing and aggregation

Priority: HIGH - Core structural pattern

Applicability: Applies to all data structures

A6: Slice-Based Subsetting

Statement: Data subsets SHALL be defined declaratively as slices by fixing dimension values or applying filters.

Requirements:

Slices are immutable views
Created by fixing one or more dimension values
Can represent subsets of records, variables, or both
May exist without methods (for analysis input)
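A slice of this kind could be sketched as follows. The function `make_slice` and the sample observations are hypothetical. Immutability is approximated here with read-only mapping views from the standard library, which is one possible choice rather than a prescribed one.

```python
from types import MappingProxyType

def make_slice(observations, fixed):
    """Select observations matching the fixed dimension values.

    The result is a tuple of read-only views, so the slice cannot be used
    to modify the source observations.
    """
    return tuple(
        MappingProxyType(observation)
        for observation in observations
        if all(observation.get(name) == value for name, value in fixed.items())
    )

observations = [
    {"subject": "S1", "treatment": "A", "visit": "Week 4", "value": 5.1},
    {"subject": "S2", "treatment": "B", "visit": "Week 4", "value": 4.8},
    {"subject": "S1", "treatment": "A", "visit": "Week 8", "value": 5.4},
]

# Declarative definition: the slice is fully described by the fixed values.
week4_arm_a = make_slice(observations, {"visit": "Week 4", "treatment": "A"})
print(len(week4_arm_a))
```

Attempting to assign through a view, for example `week4_arm_a[0]["value"] = 0`, raises `TypeError`.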

Rationale:

Declarative specification
Reusability
Clear provenance
Separation of data selection from computation

Priority: HIGH - Fundamental operation

Applicability: Applies to all data subsetting

A7: Method Input/Output/Argument Structure

Statement: All methods SHALL explicitly declare inputs, outputs, and arguments.

Requirements:

Methods transform inputs to outputs
Arguments parameterize behavior
Inputs must be declared
Outputs must be declared
Arguments must be specified
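One way to picture such a declaration is sketched below. `MethodDeclaration`, `interface_problems`, `upstream_methods`, and the two example methods are illustrative names, not part of the model. The sketch shows how explicit interfaces make dependency analysis a simple lookup.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class MethodDeclaration:
    """Hypothetical method interface: inputs, outputs, and arguments."""
    name: str
    inputs: tuple
    outputs: tuple
    arguments: dict = field(default_factory=dict)

def interface_problems(method):
    """List the ways a declaration fails to declare its interface."""
    problems = []
    if not method.inputs:
        problems.append(f"{method.name}: no inputs declared")
    if not method.outputs:
        problems.append(f"{method.name}: no outputs declared")
    return problems

def upstream_methods(target, methods):
    """Dependency analysis: methods whose outputs feed the target's inputs."""
    return [m.name for m in methods if set(m.outputs) & set(target.inputs)]

derive_chg = MethodDeclaration("derive_chg", inputs=("AVAL", "BASE"), outputs=("CHG",))
ancova = MethodDeclaration(
    "ancova",
    inputs=("CHG", "BASE", "TRT"),
    outputs=("lsmean_difference",),
    arguments={"alpha": 0.05},  # arguments parameterize behavior
)
print(upstream_methods(ancova, [derive_chg, ancova]))
```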

Rationale:

Explicit interface definition
Type checking
Dependency analysis
Reproducibility

Priority: HIGH - Enables automated analysis

Applicability: Applies to all methods (derivation and analysis)

A8: Universal Connector Architecture

Note

PROPOSED by KWL - Pending review and approval. See: model/semantic-define-json-approach/WIKI_PROPOSAL_UNIVERSAL_CONNECTOR.md

Statement: DataConcepts SHALL serve as the universal abstraction layer connecting analytical structures to external domain models.

Requirements:

Cube elements (Dimensions, Measures, Attributes) SHALL reference DataConcepts via is_a relationships, NOT domain-specific variables directly
DataConcepts SHALL support simultaneous mappings to multiple external models (ADaM, SDTM, USDM, ARS, or proprietary)
Adding support for a new domain model SHALL NOT require changes to existing cube structures or templates
Templates SHALL be expressed in terms of DataConcepts, enabling portability across organizations
Execution engines SHALL resolve DataConcept references to domain-specific variables through explicit mapping declarations
The same DataConcept MAY map to different representations in different domain models

Architecture Diagram:

┌───────────┐   ┌───────────┐   ┌───────────┐   ┌───────────┐
│   ADaM    │   │   SDTM    │   │   USDM    │   │    ARS    │
│  USUBJID  │   │  SUBJID   │   │  Subject  │   │  Result   │
│   CHG     │   │           │   │           │   │           │
└───────────┘   └───────────┘   └───────────┘   └───────────┘
     EXTERNAL DOMAIN MODELS (Pluggable)
        ▲               ▲                ▲                ▲
        │               │                │                │
        │          maps_to          maps_to          maps_to
        │               │                │                │
┌─────────────────────────────────────────────────────────────────┐
│                      DATA CONCEPTS                              │
│              (Universal Abstraction Layer)                      │
│                                                                 │
│  ┌──────────────────────────────────────────────────────────┐  │
│  │  DataConcept: "subject"                                  │  │
│  │    description: "Study participant"                      │  │
│  │    mappings:                                             │  │
│  │      adam_variable: USUBJID                              │  │
│  │      sdtm_variable: SUBJID                               │  │
│  │      usdm_element: StudySubject                          │  │
│  └──────────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────┘
        ▲               ▲                ▲
        │               │                │
        │          ┌────┴────┐      ┌────┴────┐
        │          │  is_a   │      │  is_a   │
        │          └─────────┘      └─────────┘
        │               │                │
┌───────┼───────────────┼────────────────┼────────────────────────┐
│       │               │                │                        │
│  ┌────┴────┐    ┌─────┴─────┐    ┌─────┴─────┐                 │
│  │  Cube   │    │ Dimension │    │  Measure  │                 │
│  └─────────┘    └───────────┘    └───────────┘                 │
│                    ANALYTICAL STRUCTURES                        │
└─────────────────────────────────────────────────────────────────┘
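The resolution step described in the requirements might look like the sketch below. The mapping values are taken from the diagram above. The dictionary layout, the function `resolve`, and the concept name `change_from_baseline` are assumptions made for illustration and do not reflect the proposal's actual schema.

```python
# Hypothetical mapping declarations: DataConcept -> domain model -> variable.
MAPPINGS = {
    "subject": {"adam": "USUBJID", "sdtm": "SUBJID", "usdm": "StudySubject"},
    "change_from_baseline": {"adam": "CHG"},
}

def resolve(template, model, mappings=MAPPINGS):
    """Resolve a template's DataConcept references for one domain model.

    template: role name -> DataConcept name
    Raises LookupError when a concept has no mapping for the requested model.
    """
    resolved = {}
    for role, concept in template.items():
        try:
            resolved[role] = mappings[concept][model]
        except KeyError:
            raise LookupError(f"no {model} mapping declared for {concept!r}") from None
    return resolved

# The template is written once, in terms of DataConcepts only.
template = {"dimension": "subject", "measure": "change_from_baseline"}
print(resolve(template, "adam"))
```

Adding a new domain model in this sketch means adding entries to `MAPPINGS`; the template itself is untouched.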

Rationale:

Provides the mechanism (HOW) for achieving C1 and C4
Enables portable templates
Supports multi-standard compliance
Future-proofs against new standards
Decouples analytical logic from data standard specifics

Relationship to Other Principles:

Principle	Relationship
A1: Layered Architecture	DataConcepts ARE the Concepts layer; this principle specifies their connector role
A2: Concept Independence	This principle explains HOW concepts remain independent (via abstraction layer)
C1: CDISC Alignment	Universal Connector is the MECHANISM for achieving CDISC alignment
C4: Interoperability	Universal Connector ENABLES standards interoperability
D3: Progressive Refinement	Mappings can be added progressively (ADaM first, then SDTM, etc.)

Priority: CRITICAL - Foundational to achieving C1 (CDISC Alignment) and C4 (Interoperability)

Applicability: Universal - applies to all model elements

Category B: Data & Analysis Integrity Principles

These principles ensure reproducibility, quality, and trustworthiness of analyses.

B1: Analysis Reproducibility and Provenance

Source: Model_PRINCIPLES.md (GP-2)

Statement: The AC/DC model SHALL ensure reproducible analyses with clear metadata lineage through immutable entities and directed acyclic dependencies.

Immutability Rules:

Cubes are immutable - operations produce new cubes
Slices are immutable - views don't modify source
Methods produce new outputs - never modify inputs
No cube SHALL appear as both input and output of the same method

DAG (Directed Acyclic Graph) Structure Rules:

Source cubes are roots (no dependencies)
Derived cubes depend on upstream cubes
Result cubes are downstream of derived cubes
No cube may depend (directly or transitively) on itself
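Both rule sets can be verified automatically. The sketch below is one possible check, assuming methods are described by the cubes they read and write. The function `lineage_problems` and the example method names are hypothetical. Cycle detection relies on `graphlib.TopologicalSorter` from the Python standard library (3.9 or later).

```python
from graphlib import CycleError, TopologicalSorter

def lineage_problems(methods):
    """Check immutability and acyclicity.

    methods: method name -> {"inputs": [cube names], "outputs": [cube names]}
    """
    problems = []
    depends_on = {}  # cube -> set of cubes it is computed from
    for name, method in methods.items():
        for cube in sorted(set(method["inputs"]) & set(method["outputs"])):
            problems.append(f"{name}: cube {cube!r} is both input and output")
        for output in method["outputs"]:
            depends_on.setdefault(output, set()).update(method["inputs"])
    try:
        TopologicalSorter(depends_on).prepare()  # raises CycleError on a cycle
    except CycleError as error:
        problems.append("cycle: " + " -> ".join(error.args[1]))
    return problems

valid = {
    "derive": {"inputs": ["source"], "outputs": ["derived"]},
    "summarize": {"inputs": ["derived"], "outputs": ["result"]},
}
cyclic = {
    "a": {"inputs": ["x"], "outputs": ["y"]},
    "b": {"inputs": ["y"], "outputs": ["x"]},
}
print(lineage_problems(valid))
print(lineage_problems(cyclic))
```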

Rationale:

Clear data lineage
Reproducibility (same inputs → same outputs)
Inspectable intermediate results
Audit trail integrity
Testability

Priority: CRITICAL - Core to scientific integrity

Applicability: Universal - applies to all derivations and analyses

B2: Complete End-to-End Traceability

Statement: Every result SHALL be fully traceable backward through implementation and structure to its source concepts.

Traceability Chain:

Display → Analysis Results → Method → Measure → Slice → Cube → Concept

Requirements:

All elements declare relationships
Orphaned elements are validation errors
Traceability extends to protocol objectives
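A backward trace and an orphan check could be sketched as follows. The link table mirrors the traceability chain above, simplified so that each element has a single upstream link. The functions `trace_back` and `orphans` and the element `stray_slice` are hypothetical.

```python
def trace_back(element, derived_from):
    """Follow upstream links from an element until no further link exists."""
    chain = [element]
    # The second condition stops the walk if the links ever loop back.
    while chain[-1] in derived_from and derived_from[chain[-1]] not in chain:
        chain.append(derived_from[chain[-1]])
    return chain

def orphans(elements, derived_from, roots):
    """Elements whose backward trace does not end at a declared root."""
    return [e for e in elements if trace_back(e, derived_from)[-1] not in roots]

# Simplified single-parent links following the chain above.
derived_from = {
    "display": "analysis_results",
    "analysis_results": "method",
    "method": "measure",
    "measure": "slice",
    "slice": "cube",
    "cube": "concept",
}
elements = list(derived_from) + ["concept", "stray_slice"]

print(trace_back("display", derived_from))
print(orphans(elements, derived_from, roots={"concept"}))
```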

Rationale:

Regulatory compliance (21 CFR Part 11, ICH E9)
Scientific reproducibility
Audit capability
Impact analysis
Quality assurance

Priority: CRITICAL - Regulatory requirement

Applicability: Universal - applies to all elements

B3: Explicit and Automated Quality Rules

Statement: Analysis checks, validation rules, and constraints SHALL be explicitly specified and automatically verified.

Key Requirements:

Rules documented declaratively in model
Rules automatically checked during validation
Rules versioned with analysis model

Examples:

Every analysis must reference at least one estimand
Analysis populations defined before use
Dependencies must be acyclic
Estimands must include all ICH E9(R1) components
P-values between 0 and 1
Baseline values exist for change-from-baseline analyses
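A few of these examples can be expressed as declarative rules and checked automatically, as in the sketch below. The rule table, the field names on the analysis record, and the function `violations` are assumptions for illustration only.

```python
# Each rule is a (description, predicate) pair; the predicate returns True when
# the rule is satisfied. Field names on the analysis record are hypothetical.
RULES = [
    ("references at least one estimand",
     lambda a: len(a.get("estimands", [])) >= 1),
    ("p-value between 0 and 1",
     lambda a: a.get("p_value") is None or 0.0 <= a["p_value"] <= 1.0),
    ("baseline exists for change-from-baseline analysis",
     lambda a: not a.get("uses_change_from_baseline") or a.get("has_baseline", False)),
]

def violations(analysis, rules=RULES):
    """Return the descriptions of all rules the analysis record violates."""
    return [description for description, passes in rules if not passes(analysis)]

good = {"estimands": ["E1"], "p_value": 0.03,
        "uses_change_from_baseline": True, "has_baseline": True}
bad = {"estimands": [], "p_value": 1.2,
       "uses_change_from_baseline": True, "has_baseline": False}

print(violations(good))
print(violations(bad))
```

Keeping the rules in a data structure rather than in scattered code is what allows them to be versioned alongside the model.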

Rationale:

Quality by design
Automation reduces QC burden
Transparency and auditability
Consistency across analyses
Traceability of violations

Priority: HIGH - Quality assurance

Applicability: Universal - applies to all model specifications

B4: Precision in Specification

Statement: A standardized structure SHALL enforce precision in specifying analysis and derivation settings and assumptions.

Goal: Reduced ambiguity through:

Formal structure
Explicit declarations
Type constraints
Validation rules

Rationale:

Reduces misinterpretation
Improves communication
Enables automation
Supports regulatory review

Priority: HIGH - Quality of specification

Applicability: Universal - applies to all specifications

B5: Derivation Concept Atomicity

Source: Principles.md (Derivation Concepts section)

Statement: Derivation concepts SHALL "do one thing and do it well" - complex derivations that require a sequence of steps should be broken into multiple concepts.

Requirements:

Single responsibility per derivation concept
Sequences decomposed into atomic steps
Each step traceable independently

Derivation Concept Operations:

Update values for existing columns
Produce new columns with values
Produce new records
Combinations of the above

Rationale:

Clarity of purpose
Reusability
Testability
Easier debugging

Priority: MEDIUM - Design quality

Applicability: Applies to derivation concept definitions

B6: Modular Derivation Composition

Note

PROPOSED by KWL - Pending review and approval. This principle extends B5 (Atomicity) by addressing how atomic derivations compose into dependency chains.

Statement: Derivation concepts SHALL be composable through explicit dependency declarations, forming directed acyclic graphs (DAGs) that enable reuse across analyses.

Requirements:

Derivation Concepts SHALL declare their upstream dependencies (input DCs)
Derivation Concepts MAY be reused as inputs to multiple downstream DCs or ACs
Execution engines SHALL resolve dependencies and execute in topological order
The same DC template SHALL produce consistent results regardless of which downstream consumer uses it
Dependency declarations SHALL be explicit and machine-readable

Derivation Dependency Chain Example:

ANCOVA Analysis requires:
├── CHG (ChangeFromBaseline) ← depends on AVAL, BASE
│   ├── AVAL (AnalysisValue) ← depends on SDTM source
│   └── BASE (BaselineValue) ← depends on AVAL, ABLFL
│       └── ABLFL (BaselineFlag) ← depends on visit timing rules
├── PopulationFlags (EFFFL/ITTFL) ← depends on inclusion criteria
└── LOCFImputation ← depends on AVAL, time ordering

Composition Pattern:

┌─────────────────────────────────────────────────────────────────────────────┐
│                    DERIVATION CONCEPT GRAPH (DAG)                           │
│                                                                             │
│   ┌──────────┐     ┌──────────┐                                            │
│   │  SDTM    │     │  Visit   │                                            │
│   │  Source  │     │  Timing  │                                            │
│   └────┬─────┘     └────┬─────┘                                            │
│        │                │                                                   │
│        ▼                ▼                                                   │
│   ┌──────────┐     ┌──────────┐                                            │
│   │   AVAL   │     │  ABLFL   │                                            │
│   │ (DC)     │     │  (DC)    │                                            │
│   └────┬─────┘     └────┬─────┘                                            │
│        │                │                                                   │
│        ├────────────────┤                                                   │
│        │                │                                                   │
│        ▼                ▼                                                   │
│   ┌──────────┐     ┌──────────┐                                            │
│   │  LOCF    │     │   BASE   │                                            │
│   │  (DC)    │     │   (DC)   │                                            │
│   └────┬─────┘     └────┬─────┘                                            │
│        │                │                                                   │
│        │                ├───────────────┐                                   │
│        │                │               │                                   │
│        │                ▼               ▼                                   │
│        │           ┌──────────┐    ┌──────────┐                            │
│        │           │   CHG    │    │  ANCOVA  │                            │
│        │           │   (DC)   │    │   (AC)   │                            │
│        │           └────┬─────┘    └──────────┘                            │
│        │                │               ▲                                   │
│        └────────────────┴───────────────┘                                   │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
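The dependency chain example above can be resolved into an execution order with a topological sort. The sketch below uses `graphlib.TopologicalSorter` from the Python standard library (3.9 or later). The dictionary encoding and the `batches` helper are illustrative, and external sources (SDTM source, visit timing rules, inclusion criteria) are treated as lying outside the graph.

```python
from graphlib import TopologicalSorter

# Each key depends on the concepts in its value set (from the example above).
DEPENDS_ON = {
    "ANCOVA": {"CHG", "PopulationFlags", "LOCFImputation"},
    "CHG": {"AVAL", "BASE"},
    "BASE": {"AVAL", "ABLFL"},
    "LOCFImputation": {"AVAL"},
    "AVAL": set(),             # fed by SDTM source, outside this graph
    "ABLFL": set(),            # fed by visit timing rules, outside this graph
    "PopulationFlags": set(),  # fed by inclusion criteria, outside this graph
}

def batches(depends_on):
    """Group concepts into batches; members of one batch are independent."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()
    result = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        result.append(ready)
        sorter.done(*ready)
    return result

# One valid sequential order, with every concept after its dependencies.
order = list(TopologicalSorter(DEPENDS_ON).static_order())
print(order)
print(batches(DEPENDS_ON))
```

Concepts within one batch have no dependencies on each other, which is what makes concurrent execution of independent branches possible.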

Benefits:

Reusability: Define AVAL once, use in BASE, CHG, LOCF, and multiple analyses
Consistency: Same derivation logic applied everywhere
Parallelization: Independent branches can execute concurrently
Provenance: Clear lineage from final analysis to source data
Testability: Each DC can be validated independently
Maintainability: Change in one DC propagates correctly to all dependents

Relationship to Other Principles:

Principle	Relationship
B1: Reproducibility	Composition through DAG ensures reproducible execution order
B2: Traceability	Dependency declarations provide explicit lineage
B5: Atomicity	B6 explains how atomic DCs (B5) combine into complete derivations
A4: Analysis/Derivation Separation	DCs compose to feed ACs at the boundary

Rationale:

Complex clinical derivations require multiple steps (SDTM → AVAL → BASE → CHG)
Without explicit composition, derivation logic is duplicated or inconsistent
DAG structure enables automated dependency resolution and execution planning
Reuse across analyses reduces errors and maintenance burden

Priority: HIGH - Enables practical implementation of atomic derivations (B5)

Applicability: Applies to all derivation concept definitions and their relationships

Category C: Standards Compliance Principles

These principles ensure alignment with industry and regulatory standards.

C1: CDISC Standards Alignment

Statement: The AC/DC model SHALL align with CDISC standards (USDM, SDTM, ADaM, ARS) while providing higher-level abstraction.

CDISC Standards Integration:

USDM (Unified Study Definitions Model)
- Protocol entities: study design, objectives, endpoints, estimands, populations
- Links through concepts
- Protocol specifications inform analysis planning
- Example: USDM StudyEndpoint → AC/DC AnalysisConcept
SDTM (Study Data Tabulation Model)
- Source data collection structure
- Links through structure entities (source cubes) via concepts
- Example: SDTM LB domain → AC/DC source cube
ADaM (Analysis Data Model)
- Analysis-ready dataset structure
- Links through structure entities via concepts
- Example: AC/DC Cube → ADaM BDS dataset, AC/DC Measure → ADaM AVAL
ARS (Analysis Results Standard)
- Analysis results metadata
- Links through derivation entities
- Example: AC/DC Method → ARS Analysis/AnalysisMethod
- Example: AC/DC Display → ARS OutputDisplay/Output

Requirements:

Source structures traceable to SDTM domains
Analysis structures mappable to ADaM
Concepts align with USDM protocol definitions
Displays mappable to ARS OutputDisplay

Rationale:

Industry interoperability
Regulatory acceptance
Ecosystem integration
Standards-based tooling

Priority: CRITICAL - Industry requirement

Applicability: Universal - applies to all model elements

C2: Regulatory Framework Compliance

Statement: The model SHALL support compliance with GxP, ICH guidelines, and regulatory requirements.

ICH E9(R1) Estimand Framework:

Treatment
Population
Variable (endpoint)
Intercurrent events and the strategies for handling them
Population-level summary
Rationale/interpretation

Additional Frameworks:

GxP Principles (Good Clinical, Laboratory, Manufacturing Practices)
ICH E3 (Structure and Content of Clinical Study Reports)
ICH E9 (Statistical Principles for Clinical Trials)
21 CFR Part 11 (Electronic Records/Signatures)
ALCOA+ Data Integrity principles

Requirements:

Concepts support Estimand entities
Analysis specifications reference estimands explicitly
Validation verifies estimand completeness
Display metadata traces to protocol objectives
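Estimand completeness is one requirement that can be validated mechanically. The sketch below assumes the five estimand attributes named in ICH E9(R1): treatment, population, variable, intercurrent events, and population-level summary. The field names, the function `missing_estimand_attributes`, and the example estimand are hypothetical.

```python
# Attribute names are an illustrative encoding of the ICH E9(R1) attributes.
REQUIRED_ATTRIBUTES = (
    "treatment",
    "population",
    "variable",
    "intercurrent_events",
    "population_level_summary",
)

def missing_estimand_attributes(estimand):
    """Return the required attributes that are absent or empty."""
    return [name for name in REQUIRED_ATTRIBUTES if not estimand.get(name)]

estimand = {
    "treatment": "Drug X versus placebo",
    "population": "Adults meeting the inclusion criteria",
    "variable": "Change from baseline at Week 12",
    "population_level_summary": "Difference in means",
}
print(missing_estimand_attributes(estimand))
```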

Rationale:

Health authority acceptance (FDA, EMA, PMDA)
Scientific rigor
Auditability
Protection from regulatory risk

Priority: CRITICAL - Regulatory requirement

Applicability: Applies to confirmatory trial analyses

C3: Scope Boundary Management

Statement: The model SHALL explicitly define what is in scope vs out of scope.

IN SCOPE:

Analysis and derivation concepts
Subject-level derivations
Analysis-level computations
Statistical methods
Data structures (cubes, dimensions, measures)
Traceability to concepts
Quality rules

OUT OF SCOPE:

Raw data collection (SDTM covers this)
Study design specification (USDM covers this)
Implementation details (programming syntax)
Infrastructure/platform specifics
Execution scheduling
User interface design

Rationale:

Clear boundaries prevent scope creep
Leverage existing standards where appropriate
Focus on model's unique value

Priority: HIGH - Prevents confusion

Applicability: Universal - defines model boundaries

C4: Standards Interoperability

Statement: A standardized structure SHALL allow interchange of specifications between systems and organizations.

Requirements:

Machine-readable format
Standard vocabulary
Common structure
Version control

Benefits:

Cross-organizational sharing
Tool interoperability
Reduced vendor lock-in
Community knowledge sharing

Rationale:

Industry efficiency
Best practice dissemination
Regulatory submissions

Priority: HIGH - Industry collaboration

Applicability: Applies to specification format and exchange

Category D: Design Philosophy Principles

These principles reflect the guiding values and approaches for the model.

D1: Declarative Analysis Specification

Statement: Analyses and derivations SHALL be specified declaratively ("what" not "how"), with implementations providing the "how."

Key Characteristics:

Specify intent, not procedure
Separate specification from implementation
Enable multiple implementations of the same specification
Support validation independent of execution
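The separation of "what" from "how" can be sketched with one specification and two interchangeable implementations. The specification keys, the data, and both functions are hypothetical and only illustrate the idea.

```python
from statistics import mean

# Declarative specification: states what is wanted, not how to compute it.
SPEC = {"statistic": "mean", "measure": "CHG", "by": "TRT"}

ROWS = [
    {"TRT": "A", "CHG": 2.0},
    {"TRT": "A", "CHG": 3.0},
    {"TRT": "B", "CHG": 1.0},
]

def check_supported(spec):
    """Validation that does not execute anything."""
    if spec["statistic"] != "mean":
        raise NotImplementedError(spec["statistic"])

def run_with_loops(spec, rows):
    """Implementation 1: explicit accumulation of totals and counts."""
    check_supported(spec)
    totals, counts = {}, {}
    for row in rows:
        group = row[spec["by"]]
        totals[group] = totals.get(group, 0.0) + row[spec["measure"]]
        counts[group] = counts.get(group, 0) + 1
    return {group: totals[group] / counts[group] for group in totals}

def run_with_statistics(spec, rows):
    """Implementation 2: group first, then delegate to the standard library."""
    check_supported(spec)
    groups = {}
    for row in rows:
        groups.setdefault(row[spec["by"]], []).append(row[spec["measure"]])
    return {group: mean(values) for group, values in groups.items()}

print(run_with_loops(SPEC, ROWS) == run_with_statistics(SPEC, ROWS))
```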

Benefits:

Platform independence
Implementation flexibility
Easier verification
Clear intent

Rationale:

Conceptual clarity
Implementation alternatives
Testing and validation
Documentation quality

Priority: HIGH - Core design philosophy

Applicability: Applies to all specifications

D2: Keep It Simple, Stupid (KISS)

Statement: Whenever possible, simplicity is a design goal, though some complexity is unavoidable.

Approach:

Minimize unnecessary complexity
Use clear, simple structures when possible
Hide unavoidable complexity from end-users via tools
Prefer straightforward solutions

Rationale:

Easier adoption
Reduced errors
Better maintainability
Lower training burden

Priority: FOUNDATIONAL - Guiding principle

Applicability: Universal - applies to all design decisions

D3: Extensible Core with Progressive Refinement

Statement: The model SHALL provide minimal core elements with extension mechanisms for specialized needs.

Core Philosophy:

Minimal viable core
Extension points for specialization
Progressive disclosure of complexity
Backward compatibility

Example Extensions:

Study-specific concepts
Organization-specific methods
Therapeutic-area extensions
Custom quality rules

Rationale:

Broad applicability
Specialization support
Evolution without breaking changes
Community contribution

Priority: MEDIUM - Enables growth

Applicability: Applies to model evolution

D4: Common Language for Collaboration

Statement: The model SHALL provide a common language between statisticians, clinicians, data managers, programmers, and other stakeholders.

Stakeholder Benefits:

Statisticians: Clear analysis specification
Clinicians: Understanding of endpoints
Data Managers: Data structure clarity
Programmers: Unambiguous requirements
Regulators: Transparent documentation

Rationale:

Reduced miscommunication
Improved collaboration
Better quality
Faster development

Priority: HIGH - Team effectiveness

Applicability: Universal - affects all usage

Category E: Operational Principles

These principles guide implementation, tooling, and usage.

E1: LinkML as Modeling Language

Statement: LinkML SHALL be used as the modeling language instead of UML.

Rationale for LinkML:

Less complex than UML
Free, open tooling
Available for all operating systems
Easier visualization
No dependency on commercial software

Comparison with USDM:

USDM uses UML
Working with the USDM UML model requires Enterprise Architect (Windows-only, commercial)
UML visualization is complex and inconsistent

Priority: MEDIUM - Tool choice

Applicability: Applies to model representation

E2: Machine Readability and Automation

Statement: Specifications SHALL be machine-readable to enable automated validation and code generation.

Capabilities Enabled:

Direct link to programming code
Automated validation
Reduced transcription errors
Code reuse
Quality checks

Rationale:

Error reduction
Efficiency gains
Consistency
Faster development

Priority: HIGH - Automation benefit

Applicability: Applies to all specifications

E3: Clear Linkage Through Model Layers

Statement: The model SHALL provide clear linkage from results to objectives/endpoints in USDM and vice versa.

Traceability Path:

USDM Objective → Endpoint → Estimand →
Analysis Concept → Method → Display → Results

Benefits:

Regulatory clarity
Protocol alignment verification
Change impact analysis
Audit trail

Priority: HIGH - Regulatory necessity

Applicability: Universal - affects all elements

E4: Tool-Hidden Complexity

Statement: Unavoidable complexity SHALL be hidden from end-users by tools.

Approach:

User-friendly interfaces
Reasonable defaults
Progressive disclosure
Expert mode for advanced users

Rationale:

Accessibility
Reduced learning curve
Error prevention
User satisfaction

Priority: MEDIUM - User experience

Applicability: Applies to tooling and interfaces

Caution

Anything below this message is not yet reviewed.

Open Questions

1. Analysis Concept Definition

Important

Question: Can we define Analysis Concept?
Suggested answer: A specification of a single analysis computation producing aggregated results from input data

2. Analysis-Only Slices

Important

Question: Should the model allow slices with no method that are only used by Analysis Concepts?
Suggested answer: Yes, allow - not all slices need derivations; some are just selection criteria for analysis inputs

3. Main Goals

Important

Question: What are the main goals of the AC/DC model?
Suggested answer:

Enable reproducible clinical trial analyses
Provide standard interchange format
Support regulatory compliance
Facilitate cross-organizational collaboration
