5 changes: 5 additions & 0 deletions CMakeLists.txt
@@ -24,6 +24,11 @@ option(ZEN_ENABLE_BUILTIN_LIBC "Enable builtin libc (partial)" ON)
option(ZEN_ENABLE_LIBEVM "Enable evmc library build" OFF)

# Feature options
option(
  ZEN_ENABLE_JIT_PRECOMPILE_FALLBACK
  "Enable interpreter fallback before JIT compilation for bytecode estimated to be too expensive"
  ON
)
option(ZEN_ENABLE_CPU_EXCEPTION "Enable cpu trap to implement wasm trap" ON)
option(ZEN_ENABLE_VIRTUAL_STACK "Enable virtual stack(no system stack)" OFF)
option(ZEN_ENABLE_DUMP_CALL_STACK "Enable exception call stack dump" OFF)
@@ -0,0 +1,63 @@
## Context

DTVM's multipass JIT compiles EVM bytecode via an MIR pipeline that expands certain opcodes into long SelectInstruction chains (e.g., SHL produces ~92 Selects per call). When hundreds of such opcodes appear in a single basic block, the greedy register allocator's cost becomes superlinear, causing compilation times to explode from milliseconds to minutes.

Two distinct pathological patterns have been identified:
- **b0 (DUP feedback)**: `DUP1 SHL DUP1 SHL ...` -- the shift result feeds back as both operands, creating exponentially overlapping live ranges
- **b1 (full stack)**: `DUP1 x1000 SHL x1000` -- massive fan-out of a single value across the entire function
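
For reproducing the two patterns in a benchmark or unit test, the byte sequences can be generated directly. The following is a minimal sketch: the helper names are hypothetical (not from the codebase), `DUP1` is opcode `0x80`, `SHL` is `0x1b`, and the fragments omit the initial `PUSH` that a runnable contract would need to seed the stack.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Opcode values from the EVM instruction set.
constexpr uint8_t OP_SHL = 0x1b;
constexpr uint8_t OP_DUP1 = 0x80;

// b0 (DUP feedback): DUP1 SHL DUP1 SHL ... repeated `Pairs` times.
// Hypothetical helper for building test inputs.
std::vector<uint8_t> makeDupFeedbackPattern(size_t Pairs) {
  std::vector<uint8_t> Code;
  Code.reserve(Pairs * 2);
  for (size_t I = 0; I < Pairs; ++I) {
    Code.push_back(OP_DUP1);
    Code.push_back(OP_SHL);
  }
  return Code;
}

// b1 (full stack): DUP1 x Count followed by SHL x Count.
// Hypothetical helper for building test inputs.
std::vector<uint8_t> makeFullStackPattern(size_t Count) {
  // One seed value plus Count duplicates must fit in the 1024-item EVM stack.
  assert(Count < 1024);
  std::vector<uint8_t> Code(Count, OP_DUP1);
  Code.insert(Code.end(), Count, OP_SHL);
  return Code;
}
```

Calling `makeFullStackPattern(1000)` yields the `DUP1 x1000 SHL x1000` shape described for b1.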

A DUP detection fix (`Shift == Value` in `handleShift`) already mitigates b0 at the MIR level. This proposal addresses the remaining cases by detecting pathological patterns before compilation begins, avoiding the expensive JIT path entirely.

## Goals / Non-Goals

- Goals:
  - Detect bytecodes that would cause RA explosion before JIT compilation starts
  - Negligible overhead on normal contracts (analysis is O(n) in bytecode length and piggybacks on the existing scan)
  - Configurable thresholds to tune the false-positive/false-negative tradeoff
  - Replace the existing flat `MIR_OPCODE_WEIGHT` estimate with a structured, pattern-aware analysis
- Non-Goals:
  - Fixing the register allocator itself (separate effort)
  - Detecting runtime-only pathologies (e.g., infinite loops)
  - Handling singlepass JIT (only multipass is affected)

## Decisions

- **Integration into EVMAnalyzer::analyze()**: The analyzer already scans all opcodes with block boundary detection. Adding ~5 comparisons per opcode is negligible. This avoids a second pass and keeps the analysis colocated with related bytecode metadata.
- **Not integrated into evm_cache.cpp**: The cache focuses on gas metering (SPP) with a different block model (gas chunks vs compilation blocks). Mixing JIT analysis here would conflate concerns.
- **Struct-based result**: `JITSuitabilityResult` provides fine-grained metrics (not just a boolean), enabling callers to log diagnostics, tune thresholds, or implement graduated responses.
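
As an illustration of the struct-based result, here is a hedged sketch. The field names are the ones listed in the task list for `JITSuitabilityResult`; the field types, the default values, and the `describeSuitability` helper are assumptions made for this example and may not match the real header.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Field names follow the task list; types and defaults are assumptions.
struct JITSuitabilityResult {
  bool ShouldFallback = false;
  size_t MirEstimate = 0;
  size_t RAExpensiveCount = 0;
  size_t MaxConsecutiveExpensive = 0;
  size_t MaxBlockExpensiveCount = 0;
  size_t DupFeedbackPatternCount = 0;
};

// Hypothetical helper: render the metrics as one diagnostic line.
std::string describeSuitability(const JITSuitabilityResult &R) {
  // Each pattern metric counts a subset of the RA-expensive opcodes.
  assert(R.MaxConsecutiveExpensive <= R.RAExpensiveCount);
  assert(R.MaxBlockExpensiveCount <= R.RAExpensiveCount);
  assert(R.DupFeedbackPatternCount <= R.RAExpensiveCount);
  std::string Out = R.ShouldFallback ? "fallback" : "jit";
  Out += " mir=" + std::to_string(R.MirEstimate);
  Out += " ra=" + std::to_string(R.RAExpensiveCount);
  Out += " run=" + std::to_string(R.MaxConsecutiveExpensive);
  Out += " block=" + std::to_string(R.MaxBlockExpensiveCount);
  Out += " dup=" + std::to_string(R.DupFeedbackPatternCount);
  return Out;
}
```

Keeping the raw counters next to the boolean lets a caller log why a contract fell back, without re-scanning the bytecode.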

## RA-Expensive Opcode Set

Based on empirical analysis of MIR expansion and Select chain density:

| Opcode | Selects/call | Total MIR/call | Justification |
|--------|-------------|----------------|---------------|
| SHL (0x1b) | 92 | ~150-180 | Nested J,K loops over 4 U256 components |
| SHR (0x1c) | 96 | ~160-190 | Same structure as SHL |
| SAR (0x1d) | 52 | ~100-130 | Similar but with sign extension |
| MUL (0x02) | 0 | ~50-60 | Heavy inline U256 mul (no Selects but huge VR fan-out) |
| SIGNEXTEND (0x0b) | 21 | ~80-100 | Two dependency chain loops |
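
The opcode set in the table can be expressed as a small classifier. This is a sketch only: the name `isRAExpensiveOpcode` comes from the task list, but the signature and the constant names are assumptions.

```cpp
#include <cstdint>

// Opcode values as given in the table above.
constexpr uint8_t OP_MUL = 0x02;
constexpr uint8_t OP_SIGNEXTEND = 0x0b;
constexpr uint8_t OP_SHL = 0x1b;
constexpr uint8_t OP_SHR = 0x1c;
constexpr uint8_t OP_SAR = 0x1d;

// Returns true for the five opcodes listed as RA-expensive.
// Assumed signature; the real helper may differ.
constexpr bool isRAExpensiveOpcode(uint8_t Opcode) {
  return Opcode == OP_MUL || Opcode == OP_SIGNEXTEND || Opcode == OP_SHL ||
         Opcode == OP_SHR || Opcode == OP_SAR;
}
```

Because the function is `constexpr`, membership can be checked at compile time, which keeps the per-opcode cost in the analysis loop to a few comparisons.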

## Detection Heuristics

1. **Per-block density**: Count RA-expensive opcodes per basic block (JUMPDEST to JUMP/STOP/RETURN). Normal contracts have <20 per block; pathological cases have 500+.
2. **Consecutive run length**: Track the longest unbroken sequence of RA-expensive opcodes (DUPs/SWAPs are transparent since they don't generate heavy MIR). Detects both b0 and b1 patterns.
3. **DUP feedback count**: Count `DUPn immediately followed by RA-expensive op` pairs. This specifically targets the b0 pattern where DUP creates the feedback loop.
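
The three heuristics can be computed in one linear pass. The sketch below is a standalone illustration, not the actual `EVMAnalyzer::analyze()` loop: `ScanMetrics`, `scanBytecode`, and the small predicates are hypothetical names, block terminators are limited to the three named above (STOP, JUMP, RETURN), and PUSH immediates are skipped so that data bytes are not mistaken for opcodes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical metrics container for this sketch.
struct ScanMetrics {
  size_t RAExpensiveCount = 0;
  size_t MaxConsecutiveExpensive = 0;
  size_t MaxBlockExpensiveCount = 0;
  size_t DupFeedbackPatternCount = 0;
};

// MUL, SIGNEXTEND, SHL, SHR, SAR.
inline bool isRAExpensive(uint8_t Op) {
  return Op == 0x02 || Op == 0x0b || (Op >= 0x1b && Op <= 0x1d);
}
inline bool isDup(uint8_t Op) { return Op >= 0x80 && Op <= 0x8f; }
inline bool isSwap(uint8_t Op) { return Op >= 0x90 && Op <= 0x9f; }
// PUSH1..PUSH32 carry 1..32 immediate bytes.
inline bool isPush(uint8_t Op) { return Op >= 0x60 && Op <= 0x7f; }
// STOP, JUMP, RETURN (the terminators named in the text).
inline bool isTerminator(uint8_t Op) {
  return Op == 0x00 || Op == 0x56 || Op == 0xf3;
}

ScanMetrics scanBytecode(const std::vector<uint8_t> &Code) {
  ScanMetrics M;
  size_t Run = 0;   // current consecutive RA-expensive run
  size_t Block = 0; // RA-expensive opcodes in the current block
  bool PrevWasDup = false;
  for (size_t PC = 0; PC < Code.size(); ++PC) {
    const uint8_t Op = Code[PC];
    if (Op == 0x5b) // JUMPDEST opens a new block
      Block = 0;
    if (isRAExpensive(Op)) {
      ++M.RAExpensiveCount;
      M.MaxConsecutiveExpensive = std::max(M.MaxConsecutiveExpensive, ++Run);
      M.MaxBlockExpensiveCount = std::max(M.MaxBlockExpensiveCount, ++Block);
      if (PrevWasDup)
        ++M.DupFeedbackPatternCount;
    } else if (!isDup(Op) && !isSwap(Op)) {
      Run = 0; // DUP/SWAP are transparent; anything else breaks the run
    }
    if (isTerminator(Op))
      Block = 0;
    PrevWasDup = isDup(Op);
    if (isPush(Op))
      PC += static_cast<size_t>(Op - 0x5f); // skip the immediate bytes
  }
  // Sanity check: a run can never exceed the total count.
  assert(M.MaxConsecutiveExpensive <= M.RAExpensiveCount);
  return M;
}
```

Under these assumptions, the b0 pattern drives all three metrics up together, while the b1 pattern produces a long run and a dense block but only a single DUP feedback pair, which is why the heuristics are tracked separately.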

## Thresholds (initial, tunable)

- `MAX_CONSECUTIVE_RA_EXPENSIVE = 128` -- safe margin above any real contract
- `MAX_BLOCK_RA_EXPENSIVE = 256` -- per-block cap
- `MAX_DUP_FEEDBACK_PATTERN = 64` -- DUP+expensive pairs in whole bytecode
- Existing: `MAX_JIT_BYTECODE_SIZE = 0x6000`, `MAX_JIT_MIR_ESTIMATE = 50000`
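
A sketch of how these thresholds might combine into a single decision follows. The constant names and values are the ones listed above, and `shouldFallbackJIT` is named in the task list. The `SuitabilityMetrics` struct, the `FallbackReason` enum, the `classifyFallback` helper, the check order, and the use of strict `>` comparisons are assumptions. The strict comparison matches the "more than 128" wording and the 129-MUL example elsewhere in this change, but is unverified for the two pre-existing limits.

```cpp
#include <cstddef>

// Threshold values as listed above.
constexpr size_t MAX_CONSECUTIVE_RA_EXPENSIVE = 128;
constexpr size_t MAX_BLOCK_RA_EXPENSIVE = 256;
constexpr size_t MAX_DUP_FEEDBACK_PATTERN = 64;
constexpr size_t MAX_JIT_BYTECODE_SIZE = 0x6000;
constexpr size_t MAX_JIT_MIR_ESTIMATE = 50000;

// Hypothetical input bundle for this sketch.
struct SuitabilityMetrics {
  size_t BytecodeSize;
  size_t MirEstimate;
  size_t MaxConsecutiveExpensive;
  size_t MaxBlockExpensiveCount;
  size_t DupFeedbackPatternCount;
};

// Hypothetical reason codes, useful for diagnostic logging.
enum class FallbackReason {
  None,
  BytecodeTooLarge,
  MirEstimateTooHigh,
  ConsecutiveRun,
  BlockDensity,
  DupFeedback
};

// Returns the first limit that is exceeded, or None.
// Strict '>' is an assumption; see the note above.
constexpr FallbackReason classifyFallback(const SuitabilityMetrics &M) {
  if (M.BytecodeSize > MAX_JIT_BYTECODE_SIZE)
    return FallbackReason::BytecodeTooLarge;
  if (M.MirEstimate > MAX_JIT_MIR_ESTIMATE)
    return FallbackReason::MirEstimateTooHigh;
  if (M.MaxConsecutiveExpensive > MAX_CONSECUTIVE_RA_EXPENSIVE)
    return FallbackReason::ConsecutiveRun;
  if (M.MaxBlockExpensiveCount > MAX_BLOCK_RA_EXPENSIVE)
    return FallbackReason::BlockDensity;
  if (M.DupFeedbackPatternCount > MAX_DUP_FEEDBACK_PATTERN)
    return FallbackReason::DupFeedback;
  return FallbackReason::None;
}

// Collapses the reason into the single boolean the caller acts on.
constexpr bool shouldFallbackJIT(const SuitabilityMetrics &M) {
  return classifyFallback(M) != FallbackReason::None;
}
```

Returning a reason code rather than only a boolean is one way to satisfy the logging requirement without duplicating the threshold comparisons at the call site.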

## Risks / Trade-offs

- **False positives**: A contract with 129 consecutive MULs would trigger fallback even if compilation would succeed. Mitigation: thresholds are set conservatively high (real contracts have <20 per block).
- **False negatives**: Novel pathological patterns not involving the listed opcodes could still cause RA explosion. Mitigation: the existing `MAX_JIT_MIR_ESTIMATE` serves as a backstop.
- **Maintenance cost**: New RA-expensive opcodes added in the future must be added to the set. Mitigation: the set is small and well-documented.

## Open Questions

- Should the thresholds be runtime-configurable (e.g., via `set_option`) or compile-time only?
- Should the analysis result be cached in `EVMBytecodeCache` for reuse between interpreter and JIT paths?
@@ -0,0 +1,22 @@
# Change: Add JIT suitability checker for EVM bytecode

## Why

EVM bytecodes containing high concentrations of RA-expensive opcodes (SHL, SHR, SAR, MUL, SIGNEXTEND) cause the greedy register allocator to exhibit superlinear (O(n^2)) compilation time, hanging for minutes or triggering OOM kills in CI. The current fallback mechanism uses a flat linear MIR estimate that cannot distinguish pathological patterns from normal contracts with similar opcode counts.

## What Changes

- Add a pattern-aware JIT suitability analysis integrated into `EVMAnalyzer::analyze()` that detects:
  - Per-block concentration of RA-expensive opcodes
  - Consecutive runs of RA-expensive opcodes (ignoring interleaved DUPs/SWAPs)
  - DUP feedback patterns (DUPn immediately followed by an RA-expensive op)
- Replace the existing `MIR_OPCODE_WEIGHT[]` table and `estimateMirInstructionCount()` in `dt_evmc_vm.cpp` with a structured `JITSuitabilityResult` from the analyzer
- Expose configurable thresholds for fallback decisions

## Impact

- Affected specs: `evm-jit`
- Affected code:
  - `src/compiler/evm_frontend/evm_analyzer.h` (extend analysis loop)
  - `src/vm/dt_evmc_vm.cpp` (replace fallback decision logic)
  - `src/CMakeLists.txt` (include path if needed)
@@ -0,0 +1,71 @@
## ADDED Requirements

### Requirement: JIT suitability analysis before compilation
The system SHALL analyze EVM bytecode for patterns that cause register allocation explosion before attempting JIT compilation, and SHALL fall back to interpreter mode when pathological patterns are detected.

#### Scenario: Normal contract passes suitability check
- **WHEN** EVM bytecode contains no run of more than 128 consecutive RA-expensive opcodes
- **AND** no basic block contains more than 256 RA-expensive opcodes
- **AND** the bytecode contains no more than 64 DUP-feedback patterns
- **AND** the linear MIR estimate is below the configured threshold
- **THEN** the system SHALL proceed with JIT compilation

#### Scenario: High consecutive RA-expensive opcode density triggers fallback
- **WHEN** EVM bytecode contains a run of more than 128 consecutive RA-expensive opcodes (SHL, SHR, SAR, MUL, SIGNEXTEND), with DUP and SWAP opcodes not breaking the run
- **THEN** the system SHALL fall back to interpreter mode for that contract
- **AND** the system SHALL log the fallback reason with the detected pattern metrics

#### Scenario: High per-block RA-expensive opcode density triggers fallback
- **WHEN** a single basic block (JUMPDEST to control-flow terminator) contains more than 256 RA-expensive opcodes
- **THEN** the system SHALL fall back to interpreter mode for that contract

#### Scenario: DUP feedback loop pattern triggers fallback
- **WHEN** EVM bytecode contains more than 64 instances of DUPn immediately followed by an RA-expensive opcode
- **THEN** the system SHALL fall back to interpreter mode for that contract

#### Scenario: Suitability analysis performance
- **WHEN** the suitability analysis runs on any EVM bytecode
- **THEN** the analysis SHALL complete in O(n) time where n is the bytecode length
- **AND** the analysis SHALL NOT allocate heap memory proportional to bytecode size beyond existing analyzer structures

### Requirement: RA-expensive opcode classification
The system SHALL classify EVM opcodes that expand to complex MIR structures (long Select chains or heavy intermediate value fan-out) as RA-expensive for the purpose of JIT suitability analysis.

#### Scenario: Shift opcodes classified as RA-expensive
- **WHEN** classifying opcodes for JIT suitability
- **THEN** SHL (0x1b), SHR (0x1c), and SAR (0x1d) SHALL be classified as RA-expensive
- **AND** each generates 52-96 SelectInstructions per invocation in MIR

#### Scenario: Multiplication classified as RA-expensive
- **WHEN** classifying opcodes for JIT suitability
- **THEN** MUL (0x02) SHALL be classified as RA-expensive
- **AND** it generates ~50-60 MIR instructions with heavy intermediate value fan-out

#### Scenario: Sign extension classified as RA-expensive
- **WHEN** classifying opcodes for JIT suitability
- **THEN** SIGNEXTEND (0x0b) SHALL be classified as RA-expensive
- **AND** it generates ~21 SelectInstructions per invocation in MIR

## MODIFIED Requirements

### Requirement: Multipass-only EVM JIT support
The system SHALL compile EVM bytecode using the multipass JIT pipeline only, after verifying bytecode suitability through pattern analysis.

#### Scenario: Multipass eager compilation
- **WHEN** runtime mode is Multipass
- **AND** the bytecode passes JIT suitability analysis
- **THEN** the system SHALL eagerly compile EVM bytecode using the EVM JIT compiler

#### Scenario: Multipass fallback to interpreter
- **WHEN** runtime mode is Multipass
- **AND** the bytecode fails JIT suitability analysis
- **THEN** the system SHALL temporarily switch to interpreter mode for that execution
- **AND** the system SHALL log the fallback with diagnostic metrics

#### Scenario: Lazy compilation unsupported
- **WHEN** runtime configuration requests lazy JIT for EVM
- **THEN** the system SHALL emit a warning and skip lazy compilation

#### Scenario: Singlepass mode unsupported
- **WHEN** runtime mode is Singlepass
- **THEN** the system SHALL emit an error indicating EVMJIT is unsupported
@@ -0,0 +1,21 @@
## 1. JIT Suitability Analysis in EVMAnalyzer

- [x] 1.1 Define `JITSuitabilityResult` struct in `evm_analyzer.h` with fields: `ShouldFallback`, `MirEstimate`, `RAExpensiveCount`, `MaxConsecutiveExpensive`, `MaxBlockExpensiveCount`, `DupFeedbackPatternCount`
- [x] 1.2 Add `isRAExpensiveOpcode()` helper function covering SHL, SHR, SAR, MUL, SIGNEXTEND
- [x] 1.3 Add per-opcode MIR weight table (migrated from `dt_evmc_vm.cpp`) for linear MIR estimate
- [x] 1.4 Extend `EVMAnalyzer::analyze()` loop to track: consecutive RA-expensive run length, per-block RA-expensive count, DUP feedback pattern detection, MIR estimate accumulation
- [x] 1.5 Add `shouldFallbackJIT()` method combining all thresholds into a single boolean
- [x] 1.6 Add `getJITSuitability()` accessor returning the result struct

## 2. Integration into EVMC VM Execute Path

- [x] 2.1 Include `evm_analyzer.h` from `dt_evmc_vm.cpp` (verify include paths)
- [x] 2.2 Replace `MIR_OPCODE_WEIGHT[]` table and `estimateMirInstructionCount()` with `EVMAnalyzer::analyze()` + `getJITSuitability()`
- [x] 2.3 Update fallback decision in `execute()` to use `JITSuitabilityResult::ShouldFallback`
- [x] 2.4 Add diagnostic logging for fallback triggers (opcode pattern type, counts)

## 3. Verification

- [x] 3.1 Build and verify compilation succeeds in Release mode
- [x] 3.2 Run SHL/SHR/SAR benchmark: verify pathological cases trigger fallback, normal cases do not
- [x] 3.3 Run full benchmark suite: verify no OOM, no hangs, no false-positive fallbacks on real contract benchmarks
57 changes: 55 additions & 2 deletions openspec/specs/evm-jit/spec.md
@@ -2,15 +2,21 @@

## Purpose
Define DTVM’s multipass JIT compilation pipeline for EVM bytecode, including compilation constraints, code emission, and runtime integration.

## Requirements
### Requirement: Multipass-only EVM JIT support
The system SHALL compile EVM bytecode using the multipass JIT pipeline only.
The system SHALL compile EVM bytecode using the multipass JIT pipeline only, after verifying bytecode suitability through pattern analysis.

#### Scenario: Multipass eager compilation
- **WHEN** runtime mode is Multipass
- **AND** the bytecode passes JIT suitability analysis
- **THEN** the system SHALL eagerly compile EVM bytecode using the EVM JIT compiler

#### Scenario: Multipass fallback to interpreter
- **WHEN** runtime mode is Multipass
- **AND** the bytecode fails JIT suitability analysis
- **THEN** the system SHALL temporarily switch to interpreter mode for that execution
- **AND** the system SHALL log the fallback with diagnostic metrics

#### Scenario: Lazy compilation unsupported
- **WHEN** runtime configuration requests lazy JIT for EVM
- **THEN** the system SHALL emit a warning and skip lazy compilation
@@ -56,3 +62,50 @@ The system SHALL record compilation timing and optionally emit perf JIT dump sym
#### Scenario: Perf JIT dump output
- **WHEN** Linux perf JIT dumping is enabled
- **THEN** the compiler SHALL emit per-block symbols for generated code

### Requirement: JIT suitability analysis before compilation
The system SHALL analyze EVM bytecode for patterns that cause register allocation explosion before attempting JIT compilation, and SHALL fall back to interpreter mode when pathological patterns are detected.

#### Scenario: Normal contract passes suitability check
- **WHEN** EVM bytecode contains no run of more than 128 consecutive RA-expensive opcodes
- **AND** no basic block contains more than 256 RA-expensive opcodes
- **AND** the bytecode contains no more than 64 DUP-feedback patterns
- **AND** the linear MIR estimate is below the configured threshold
- **THEN** the system SHALL proceed with JIT compilation

#### Scenario: High consecutive RA-expensive opcode density triggers fallback
- **WHEN** EVM bytecode contains a run of more than 128 consecutive RA-expensive opcodes (SHL, SHR, SAR, MUL, SIGNEXTEND), with DUP and SWAP opcodes not breaking the run
- **THEN** the system SHALL fall back to interpreter mode for that contract
- **AND** the system SHALL log the fallback reason with the detected pattern metrics

#### Scenario: High per-block RA-expensive opcode density triggers fallback
- **WHEN** a single basic block (JUMPDEST to control-flow terminator) contains more than 256 RA-expensive opcodes
- **THEN** the system SHALL fall back to interpreter mode for that contract

#### Scenario: DUP feedback loop pattern triggers fallback
- **WHEN** EVM bytecode contains more than 64 instances of DUPn immediately followed by an RA-expensive opcode
- **THEN** the system SHALL fall back to interpreter mode for that contract

#### Scenario: Suitability analysis performance
- **WHEN** the suitability analysis runs on any EVM bytecode
- **THEN** the analysis SHALL complete in O(n) time where n is the bytecode length
- **AND** the analysis SHALL NOT allocate heap memory proportional to bytecode size beyond existing analyzer structures

### Requirement: RA-expensive opcode classification
The system SHALL classify EVM opcodes that expand to complex MIR structures (long Select chains or heavy intermediate value fan-out) as RA-expensive for the purpose of JIT suitability analysis.

#### Scenario: Shift opcodes classified as RA-expensive
- **WHEN** classifying opcodes for JIT suitability
- **THEN** SHL (0x1b), SHR (0x1c), and SAR (0x1d) SHALL be classified as RA-expensive
- **AND** each generates 52-96 SelectInstructions per invocation in MIR

#### Scenario: Multiplication classified as RA-expensive
- **WHEN** classifying opcodes for JIT suitability
- **THEN** MUL (0x02) SHALL be classified as RA-expensive
- **AND** it generates ~50-60 MIR instructions with heavy intermediate value fan-out

#### Scenario: Sign extension classified as RA-expensive
- **WHEN** classifying opcodes for JIT suitability
- **THEN** SIGNEXTEND (0x0b) SHALL be classified as RA-expensive
- **AND** it generates ~21 SelectInstructions per invocation in MIR

4 changes: 4 additions & 0 deletions src/CMakeLists.txt
@@ -65,6 +65,10 @@ if(ZEN_ENABLE_EVM)
  add_definitions(-DZEN_ENABLE_EVM)
endif()

if(ZEN_ENABLE_JIT_PRECOMPILE_FALLBACK)
  add_definitions(-DZEN_ENABLE_JIT_PRECOMPILE_FALLBACK)
endif()

if(ZEN_ENABLE_CPU_EXCEPTION)
  if(ZEN_ENABLE_SINGLEPASS_JIT OR ZEN_ENABLE_MULTIPASS_JIT)
    add_definitions(-DZEN_ENABLE_CPU_EXCEPTION)