BP Decoder Performance: Fundamental Architecture Issues #38

@GiggleLiu

Description

Problem Summary

The current BP decoder implementation shows only a 2% improvement over a simple syndrome-parity baseline decoder, far below the performance reported in published research. After analyzing the code and comparing it against recent literature, I've identified fundamental architectural issues that prevent the decoder from working properly.

Expected vs. Actual Performance

What We Should See (from Literature)

Based on recent 2024 research on BP decoders for surface codes:

What We Actually See

At p=0.03 (3% physical error):

Decoder                     LER      Precision   Recall   F1
Syndrome-parity baseline    50.6%    35.6%       50.8%    0.418
BP Decoder                  49.6%    36.1%       50.3%    0.421
Improvement                  2.0%    -           -        -

This is broken! At 3% physical error, we're getting ~50% logical error rate, which means the decoder is barely better than random guessing.

Root Cause Analysis

I've identified three fundamental architectural issues:

Issue 1: Wrong Factor Graph Structure

Current UAI Model (from dem_to_uai()):

# Lines 174-182 in dem.py
n_detectors = dem.num_detectors  # 24 for d=3
lines.append(str(n_detectors))    # Variables = DETECTORS
lines.append(str(len(errors)))    # Factors = ERROR MECHANISMS

This creates:

  • Variables = Detectors (binary: fired=1, not fired=0)
  • Factors = Error mechanisms (one factor per DEM error)
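
Concretely, plugging in the counts quoted elsewhere in this issue (24 detectors for d=3; 286 error mechanisms for d=3, r=3), the preamble this emits looks roughly like the following (annotated for illustration only; a real UAI file has no inline comments):

MARKOV
24            <- variables: one per detector
2 2 ... 2     <- 24 binary cardinalities
286           <- factors: one per DEM error mechanism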

The Problem:

  • This models P(detectors | errors), the forward model
  • For decoding, we need P(errors | syndrome), the inverse model
  • Setting detectors as "evidence" makes no sense—we already KNOW the detector states from the syndrome!

Issue 2: Inference on the Wrong Quantity

What the code does:

# examples/tanner_graph_walkthrough.py, line 438
evidence = {det_idx + 1: int(syndrome[det_idx]) for det_idx in range(len(syndrome))}
bp_ev = apply_evidence(bp, evidence)
state, info = belief_propagate(bp_ev, ...)
marginals = compute_marginals(state, bp_ev)  # Marginals over DETECTORS!

The Problem:

  • After setting evidence, BP computes marginals over detector variables
  • But we already know the detector values (from the syndrome)!
  • What we don't know is which errors occurred
  • The marginals over detectors are useless for decoding

Issue 3: Decoding Logic Outside BP

Current approach:

# Lines 457-499 in examples/tanner_graph_walkthrough.py  
# Build error likelihood scores based on syndrome
for det_idx in syndrome_active:
    errors_for_detector = np.where(H[det_idx, :] == 1)[0]
    error_likelihoods[errors_for_detector] *= 2.0  # Ad-hoc boost
    
# Apply parity heuristic
if syndrome_weight % 2 == 1:
    flip_likelihood_avg *= 3.0  # Ad-hoc multiplier

The Problem:

  • The "decoding" happens outside BP using ad-hoc heuristics
  • This completely bypasses the power of belief propagation!
  • The 2% improvement comes from these heuristics, not from BP
  • It's essentially a fancy wrapper around syndrome-parity decoding

What a Proper BP Decoder Should Do

A correct BP decoder for surface codes should:

1. Correct Factor Graph

Variables = Error mechanisms (binary: occurred=1, didn't occur=0)
Factors = Parity check constraints (one per detector)
Evidence = Observed syndrome s
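
A minimal sketch of a builder for that graph (hypothetical helper, not the current dem_to_uai(); it assumes H is the dense num_detectors × num_errors parity-check matrix, priors[j] = p_j, and it bakes the observed syndrome directly into the parity tables rather than setting separate evidence):

import itertools
import numpy as np

def errors_to_uai(H: np.ndarray, priors: np.ndarray, syndrome: np.ndarray) -> str:
    """Hypothetical sketch: UAI model with ERRORS as variables, detectors as factors."""
    n_det, n_err = H.shape
    supports = [np.flatnonzero(H[i]) for i in range(n_det)]
    lines = ["MARKOV", str(n_err)]                 # variables = error mechanisms
    lines.append(" ".join(["2"] * n_err))          # all binary
    lines.append(str(n_det + n_err))               # parity factors + unary priors
    for sup in supports:                           # scope of each parity factor
        lines.append(f"{len(sup)} " + " ".join(str(j) for j in sup))
    for j in range(n_err):                         # scope of each prior factor
        lines.append(f"1 {j}")
    for i, sup in enumerate(supports):             # parity tables: 1 iff XOR == s_i
        table = [float((sum(bits) % 2) == syndrome[i])
                 for bits in itertools.product((0, 1), repeat=len(sup))]
        lines.append(str(len(table)))
        lines.append(" ".join(str(v) for v in table))
    for j in range(n_err):                         # prior tables: [1 - p_j, p_j]
        lines.append("2")
        lines.append(f"{1.0 - priors[j]} {priors[j]}")
    return "\n".join(lines)

One practical caveat: the hard 0/1 entries in the parity tables can make vanilla BP numerically fragile on loopy graphs, so the zeros may need to be softened to a small ε.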

2. Proper Inference

  • Enforce syndrome constraints: s_i = ⊕_{j: H[i,j]=1} e_j (parity checks)
  • Run BP to compute marginal probabilities over errors
  • Decode by finding most likely error configuration
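
To make the inference step concrete, here is a self-contained sketch of plain sum-product BP in log-likelihood-ratio form, with the syndrome entering through the sign of the check-to-variable messages (standard textbook update rules; H, priors, and syndrome as above; nothing here depends on the current codebase):

import numpy as np

def bp_decode(H: np.ndarray, priors: np.ndarray, syndrome: np.ndarray,
              max_iters: int = 50) -> np.ndarray:
    """Sum-product BP in LLR form; returns a hard decision per error mechanism."""
    n_det, n_err = H.shape
    llr = np.log((1.0 - priors) / priors)              # prior LLR per error
    edges = [(int(i), int(j)) for i, j in zip(*np.nonzero(H))]
    v2c = {(i, j): llr[j] for i, j in edges}           # error j -> detector i
    c2v = {(i, j): 0.0 for i, j in edges}              # detector i -> error j

    e_hat = np.zeros(n_err, dtype=np.uint8)
    for _ in range(max_iters):
        # Check (detector) -> variable (error): tanh rule, sign set by syndrome bit.
        for i in range(n_det):
            nbrs = np.flatnonzero(H[i])
            for j in nbrs:
                prod = np.prod([np.tanh(v2c[(i, k)] / 2.0) for k in nbrs if k != j])
                prod = float(np.clip(prod, -0.999999, 0.999999))
                sign = -1.0 if syndrome[i] else 1.0
                c2v[(i, j)] = sign * 2.0 * np.arctanh(prod)
        # Variable (error) -> check (detector): prior plus other incoming messages.
        for j in range(n_err):
            nbrs = np.flatnonzero(H[:, j])
            for i in nbrs:
                v2c[(i, j)] = llr[j] + sum(c2v[(k, j)] for k in nbrs if k != i)
        # Posterior LLRs, hard decision, early exit once the syndrome is satisfied.
        post = llr + np.array([sum(c2v[(i, j)] for i in np.flatnonzero(H[:, j]))
                               for j in range(n_err)])
        e_hat = (post < 0).astype(np.uint8)
        if np.array_equal((H @ e_hat) % 2, syndrome):
            break
    return e_hat

Plain BP like this is known to struggle with the degeneracy and trapping sets of quantum codes, which is exactly what the variants listed under Option 3 try to address.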

3. Maximum Likelihood Decoding

error_config = argmax_e P(e | s)
             = argmax_e P(s | e) · P(e)    [Bayes' rule; P(s) is constant in e]
             = argmax_e [∏_i δ(s_i = ⊕_j H[i,j] e_j)] · [∏_j p_j^{e_j} (1-p_j)^{1-e_j}]
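
For small instances the same argmax can be computed exactly by enumeration, which makes a useful unit test for whatever BP implementation we land on (sketch, same assumed H/priors/syndrome as above):

import itertools
import numpy as np

def ml_decode_bruteforce(H: np.ndarray, priors: np.ndarray,
                         syndrome: np.ndarray) -> np.ndarray:
    """Exact argmax_e P(e | s) by enumeration; only feasible for small n_err."""
    n_err = H.shape[1]
    best_e, best_logp = None, -np.inf
    for bits in itertools.product((0, 1), repeat=n_err):
        e = np.array(bits, dtype=np.uint8)
        if not np.array_equal((H @ e) % 2, syndrome):
            continue                                   # violates a parity check
        logp = float(np.sum(np.where(e == 1, np.log(priors), np.log(1.0 - priors))))
        if logp > best_logp:
            best_e, best_logp = e, logp
    return best_e

Strictly speaking, a degenerate quantum code is decoded optimally by maximizing over logical equivalence classes rather than single error configurations, but configuration-level ML is the standard target for BP.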

Proposed Solutions

Option 1: Build Correct Factor Graph

Rewrite dem_to_uai() (along the lines of the sketch under "Correct Factor Graph" above) to create:

  • Variables: Error mechanisms (286 binary variables for d=3, r=3)
  • Factors: Parity constraints (24 factors, one per detector)
  • Each factor enforces: detector_i = XOR(errors affecting detector i)

Option 2: Use Existing Tools

Consider using proven implementations:

  • PyMatching (GitHub) - MWPM decoder, ~1% threshold
  • LDPC (bp_osd) - BP+OSD implementation
  • Stim built-in decoders for benchmarking
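
For example, a benchmarking harness along these lines should give us a trustworthy MWPM reference point (assumes stim and pymatching>=2 are installed; calls are per their public docs, worth double-checking against the versions we pin):

import stim
import pymatching

# Circuit-level noise memory experiment at the same p=0.03 used above.
circuit = stim.Circuit.generated(
    "surface_code:rotated_memory_z",
    distance=3,
    rounds=3,
    after_clifford_depolarization=0.03,
)
dem = circuit.detector_error_model(decompose_errors=True)
matching = pymatching.Matching.from_detector_error_model(dem)

sampler = circuit.compile_detector_sampler()
dets, obs = sampler.sample(10_000, separate_observables=True)
predicted = matching.decode_batch(dets)           # predicted observable flips
ler = (predicted != obs).any(axis=1).mean()       # per-shot logical error rate
print(f"PyMatching logical error rate at p=0.03: {ler:.4f}")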

Option 3: Implement Modern BP Variants

From the 2024 literature:

  • EWAInit-BP: Adaptive prior probability updates
  • MBP: Momentum-based belief propagation
  • BlockBP: Outperforms MWPM by >10x in some regimes
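
I won't reproduce the papers' exact update rules here, but a shared ingredient is stabilizing the message updates; schematically (my paraphrase, not any paper's exact rule, and gamma is a hypothetical blending weight):

import numpy as np

def damped_update(msg_prev: np.ndarray, msg_new: np.ndarray,
                  gamma: float = 0.7) -> np.ndarray:
    """Blend freshly computed messages with the previous iteration's values.

    Schematic only: this is generic message damping, not the exact EWAInit-BP
    or MBP update rules; it suppresses the oscillations BP is prone to on
    loopy quantum Tanner graphs.
    """
    return gamma * msg_prev + (1.0 - gamma) * msg_new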

Impact

This affects:

References

  1. Improved Belief Propagation Decoding on Surface Codes (2024)
  2. Quantum Error Correction Below the Surface Code Threshold
  3. Error Correction Zoo: Surface Code
  4. PyMatching: MWPM Decoder
  5. BP-OSD Decoder

Next Steps

I propose:

  1. Document the problem (this issue)
  2. Add warning to current tutorial about performance limitations
  3. Design correct factor graph structure
  4. Implement proper BP decoder
  5. Benchmark against PyMatching to validate

Should I proceed with designing the corrected architecture?


Labels: bug, enhancement, research, decoder
Priority: High - affects core functionality
