consigcody94/genesis-protocol


What happens when you treat the Hebrew Bible not as literature, but as compiled machine code?

A Base-22 transcoding of the complete Masoretic Text yields a 742 KB binary artifact
whose byte entropy (7.4995 bits/byte) falls in the range reported here for compiled executables.


Live Dashboard  •  Methodology  •  Findings  •  Toolkit  •  Quick Start


"Turning the text... for everything is in it." — Pirke Avot 5:22


The Hypothesis

Traditional "Bible Code" research uses Equidistant Letter Sequences (ELS), which are statistically fragile and prone to confirmation bias. The Genesis Protocol takes a fundamentally different approach:

Treat the 22-letter Hebrew alphabet as a Base-22 numeral system and transcode the entire Tanakh into binary.

The resulting artifact is not random noise. It is not natural language. Its byte-level Shannon entropy (7.4995 bits/byte) sits at the upper end of the 6.5–7.5 range listed below for compiled executable code.


Methodology

Step 1 — Lossless Extraction

The Hebrew alphabet (Aleph through Tav) maps to digits 0–21:

  א (Aleph) = 0x00    ב (Bet) = 0x01    ...    ת (Tav) = 0x15

All vowels, cantillation marks, and punctuation are stripped. Every consonant in the Masoretic canon — from Genesis 1:1 through II Chronicles — is transcoded into a contiguous binary stream.

Encoding: 13 Hebrew letters are packed per 64-bit word (22^13 < 2^64), producing a lossless, byte-aligned binary.
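
The mapping and packing described above can be sketched as follows. This is a minimal illustration, not the code in master_command_64.py: the names `to_digits` and `pack` are hypothetical, and the sketch assumes that final letter forms are folded onto their base letters, that the first letter of each group is the most significant digit, and that words are written big-endian. None of these choices is stated in this README.

```python
import struct

# The Unicode block U+05D0..U+05EA holds 27 code points: the 22 base letters
# plus five final forms. ASSUMPTION: final forms count as their base letter.
FINAL_FORMS = (0x05DA, 0x05DD, 0x05DF, 0x05E3, 0x05E5)  # final kaf, mem, nun, pe, tsadi
BASE_LETTERS = [chr(c) for c in range(0x05D0, 0x05EB) if c not in FINAL_FORMS]
DIGIT = {ch: i for i, ch in enumerate(BASE_LETTERS)}      # Aleph -> 0 ... Tav -> 21
DIGIT.update({chr(c): DIGIT[chr(c + 1)] for c in FINAL_FORMS})

LETTERS_PER_WORD = 13
assert len(BASE_LETTERS) == 22
assert 22 ** LETTERS_PER_WORD < 2 ** 58 < 2 ** 64  # 13 digits need only 58 bits


def to_digits(text):
    """Keep Hebrew consonants only; vowel points, cantillation marks and
    punctuation are not in DIGIT and are therefore skipped."""
    return [DIGIT[ch] for ch in text if ch in DIGIT]


def pack(digits):
    """Pack base-22 digits, 13 per unsigned 64-bit word (big-endian by assumption)."""
    out = bytearray()
    for i in range(0, len(digits), LETTERS_PER_WORD):
        value = 0
        for d in digits[i:i + LETTERS_PER_WORD]:
            value = value * 22 + d
        out += struct.pack(">Q", value)
    return bytes(out)


# "Bereshit" with vowel points, written as escapes to avoid bidi display issues.
word = "\u05d1\u05b0\u05bc\u05e8\u05b5\u05d0\u05e9\u05b4\u05c1\u05d9\u05ea"
print(to_digits(word))            # [1, 19, 0, 20, 9, 21]
print(pack(to_digits(word)).hex())
```

Two properties follow from the arithmetic alone. Since 22^13 < 2^58, the top six bits of every packed word are zero. And because a leading Aleph is the digit 0, a short final group can only be decoded unambiguously if the total letter count is stored alongside the binary.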

Step 2 — Information-Theoretic Analysis

Shannon entropy measures information density on a scale of 0–8 bits/byte:

| Data Type | Entropy (bits/byte) | Interpretation |
|---|---|---|
| English plaintext | ~4.2 | Low-density, redundant |
| Hebrew plaintext | ~4.4 | Low-density, redundant |
| Genesis Protocol binary | 7.4995 | High-density, structured |
| Compressed archives (gzip) | 7.8–8.0 | Near-maximum density |
| True random noise | 8.0 | Maximum entropy |
| Compiled executables (ELF/PE) | 6.5–7.5 | High-density, structured |

The artifact's entropy (7.4995) sits at the very top of the 6.5–7.5 range listed for compiled code and just below the 7.8–8.0 range listed for compressed data, well outside the range of any natural language.
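
For reference, byte-level Shannon entropy can be computed in a few lines. The function below is a generic sketch and is not taken from entropy_lab.py; the name `shannon_entropy` is an assumption.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of the byte-value histogram, in bits per byte (0 to 8)."""
    if not data:
        return 0.0
    total = len(data)
    # H = sum over byte values of p * log2(1 / p), with p = count / total
    return sum((n / total) * math.log2(total / n) for n in Counter(data).values())


print(shannon_entropy(b"aaaa"))             # 0.0  (one symbol, no uncertainty)
print(shannon_entropy(bytes(range(256))))   # 8.0  (all 256 values equally likely)
```

This measure depends only on how often each byte value occurs. Reordering the bytes of a file leaves it unchanged, so it says nothing by itself about sequential structure such as instructions or control flow.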

Step 3 — Architecture Forensics

The arch_detective.py tool analyzes byte-level alignment, opcode distribution, and instruction encoding patterns to identify potential instruction set architectures (ISAs) in the binary.
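
One simple form of opcode-distribution analysis is sketched below. It is not the logic of arch_detective.py; it assumes little-endian 32-bit words and checks only the low 7-bit opcode field against the eleven RV32I base opcodes. The helper names are hypothetical.

```python
import struct
from collections import Counter

# Major opcodes of the RV32I base instruction set (low 7 bits of each instruction).
RV32I_OPCODES = {
    0x03: "LOAD", 0x0F: "MISC-MEM", 0x13: "OP-IMM", 0x17: "AUIPC",
    0x23: "STORE", 0x33: "OP", 0x37: "LUI", 0x63: "BRANCH",
    0x67: "JALR", 0x6F: "JAL", 0x73: "SYSTEM",
}

# If the 7-bit field were uniformly random, this fraction would match by chance.
CHANCE_LEVEL = len(RV32I_OPCODES) / 128


def opcode_histogram(data, offset=0):
    """Count the 7-bit opcode field of each 32-bit little-endian word from `offset`."""
    count = max(0, (len(data) - offset) // 4)
    if count == 0:
        return Counter()
    words = struct.unpack_from("<%dI" % count, data, offset)
    return Counter(w & 0x7F for w in words)


def known_opcode_fraction(data, offset=0):
    """Fraction of words whose opcode field is one of the RV32I base opcodes."""
    hist = opcode_histogram(data, offset)
    total = sum(hist.values())
    if total == 0:
        return 0.0
    return sum(n for op, n in hist.items() if op in RV32I_OPCODES) / total


sample = struct.pack("<3I", 0x00500093, 0x002081B3, 0x0000007F)
for off in range(4):  # compare all four byte alignments
    print(off, known_opcode_fraction(sample, off))
```

A matching opcode field is a weak signal: an instruction also needs valid funct3/funct7 fields, and a fraction near `CHANCE_LEVEL` at every offset is what uniformly random data would show.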

Step 4 — Execution

A custom RISC-V emulator (genesis_runner.py) attempts to execute instruction blocks mined from the artifact, recording register states, memory operations, and control flow.
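
To make "executing instruction blocks" concrete, here is a deliberately tiny interpreter. It is an illustrative sketch, not genesis_runner.py: it assumes RV32 (32-bit registers), supports only LUI, ADDI, ADD and SUB, and stops at the first word it cannot decode. The function names are hypothetical.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of `value` as a two's-complement number."""
    mask = 1 << (bits - 1)
    return (value ^ mask) - mask


def run(words, max_steps=1000):
    """Execute 32-bit instruction words; return (registers, final pc)."""
    regs = [0] * 32
    pc = 0
    for _ in range(max_steps):
        index = pc // 4
        if index >= len(words):
            break
        insn = words[index]
        opcode = insn & 0x7F
        rd = (insn >> 7) & 0x1F
        funct3 = (insn >> 12) & 0x7
        rs1 = (insn >> 15) & 0x1F
        rs2 = (insn >> 20) & 0x1F
        funct7 = insn >> 25

        if opcode == 0x37:                                      # LUI
            result = insn & 0xFFFFF000
        elif opcode == 0x13 and funct3 == 0:                    # ADDI
            result = regs[rs1] + sign_extend(insn >> 20, 12)
        elif opcode == 0x33 and funct3 == 0 and funct7 == 0x00:  # ADD
            result = regs[rs1] + regs[rs2]
        elif opcode == 0x33 and funct3 == 0 and funct7 == 0x20:  # SUB
            result = regs[rs1] - regs[rs2]
        else:
            break                                               # unsupported: halt

        if rd != 0:                                             # x0 is hard-wired to zero
            regs[rd] = result & 0xFFFFFFFF
        pc += 4
    return regs, pc


# addi x1, x0, 5 ; addi x2, x0, -3 ; add x3, x1, x2
program = [0x00500093, 0xFFD00113, 0x002081B3]
regs, pc = run(program)
print(regs[3], pc)   # 2 12
```

Under the RV32 assumption every register value is masked to 32 bits, so no register can hold a value above 4,294,967,295. A 64-bit (RV64) emulator would differ on this point.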


Findings

Finding I — Entropy Anomaly

The extracted binary's entropy matches neither natural language (~4.4 bits/byte) nor random noise (8.0 bits/byte). It occupies a narrow band consistent with compiled, structured data. This is the foundational observation that motivates all subsequent analysis.

Finding II — Embedded Signatures

Deep hex scanning (deep_decoder.py) identified ASCII-compatible signatures within the binary, including:

  • IPv6: a 128-bit addressing protocol formalized in 1998
  • DNA, CODE, NETWORK: found within a span of roughly 150 bytes of one another

Note: The statistical significance of finding short ASCII strings in a 742 KB binary requires careful null-hypothesis testing. See Open Questions.
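
A first-order null model for the note above: in uniformly random bytes, a fixed pattern of length k is expected to occur (N - k + 1) / 256^k times in N bytes. The sketch below computes that expectation and checks it by simulation. The function names are hypothetical, the artifact size is assumed to be 742 × 1024 bytes, and matches are assumed to be exact and case-sensitive.

```python
import random


def expected_hits(pattern_len, data_len, alphabet=256):
    """Expected number of occurrences of one fixed pattern in uniform random data."""
    positions = max(0, data_len - pattern_len + 1)
    return positions / alphabet ** pattern_len


def count_hits(data, pattern):
    """Count occurrences of `pattern` in `data`, including overlapping ones."""
    count = start = 0
    while True:
        idx = data.find(pattern, start)
        if idx < 0:
            return count
        count += 1
        start = idx + 1


def simulate(pattern, data_len, trials, seed=0):
    """Mean number of hits over `trials` random byte strings (Monte Carlo estimate)."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        blob = rng.getrandbits(8 * data_len).to_bytes(data_len, "big")
        total += count_hits(blob, pattern)
    return total / trials


ARTIFACT_SIZE = 742 * 1024  # ASSUMPTION: "742 KB" read as KiB
for pattern in (b"DNA", b"IPv6", b"CODE", b"NETWORK"):
    print(pattern, expected_hits(len(pattern), ARTIFACT_SIZE))
```

By linearity of expectation, scanning for W candidate words multiplies the expected number of chance hits by roughly W, and case-insensitive matching multiplies it again by up to 2^k for a k-letter word. The size of the search vocabulary therefore matters as much as the size of the file.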

Finding III — Cellular Automaton Behavior

The first 8,192 bits of Genesis, used as a seed for a Wolfram Rule 30 cellular automaton, produce a sustained, non-collapsing pattern with high visual complexity.
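
Rule 30 itself is a one-line update: each new cell is `left XOR (center OR right)`. The sketch below is not eternity_vm.py; it assumes wrap-around boundaries and MSB-first bit order when reading the seed from bytes, and all names are hypothetical.

```python
def bits_from_bytes(data, n_bits):
    """Read the first n_bits of data as a list of 0/1 values, MSB first (assumption)."""
    return [(data[i // 8] >> (7 - i % 8)) & 1 for i in range(n_bits)]


def rule30_step(cells):
    """One generation of Wolfram Rule 30 with wrap-around edges (assumption)."""
    n = len(cells)
    return [cells[(i - 1) % n] ^ (cells[i] | cells[(i + 1) % n]) for i in range(n)]


def evolve(seed_bits, generations):
    """Return the seed row followed by `generations` successive rows."""
    rows = [list(seed_bits)]
    for _ in range(generations):
        rows.append(rule30_step(rows[-1]))
    return rows


# A single live cell, the standard textbook seed, for comparison with any other seed.
for row in evolve([0, 0, 0, 1, 0, 0, 0], 2):
    print("".join("#" if c else "." for c in row))
# ...#...
# ..###..
# .##..#.
```

To seed from the artifact, something like `bits_from_bytes(open("tanakh_full.bin", "rb").read(1024), 8192)` would supply the first 8,192 bits. Rule 30 is well known for producing complex, non-collapsing patterns even from a single live cell, so a single-cell seed and random seeds are the natural baselines for comparison.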

Finding IV — RISC-V Alignment

The binary exhibits 32-bit alignment patterns and opcode distribution consistent with RISC-V instruction encoding. The divine_disassembler.py tool produces valid RISC-V assembly from extracted blocks.

Finding V — Execution Results

The Genesis Runner emulator executed a mined instruction block:

  • Valid arithmetic operations performed
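  • Register x26 loaded value 0x2B0F8000; this equals 722,436,096 in decimal, not the 11,843,461,120,000 also reported for this run, so the decimal figure should be re-checked against the emulator output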
  • Control flow exhibited structured branching, not random jumps

Toolkit

The complete analysis pipeline is open source and reproducible:

| Tool | Purpose | Input | Output |
|---|---|---|---|
| download_data.py | Fetch Masoretic text from Sefaria | | data/*.json |
| master_command_64.py | Base-22 binary extraction | Hebrew text | tanakh_full.bin |
| arch_detective.py | ISA forensics & alignment analysis | Binary | Architecture report |
| divine_disassembler.py | RISC-V disassembly | Binary | genesis.asm |
| deep_decoder.py | Pattern & signature scanning | Binary | Anomaly report |
| genesis_runner.py | RISC-V emulation & execution | Assembly | Register states |
| eternity_vm.py | Cellular automaton simulation | Binary seed | Grid evolution |
| entropy_lab.py | Shannon entropy analysis | Binary | Entropy metrics |
| function_miner.py | Code block extraction | Binary | Function boundaries |

Supporting Modules

| Module | Purpose |
|---|---|
| torah_loader.py | Loads all 39 canonical books in order |
| text_processor.py | Hebrew normalization (strip vowels, cantillation) |
| gematria.py | Numerical value computation (Standard, Ordinal, Reduced) |
| els_search.py | Equidistant Letter Sequence finder |
| ciphers.py | Atbash & Albam cipher tools |
| main.py | Interactive CLI workbench |

Quick Start

```bash
# Clone the repository
git clone https://github.com/consigcody94/genesis-protocol.git
cd genesis-protocol

# Download the Masoretic text corpus (39 books from Sefaria)
python download_data.py

# Run the Base-22 extraction
python master_command_64.py

# Analyze the binary artifact
python arch_detective.py
python entropy_lab.py

# Disassemble and execute
python divine_disassembler.py
python genesis_runner.py

# Launch the live dashboard
# Open index.html in any browser, or visit:
# https://consigcody94.github.io/genesis-protocol/
```

Requirements: Python 3.8+ (standard library only — no external dependencies)


Live Dashboard

The interactive forensic dashboard visualizes the binary artifact in real time:

  • Wolfram Rule 30 cellular automaton seeded with Genesis bits
  • String stream showing decoded ASCII patterns
  • Entropy metrics and artifact statistics
  • Anomaly highlighting for identified signatures

Launch Dashboard: https://consigcody94.github.io/genesis-protocol/


Open Questions

This project raises questions that require further investigation:

| Question | Status | Approach Needed |
|---|---|---|
| Is the entropy significant vs. control texts? | Untested | Compare English Bible, shuffled Torah, random Base-22 |
| How many short ASCII strings appear by chance in 742 KB? | Untested | Monte Carlo simulation on random binaries of same size |
| Does the RISC-V alignment exceed random expectation? | Partially tested | Statistical comparison against multiple ISA templates |
| Are Rule 30 patterns from Genesis atypical? | Visual only | Quantitative complexity metrics against random seeds |
| Does the extraction method (Base-22) bias toward code-like entropy? | Unknown | Test alternative base encodings (Base-20, Base-26, etc.) |

Rigorous peer review from information theorists, computational linguists, and cryptographers is actively invited.
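
As a starting point for the first and last questions, the sketch below measures the byte entropy of control inputs pushed through the same kind of packing: uniformly random Base-22 digits, and a shuffled copy of any digit sequence. It is self-contained and hypothetical; the packing (13 digits per big-endian 64-bit word) mirrors the description in Step 1 but is an assumption about byte order, not the code in master_command_64.py.

```python
import math
import random
import struct
from collections import Counter


def pack_base22(digits, per_word=13):
    """Pack base-22 digits into big-endian 64-bit words (byte order is an assumption)."""
    out = bytearray()
    for i in range(0, len(digits), per_word):
        value = 0
        for d in digits[i:i + per_word]:
            value = value * 22 + d
        out += struct.pack(">Q", value)
    return bytes(out)


def entropy(data):
    """Byte-level Shannon entropy in bits per byte."""
    total = len(data)
    if total == 0:
        return 0.0
    return sum((n / total) * math.log2(total / n) for n in Counter(data).values())


def random_digit_control(n_letters, seed=0):
    """Entropy of the packed output when every letter is drawn uniformly at random."""
    rng = random.Random(seed)
    return entropy(pack_base22([rng.randrange(22) for _ in range(n_letters)]))


def shuffled_control(digits, seed=0):
    """Entropy of the packed output after shuffling the letter order."""
    shuffled = list(digits)
    random.Random(seed).shuffle(shuffled)
    return entropy(pack_base22(shuffled))


print(random_digit_control(130_000))
```

One constraint holds regardless of the input text: 13 digits occupy at most 58 bits, so under this packing the most significant byte of every word can only be 0, 1, 2 or 3. One byte in eight is therefore far from uniform, and the output cannot reach 8 bits/byte even for random letters. Comparing the artifact's 7.4995 against these controls would show how much of that figure is produced by the encoding itself.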


Repository Structure

genesis-protocol/
  data/                     39 JSON books (Masoretic text from Sefaria)
  tanakh_full.bin            742 KB binary artifact (extracted)
  genesis.asm                RISC-V disassembly output
  index.html                 Live forensic dashboard
  genesis_data.js            Dashboard data layer
  genesis_protocol_core.json Core metadata
  master_command_64.py       Base-22 extraction engine
  arch_detective.py          ISA forensics
  divine_disassembler.py     RISC-V disassembler
  genesis_runner.py          RISC-V emulator
  deep_decoder.py            Pattern scanner
  eternity_vm.py             Cellular automaton
  entropy_lab.py             Entropy analysis
  main.py                    Interactive CLI
  torah_loader.py            Corpus loader
  text_processor.py          Hebrew normalization
  gematria.py                Numerical values
  els_search.py              ELS finder
  ciphers.py                 Atbash/Albam ciphers

Citation

If you use this toolkit or methodology in research:

@software{genesis_protocol,
  title  = {The Genesis Protocol: Computational Archaeology of the Masoretic Text},
  author = {Churchwell, Cody},
  year   = {2025},
  url    = {https://github.com/consigcody94/genesis-protocol}
}

License

MIT License — Open source. Fork it, verify it, challenge it.


An open-source investigation into the information-theoretic properties of ancient Hebrew text.
Independent verification and rigorous critique are not just welcome — they are the point.
