johnzfitch/README.md

definitelynot.ai  Internet Universe  Email

John Zachary Fitch

Philosophy

Software engineer with mathematics background specializing in systems programming, security research, and AI/ML applications.

Portal badge (no tracking)

Education

UC Berkeley Mathematics


OpenAI Codex: Finding the Ghost in the Machine

TL;DR: Solved a pre-main() environment stripping bug causing 11-300x GPU slowdowns that eluded OpenAI's debugging team for months.

Proof: Issue #8945 | PR #8951 | Release notes (rust-v0.80.0)

Full Investigation Details

The Ghost

In October 2025, OpenAI assembled a specialized debugging team to investigate mysterious slowdowns affecting Codex, the coding tool OpenAI uses to write its own code. After a week of intensive investigation: nothing.

The bug was literally a ghost - pre_main_hardening() executed before main(), stripped critical environment variables (LD_LIBRARY_PATH, DYLD_LIBRARY_PATH), and disappeared without a trace. Standard profilers saw nothing. Users saw variables in their shell, but inside codex exec they vanished.
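The symptom is easy to reproduce in miniature. Here is a minimal sketch (Python, not the actual Rust code path) of how a sanitizing launcher makes a variable visible in the parent shell but absent inside the child process:

```python
import os
import subprocess
import sys

# Parent process: the variable is present, just as users saw in their shell.
os.environ["LD_LIBRARY_PATH"] = "/opt/cuda/lib64"

# A sanitizing launcher (standing in for pre_main_hardening) drops the
# variable before spawning the child -- no warning, no error, no log line.
child_env = {k: v for k, v in os.environ.items()
             if k not in ("LD_LIBRARY_PATH", "DYLD_LIBRARY_PATH")}

out = subprocess.run(
    [sys.executable, "-c",
     "import os; print(os.environ.get('LD_LIBRARY_PATH'))"],
    env=child_env, capture_output=True, text=True,
)
print(out.stdout.strip())  # -> None: the child never sees the variable
```

The parent's environment is untouched, which is exactly why the shell "looked fine" to every affected user.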


The Hunt

Within 3 days of their announcement, I identified the problematic commit PR #4521 and contacted @tibo_openai.

But identification is not proof. I spent 2 months building an undeniable case.

Timeline

  • Sept 30, 2025 - PR #4521 merges, enabling pre_main_hardening() in release builds
  • Oct 1, 2025 - rust-v0.43.0 ships (first affected release)
  • Oct 6, 2025 - First "painfully slow" regression reports
  • Oct 1-29, 2025 - Spike in env/PATH inheritance issues across platforms
  • Oct 29, 2025 - Emergency PATH fix lands (did not catch root cause)
  • Late Oct 2025 - OpenAI's specialized team investigates, declares no root cause, and attributes the issue to changed user behavior
  • Jan 9, 2026 - My fix merged, credited in release notes

Evidence Collected

| Platform | Issues | Failure Mode |
| --- | --- | --- |
| macOS | #6012, #5679, #5339, #6243, #6218 | DYLD_* stripping breaking dynamic linking |
| Linux/WSL2 | #4843, #3891, #6200, #5837, #6263 | LD_LIBRARY_PATH stripping -> silent CUDA/MKL degradation |

Compiled evidence packages:

  • Platform-specific failure modes and reproduction steps

  • Quantifiable performance regressions (11-300x) with benchmarks

  • Pattern analysis across 15+ scattered user reports over 3 months

  • Process environment inheritance trace through fork/exec boundaries

  • Comprehensive Technical Analysis

  • Investigation Methodology


Why Conventional Debugging Failed

The bug was designed to be invisible:

  • Pre-main execution - Used #[ctor::ctor] to run before main(), before any logging/instrumentation
  • Silent stripping - No warnings, no errors, just missing environment variables
  • Distributed symptoms - Appeared as unrelated issues across different platforms/configs
  • User attribution - Everyone assumed they misconfigured something (shell looked fine)
  • Wrong search space - Team was debugging post-main application code

Standard debugging tools cannot see pre-main execution. Profilers start at main(). Log hooks are not initialized yet. The code executes, modifies the environment, and vanishes.
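One way to make that invisible step visible is to diff the environment a child actually receives against the parent's. A minimal diagnostic sketch (my illustration in Python, not the actual tooling used in the investigation), with the "hardened" filter simulated inline:

```python
import os
import subprocess
import sys

def env_seen_by(cmd_env=None):
    """Spawn a child and return the set of env var names it actually sees."""
    out = subprocess.run(
        [sys.executable, "-c", "import os; print('\\n'.join(os.environ))"],
        env=cmd_env, capture_output=True, text=True,
    )
    return set(out.stdout.split())

os.environ["LD_LIBRARY_PATH"] = "/usr/local/cuda/lib64"
parent = set(os.environ)

# Simulated hardened launcher: everything except the library-path variables.
hardened = {k: v for k, v in os.environ.items()
            if not k.endswith("LIBRARY_PATH")}

stripped = parent - env_seen_by(hardened)
print(sorted(stripped))  # names silently lost across the fork/exec boundary
```

Pointing a wrapper like this at the real binary is what turns "my shell looks fine" into a concrete list of variables lost between fork and exec.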


The Impact

OpenAI confirmed and merged the fix within 24 hours, explicitly crediting the investigation in v0.80.0 release notes:

"Codex CLI subprocesses again inherit env vars like LD_LIBRARY_PATH/DYLD_LIBRARY_PATH to avoid runtime issues. As explained in #8945, failure to pass along these environment variables to subprocesses that expect them (notably GPU-related ones), was causing 10x+ performance regressions! Special thanks to @johnzfitch for the detailed investigation and write-up in #8945."

Restored:

  • GPU acceleration for internal ML/AI dev teams
  • CUDA/PyTorch functionality for ML researchers
  • MKL/NumPy performance for scientific computing users
  • Conda environment compatibility
  • Enterprise database driver support

When the tools are blind, the system lies, and everyone else has stopped looking: that is the type of problem I love solving.


Selected Work

  • Observatory - WebGPU deepfake detection running 4 ML models in browser (live demo)
  • specHO - LLM watermark detection via phonetic/semantic analysis (The Echo Rule)
  • filearchy - COSMIC Files fork with sub-10ms trigram search (Rust)
  • nautilus-plus - Enhanced GNOME Files with sub-ms search (AUR)
  • indepacer - PACER CLI for federal court research (PyPI: pacersdk)

Self-hosting bare metal infrastructure (NixOS) with post-quantum cryptography, authoritative DNS, and containerized services.


Featured

Observatory - WebGPU Deepfake Detection

Live Demo: look.definitelynot.ai

Browser-based AI image detection running 4 specialized ML models (ViT, Swin Transformer) through WebGPU. Zero server-side processing; all inference happens client-side with 672MB of ONNX models.

| Model | Accuracy | Architecture |
| --- | --- | --- |
| dima806_ai_real | 98.2% | Vision Transformer |
| SMOGY | 98.2% | Swin Transformer |
| Deep-Fake-Detector-v2 | 92.1% | ViT-Base |
| umm_maybe | 94.2% | Vision Transformer |

Stack: JavaScript (ES6), Transformers.js, ONNX, WebGPU/WASM
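The four per-model scores have to be combined into one verdict. Observatory's actual aggregation isn't shown here; this is a plausible sketch (plain Python, with accuracy-weighted averaging as an assumption) of the idea:

```python
# Hypothetical accuracy-weighted ensemble over the four detectors above.
# Scores are each model's P(AI-generated) for one image; weights are the
# published accuracies, so stronger models count for more.
MODELS = {
    "dima806_ai_real": 0.982,
    "SMOGY": 0.982,
    "Deep-Fake-Detector-v2": 0.921,
    "umm_maybe": 0.942,
}

def ensemble(scores: dict[str, float]) -> float:
    """Weighted mean of per-model P(AI) scores, weighted by model accuracy."""
    total = sum(MODELS.values())
    return sum(MODELS[name] * p for name, p in scores.items()) / total

verdict = ensemble({"dima806_ai_real": 0.91, "SMOGY": 0.87,
                    "Deep-Fake-Detector-v2": 0.66, "umm_maybe": 0.78})
print(round(verdict, 3))  # -> 0.808
```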


iconics - Semantic Icon Library

3,372+ PNG icons with semantic CLI discovery. Find the right icon by meaning, not filename.

```
icon suggest security       # -> lock, shield, key, firewall...
icon suggest data           # -> chart, database, folder...
icon use lock shield        # Export to ./icons/
```

Features: Fuzzy search, theme variants, batch export, markdown integration
Stack: Python, FuzzyWuzzy, PIL
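A toy version of `icon suggest` using only the standard library (difflib stands in for FuzzyWuzzy, and the tag table is invented for illustration; the real library maps 3,372+ icons to meaning tags):

```python
import difflib

# Invented sample of the semantic tag index.
TAGS = {
    "lock": ["security", "privacy", "password"],
    "shield": ["security", "defense", "antivirus"],
    "key": ["security", "access", "password"],
    "chart": ["data", "analytics", "statistics"],
    "database": ["data", "storage", "sql"],
}

def suggest(query: str, cutoff: float = 0.8) -> list[str]:
    """Return icon names whose tags fuzzily match the query."""
    hits = []
    for icon, tags in TAGS.items():
        if difflib.get_close_matches(query, tags, n=1, cutoff=cutoff):
            hits.append(icon)
    return hits

print(suggest("security"))   # -> ['lock', 'shield', 'key']
print(suggest("data"))       # -> ['chart', 'database']
```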


filearchy + triglyph - Sub-10ms File Search

COSMIC Files fork with embedded trigram search engine. Memory-mapped indices achieve sub-10ms searches across 2.15M+ files with near-zero resident memory.

```
filearchy/
|-- triglyph/      # Trigram library (mmap)
`-- triglyphd/     # D-Bus daemon for system-wide search
```

Performance: 2.15M files indexed, sub-10ms query time, 156MB index on disk
Stack: Rust, libcosmic, memmap2, zbus
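The trigram idea in miniature (pure Python; triglyph itself uses memory-mapped Rust indices, so this only illustrates the lookup strategy, not the storage layout):

```python
def trigrams(s: str) -> set[str]:
    """All 3-character windows of a lowercased string."""
    s = s.lower()
    return {s[i:i + 3] for i in range(len(s) - 2)}

def build_index(paths: list[str]) -> dict[str, set[int]]:
    """Map each trigram to the set of file ids containing it."""
    index: dict[str, set[int]] = {}
    for fid, path in enumerate(paths):
        for t in trigrams(path):
            index.setdefault(t, set()).add(fid)
    return index

def search(index, paths, query: str) -> list[str]:
    """Intersect posting sets for the query's trigrams, then verify."""
    tris = trigrams(query)
    if not tris:
        return paths  # queries under 3 chars need a different strategy
    candidates = set.intersection(*(index.get(t, set()) for t in tris))
    return [paths[i] for i in sorted(candidates)
            if query.lower() in paths[i].lower()]

files = ["src/main.rs", "src/mainframe.c", "docs/README.md"]
idx = build_index(files)
print(search(idx, files, "main"))  # -> ['src/main.rs', 'src/mainframe.c']
```

Intersecting small posting sets keeps query cost proportional to the rarest trigram rather than the file count, which is what makes millisecond-scale search over millions of paths plausible.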


The Echo Rule - LLM Detection Methodology

LLMs echo their training data. That echo is detectable through pattern recognition:

| Signature | Detection Method |
| --- | --- |
| Phonetic | CMU phoneme analysis, Levenshtein distance |
| Structural | POS tag patterns, sentence construction |
| Semantic | Word2Vec cosine similarity, hedging clusters |

Implemented in specHO with 98.6% preprocessor test pass rate. Live demo at definitelynot.ai.
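Two of the signature measures above are standard and easy to sketch in plain Python (stand-ins for the actual specHO implementation; the example inputs are arbitrary):

```python
import math

def levenshtein(a: str, b: str) -> int:
    """Edit distance, used here as a proxy for phonetic closeness."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]

def cosine(u: list[float], v: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    return dot / (math.sqrt(sum(x * x for x in u)) *
                  math.sqrt(sum(y * y for y in v)))

print(levenshtein("delve", "dwell"))           # -> 3
print(round(cosine([1, 2, 3], [2, 4, 6]), 3))  # -> 1.0
```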


Skills


Core: Rust | Python | TypeScript | C | Nix | Shell


Project Dashboard


Security Research

  • eero (private) - Mesh WiFi router security analysis
  • blizzarchy (private) - OAuth analysis and telemetry RE
  • featherarchy - Security-hardened Monero wallet fork
  • alienware-monitor (private) - Firmware RE
  • proxyforge (private) - Transparent MITM proxy

Infrastructure

  • NixOS Server (private) - Post-quantum SSH, Rosenpass VPN, authoritative DNS
  • unbound-config (private) - Recursive DNS with DNSSEC and ad blocking

Infrastructure

Primary server: Dedicated bare-metal NixOS host (details available on request)

  • Security: Post-quantum SSH, Rosenpass VPN, nftables firewall
  • DNS: Unbound resolver with DNSSEC, ad/tracker blocking
  • Services: FreshRSS, Caddy (HTTPS/HTTP3), cPanel/WHM, Podman containers
  • Network: Local 10Gbps, authoritative BIND9 with RFC2136 ACME

SF Bay Area | Open to remote | Icons from iconics

Pinned

  1. specHO - An LLM watermark (pattern recognition) suite (Python)
  2. definitelynot.ai - AI Text Linter (JavaScript)
  3. iconics - A semantic icon library with intelligent tagging and discovery (Python)
  4. arch-dependency-matrices - Mathematical analysis of 1,553 Arch Linux package dependencies using graph theory, spectral analysis, and linear algebra (Python)
  5. NetworkBatcher - Energy-efficient network request batching for iOS 26+ (Swift)
  6. claude-desktop-arch - Enable Claude Code preview in Claude Desktop on Arch Linux (Shell)