obsgolem · jaycrick · Jan 16, 2026
diff --git a/.gitignore b/.gitignore
@@ -1,2 +1,3 @@
 generated_markdown/**
+generated_indices/**
 *.pyc
diff --git a/Readme.md b/Readme.md
@@ -6,6 +6,7 @@ These were created by using OCR to extract the text from the book, then manually
 The following tasks are things I would consider useful for others, and would love help with.
 * [x] Convert word lists into a machine friendly format, probably JSON.
 * [x] Apply unicode normalization to NFD to both markdown and JSON formats.
+* [x] Create a master word index showing first occurrence by chapter.
 * [ ] Add line number and word index information for the location of the word in the book.
 * [ ] Macronize vocab list.
 
@@ -14,5 +15,38 @@ To use the lists effectively, I recommend finding a tool that lets you perform d
 
 At the top of every file is the page numbers for the exercises of that chapter.
 
+## Scripts
+
+### create_word_index.py
+Creates master alphabetical indexes of all vocabulary words across all chapters, showing the chapter number where each word first appears. Sorting is case-insensitive with proper Unicode normalization.
+
+**Usage:**
+```bash
+# Create regular alphabetical index (all formats: JSON, Markdown, HTML)
+python3 create_word_index.py
+
+# Include both regular and sectioned indexes
+python3 create_word_index.py --include-sectioned
+
+# Create only the sectioned index (organized by grammatical sections)
+python3 create_word_index.py --sectioned-only
+
+# HTML only with custom page density
+python3 create_word_index.py --format html --entries-per-page 90
+
+# All formats with custom output filenames
+python3 create_word_index.py --include-sectioned \
+  --json-output my_index.json \
+  --section-json-output my_index_by_section.json
+```
+
+**Output:**
+- `word_index.json` — Flat alphabetical index
+- `word_index.md` — Flat alphabetical markdown table
+- `word_index.html` — Print-optimized HTML with 3-column pagination (102 entries/page)
+- `word_index_by_section.json` — Index organized by grammatical sections (14 total)
+- `word_index_by_section.md` — Sectioned markdown with sections as headers
+- `word_index_by_section.html` — Sectioned HTML with responsive column layout (adapts to browser width, up to 4 columns), table of contents with anchor links, and adaptive font sizing for readability
+
 # Copyright & License
 The copyright of the word lists remain with the original authors, and if they dislike my public reproduction of their lists then I am fully willing to take this repo down. All code and other novel material in this repository is licensed under the terms of the MIT license.
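
For reviewers: the Readme hunk above describes `create_word_index.py` as building an alphabetical index of each word's first chapter, sorted case-insensitively with Unicode normalization. The script itself is not part of this diff, so the following is only a minimal sketch of that idea under stated assumptions: the function name `build_first_occurrence_index`, the input shape (chapter number mapped to a list of words), and the choice of NFD plus `str.casefold()` as the comparison key are all hypothetical and may differ from what the script actually does.

```python
import unicodedata


def build_first_occurrence_index(chapters: dict[int, list[str]]) -> list[tuple[str, int]]:
    """Return (word, first_chapter) pairs in case-insensitive alphabetical order.

    Hypothetical helper: the real script's data model is not shown in the diff.
    """
    first_seen: dict[str, tuple[str, int]] = {}
    # Walk chapters in ascending order so the first hit is the earliest chapter.
    for chapter in sorted(chapters):
        for word in chapters[chapter]:
            # Assumption: NFD, matching the normalization the Readme mentions
            # for the markdown and JSON word lists.
            normalized = unicodedata.normalize("NFD", word)
            # casefold() gives a case-insensitive comparison key.
            key = normalized.casefold()
            if key not in first_seen:
                first_seen[key] = (normalized, chapter)
    # Sort by the folded key; keep the spelling from the first occurrence.
    return [first_seen[key] for key in sorted(first_seen)]


if __name__ == "__main__":
    # Made-up sample data, not taken from the repository's word lists.
    sample = {2: ["alpha", "gamma"], 1: ["beta", "Alpha"]}
    for word, chapter in build_first_occurrence_index(sample):
        print(f"{word}\t{chapter}")
```

Sorting on code points after folding is a simplification; a locale-aware collation could order accented words differently.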