Merged
File renamed without changes.
100 changes: 81 additions & 19 deletions CLAUDE.md
@@ -1,12 +1,73 @@
# SpecHO Development Guide
# <img src="icons/compass.png" width="32" height="32"> SpecHO Development Guide

**Version:** 2.0
**Type:** Machine-readable project specification
**Target:** Claude Code AI assistant
**Version:** 3.0
**Type:** Machine-readable project specification
**Target:** Claude Code AI assistant
**Purpose:** Watermark detection system for AI-generated text

---

## DOCUMENTATION PROTOCOL (MANDATORY)

### Start Claude from Project Directory
```bash
cd ~/dev/specHO && claude
```
Verify with `/memory` - must show both global and project CLAUDE.md.

### Active Documents (ONLY THESE)

| Document | Purpose | Update When |
|----------|---------|-------------|
| docs/TASKS.md | Task specifications | Tasks added/changed |
| docs/SPECS.md | Tier specifications | Specs refined |
| docs/IMPLEMENTATION.md | Learnings, gotchas | After each session |
| docs/DEPLOYMENT.md | Operations | Infra changes |
| docs/STATUS.md | Current state, AI context | After each session |

### Content Routing

```
NEW CONTENT
├─ Session work in progress? → working/session-YYYY-MM-DD.md
├─ Task spec/configuration? → docs/TASKS.md or docs/SPECS.md
├─ Learning/gotcha/validation? → docs/IMPLEMENTATION.md
├─ Deployment/operations? → docs/DEPLOYMENT.md
├─ Current status/next steps? → docs/STATUS.md
└─ Theory/design? → architecture.md
```

### Session Protocol

**START**:
1. Read docs/STATUS.md for current state
2. Create `working/session-YYYY-MM-DD.md`

**DURING**: Log work, note insights in session file

**END**:
1. Extract insights → append to docs/IMPLEMENTATION.md
2. Update docs/STATUS.md with new state
3. Move session file to `docs/archive/sessions/`

### Anti-Patterns (DO NOT)

- Create new top-level .md files without explicit approval
- Create CONTEXT_*, HANDOFF_*, summary* files
- Leave working/ files after session ends
- Use file-based references (use `[DOC.md#section]` anchors)

### Reference Format

```markdown
✅ See [IMPLEMENTATION.md#preprocessor-gotchas]
❌ See session2.md
```
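A reference in this format can be checked mechanically. The sketch below is illustrative only: the pattern and the name `is_anchor_reference` are hypothetical, and it assumes section anchors are lowercase, hyphen-separated slugs as in the example above.

```python
import re

# Matches "[SOME_DOC.md#section-slug]"; the slug shape is an assumption.
ANCHOR_REF = re.compile(r"\[([A-Za-z0-9_./-]+\.md)#([a-z0-9][a-z0-9-]*)\]")


def is_anchor_reference(text: str) -> bool:
    """Return True if the text contains at least one [DOC.md#section] reference."""
    return ANCHOR_REF.search(text) is not None
```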

---

---

## PROJECT_METADATA

```yaml
@@ -26,17 +87,17 @@ architecture: Five-component sequential pipeline
```yaml
navigation_files:
setup:
- file: docs/QUICKSTART.md
purpose: Initial environment setup and Task 1.1 implementation
read_when: First session
- file: docs/archive/legacy/QUICKSTART.md
purpose: Initial environment setup and Task 1.1 implementation (archived)
read_when: First session or environment setup reference

- file: architecture.md
purpose: Original watermark design specification
read_when: Need context on Echo Rule algorithm
- file: summary.md
purpose: All of the work we have done so far, summarized. You are returning from a /clear command and must catch yourself back up to where we are in our project.
read_when: Returning from a /clear command

- file: docs/archive/legacy/summary.md
purpose: Historical project summary (archived)
read_when: Need context on early development decisions

implementation:
- file: docs/TASKS.md
@@ -182,11 +243,12 @@ SpecHO/
│ ├── models/
│ └── corpus/
└── docs/
├── QUICKSTART.md
├── TASKS.md
├── SPECS.md
├── DEPLOYMENT.md
└── PHILOSOPHY.md
├── PHILOSOPHY.md
└── archive/ # Historical docs
└── legacy/QUICKSTART.md
```

---
@@ -338,14 +400,14 @@ optional_tier_2:

When user starts first Claude Code session:

1. Read docs/QUICKSTART.md for environment setup
2. Implement Task 1.1 (SpecHO/models.py)
3. Create all 5 dataclasses with type hints and docstrings
1. Read docs/STATUS.md for current project state
2. Read docs/TASKS.md for task specifications
3. For environment setup reference, see docs/archive/legacy/QUICKSTART.md
4. Use Python 3.11+ features
5. No processing logic - data structures only
5. Follow DOCUMENTATION PROTOCOL in CLAUDE.md

Example first prompt to expect:
"Read QUICKSTART.md and help me implement Task 1.1: Create Core Data Models"
"Read STATUS.md and help me continue from where we left off"

---

49 changes: 25 additions & 24 deletions README.md
@@ -1,8 +1,8 @@
# SpecHO - Echo Rule Watermark Detector
# <img src="icons/sound.png" width="32" height="32"> SpecHO - Echo Rule Watermark Detector

Watermark detection system for AI-generated text using phonetic, structural, and semantic echo analysis.

## Quick Start
## <img src="icons/rocket.png" width="24" height="24"> Quick Start

```bash
# Clone and setup
@@ -17,35 +17,36 @@ python -m spacy download en_core_web_sm
python scripts/cli.py --file sample.txt
```

## Project Status
## <img src="icons/bar-chart.png" width="24" height="24"> Project Status

**Current:** Tier 1 (MVP) - In Development
**Tasks Complete:** 0/32
**Test Coverage:** 0%
**Current:** Tier 1 (MVP) - Clause Identifier in progress
**Completed:** Preprocessor (98.6% tests passing)
**See:** [docs/STATUS.md](docs/STATUS.md) for current state

## Architecture
## <img src="icons/blueprint.png" width="24" height="24"> Architecture

Five-component sequential pipeline:
1. **Linguistic Preprocessor** - Tokenization, POS tagging, phonetic transcription
2. **Clause Pair Identifier** - Find related clause pairs using rule-based system
3. **Echo Analysis Engine** - Phonetic, structural, and semantic similarity scoring
4. **Scoring & Aggregation** - Weighted combination of echo scores
5. **Statistical Validator** - Z-score comparison against human baseline
1. **Preprocessor** - Tokenization, POS tagging, phonetic transcription
2. **Clause Identifier** - Find related clause pairs (in progress)
3. **Echo Engine** - Phonetic, structural, semantic similarity
4. **Scoring** - Weighted combination of echo scores
5. **Validator** - Z-score comparison against baseline
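The final validation step compares a document's aggregated echo score with a baseline via a z-score. A minimal sketch of that calculation, assuming the baseline is available as a list of scores and using a hypothetical function name (this is not the project's actual validator):

```python
from statistics import mean, stdev


def z_score(document_score: float, baseline_scores: list[float]) -> float:
    """Number of sample standard deviations the score lies above the baseline mean."""
    mu = mean(baseline_scores)
    sigma = stdev(baseline_scores)  # needs at least two baseline scores
    if sigma == 0:
        raise ValueError("baseline scores have zero variance")
    return (document_score - mu) / sigma
```

A larger positive value means the document's echo score is further above what the baseline typically produces.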

## Development
## <img src="icons/document.png" width="24" height="24"> Documentation

Using three-tier approach:
- **Tier 1 (Weeks 1-12):** MVP with simple algorithms
- **Tier 2 (Weeks 13-17):** Production hardening
- **Tier 3 (Week 18+):** Research features
| Document | Purpose |
|----------|---------|
| <img src="icons/compass.png" width="16" height="16"> [CLAUDE.md](CLAUDE.md) | AI instructions + documentation protocol |
| <img src="icons/bar-chart.png" width="16" height="16"> [docs/STATUS.md](docs/STATUS.md) | Current state + AI context |
| <img src="icons/task-list.png" width="16" height="16"> [docs/TASKS.md](docs/TASKS.md) | All 32 task specifications |
| <img src="icons/document.png" width="16" height="16"> [docs/SPECS.md](docs/SPECS.md) | Tier specifications |
| <img src="icons/wrench.png" width="16" height="16"> [docs/IMPLEMENTATION.md](docs/IMPLEMENTATION.md) | Learnings + gotchas |
| <img src="icons/rocket.png" width="16" height="16"> [docs/DEPLOYMENT.md](docs/DEPLOYMENT.md) | Operations guide |
| <img src="icons/blueprint.png" width="16" height="16"> [architecture.md](architecture.md) | Echo Rule theory |

**Documentation:**
- `CLAUDE.md` - Main development guide
- `docs/QUICKSTART.md` - Setup and first task
- `docs/TASKS.md` - All 32 task specifications
- `architecture.md` - Original watermark design
**Archive:** Historical session docs in `docs/archive/`

## Usage
## <img src="icons/wrench.png" width="24" height="24"> Usage

```python
from SpecHO import SpecHODetector, load_config
@@ -57,7 +58,7 @@ print(f"Confidence: {result.confidence:.2%}")
print(f"Z-Score: {result.z_score:.2f}")
```

## Requirements
## <img src="icons/task-list.png" width="24" height="24"> Requirements

- Python 3.11+
- spaCy with en_core_web_sm model