Skip to content

Status optimization#436

Open
evisdren wants to merge 12 commits intomainfrom
status_optimization
Open

Status optimization#436
evisdren wants to merge 12 commits intomainfrom
status_optimization

Conversation

@evisdren
Copy link
Contributor

@evisdren evisdren commented Feb 20, 2026

Replaces shelling out to a git process by using a pure go function to find the HEAD branch name. Shaves another 10ms off entire status.

image

Copilot AI review requested due to automatic review settings February 20, 2026 00:59
@evisdren evisdren requested a review from a team as a code owner February 20, 2026 00:59
@cursor
Copy link

cursor bot commented Feb 20, 2026

PR Summary

Medium Risk
Touches git-path resolution and branch detection used by status; filesystem parsing and caching could mis-handle unusual worktree/HEAD formats or stale cwd-based cache in edge cases.

Overview
Improves entire status performance by caching git rev-parse --git-common-dir in session.getGitCommonDir() and by replacing resolveWorktreeBranch’s git rev-parse call with direct reads of .git/HEAD (including worktree .git file support).

Adds a new benchutil fixture package plus benchmarks for status and checkpoint hot paths (WriteTemporary, WriteCommitted, tree helpers), and introduces mise tasks (bench, bench:cpu, bench:mem) to run/profiling benchmarks. Updates tests to cover the new caching and HEAD-parsing behavior, and updates the Discord invite link in CONTRIBUTING.md.

Written by Cursor Bugbot for commit f413d34. Configure here.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR optimizes the entire status command by replacing subprocess calls with direct filesystem reads and adding caching for git operations. The optimization targets two hot paths: resolveWorktreeBranch() (which previously spawned a git subprocess per worktree) and getGitCommonDir() (which now caches results per working directory). The PR also introduces a comprehensive benchmark infrastructure (benchutil package) to measure and track performance improvements, and updates Discord invite links in CONTRIBUTING.md.

Changes:

  • Replaced resolveWorktreeBranch() subprocess call with direct filesystem reads of .git/HEAD
  • Added caching to getGitCommonDir() to avoid repeated git subprocess invocations
  • Introduced benchutil package with test fixtures for realistic git repositories, transcripts, and session states
  • Added benchmark tasks to mise.toml and comprehensive benchmarks for status command and checkpoint operations

Reviewed changes

Copilot reviewed 11 out of 12 changed files in this pull request and generated no comments.

Show a summary per file
File Description
cmd/entire/cli/status.go Rewrote resolveWorktreeBranch to read .git/HEAD directly from filesystem instead of spawning git subprocess
cmd/entire/cli/status_test.go Added comprehensive tests for resolveWorktreeBranch covering regular repos, detached HEAD, worktrees with absolute/relative paths
cmd/entire/cli/session/state.go Added caching to getGitCommonDir following the established RepoRoot pattern with RWMutex and per-directory cache
cmd/entire/cli/session/state_test.go Added tests for git common dir caching behavior including cache hits, invalidation, and directory changes
cmd/entire/cli/setup_test.go Added ClearGitCommonDirCache() call to test setup for proper isolation
cmd/entire/cli/bench_test.go Added end-to-end benchmarks for status command with/without caching to measure optimization impact
cmd/entire/cli/benchutil/benchutil.go New package providing test fixtures (repos, sessions, transcripts) for benchmark tests
cmd/entire/cli/benchutil/benchutil_test.go Benchmarks for the benchutil helpers themselves
cmd/entire/cli/checkpoint/bench_test.go Added benchmarks for WriteTemporary and WriteCommitted hot paths with various scenarios
mise.toml Added bench, bench:cpu, and bench:mem tasks for running benchmarks with profiling
CONTRIBUTING.md Updated Discord invite link
Comments suppressed due to low confidence (4)

mise.toml:104

  • There are several typos in the echo message. "find." should be "find ." (with space), and "/path/to/.cpu.prof" should be "/path/to/cpu.prof" (without the dot prefix).
run = "go test -bench=. -benchmem -run='^$' -cpuprofile=cpu.prof -timeout=10m ./... && echo 'CPU Profiles saved as cpu.prof in each benchmarked package directory.  List them with : find. -name cpu.prof -print. View one with: go tool pprof -http=:8080 /path/to/.cpu.prof'" 

mise.toml:108

  • There's a typo in the echo message: "List with them:" should be "List them with:". Also, the path should be "/path/to/mem.prof" not with extra space.
run = "go test -bench=. -benchmem -run='^$' -memprofile=mem.prof -timeout=10m ./... && echo 'Memory profiles saved as mem.prof in each benchmarked package directory. List with them: find . -name mem.prof -print. View one with: go tool pprof -http=:8080 /path/to/mem.prof'"

cmd/entire/cli/bench_test.go:26

  • The comment on line 26 states that resolveWorktreeBranch uses "git rev-parse --abbrev-ref HEAD" subprocess call, but this is no longer accurate after the optimization. The function now reads the HEAD file directly from the filesystem without spawning a subprocess. Consider updating this comment to reflect the current implementation, such as: "Filesystem read of .git/HEAD (resolveWorktreeBranch, per unique worktree)".
//   - git rev-parse --abbrev-ref HEAD (resolveWorktreeBranch, per unique worktree)

cmd/entire/cli/benchutil/benchutil_test.go:97

  • The code uses b.Loop() which is a Go 1.24+ feature for benchmark loops. While this is valid for Go 1.25.6 (as specified in mise.toml and go.mod), most other benchmarks in this PR use the standard for range b.N pattern (see bench_test.go and checkpoint/bench_test.go). Consider using the consistent pattern for range b.N for uniformity across the codebase, unless there's a specific reason to use the newer Loop() API.
func BenchmarkNewBenchRepo(b *testing.B) {
	for b.Loop() {
		NewBenchRepo(b, RepoOpts{})
	}
}

func BenchmarkNewBenchRepo_Large(b *testing.B) {
	for b.Loop() {
		NewBenchRepo(b, RepoOpts{
			FileCount:     50,
			FileSizeLines: 500,
		})
	}
}

func BenchmarkSeedShadowBranch(b *testing.B) {
	for b.Loop() {
		b.StopTimer()
		repo := NewBenchRepo(b, RepoOpts{FileCount: 10})
		sessionID := repo.CreateSessionState(b, SessionOpts{})
		b.StartTimer()

		repo.SeedShadowBranch(b, sessionID, 5, 3)
	}
}

func BenchmarkSeedMetadataBranch(b *testing.B) {
	for b.Loop() {
		b.StopTimer()
		repo := NewBenchRepo(b, RepoOpts{FileCount: 10})
		b.StartTimer()

		repo.SeedMetadataBranch(b, 10)
	}
}

func BenchmarkGenerateTranscript(b *testing.B) {
	b.Run("Small_20msg", func(b *testing.B) {
		for b.Loop() {
			GenerateTranscript(TranscriptOpts{
				MessageCount:    20,
				AvgMessageBytes: 500,
			})
		}
	})

	b.Run("Medium_200msg", func(b *testing.B) {
		for b.Loop() {
			GenerateTranscript(TranscriptOpts{
				MessageCount:    200,
				AvgMessageBytes: 500,
			})
		}
	})

	b.Run("Large_2000msg", func(b *testing.B) {
		for b.Loop() {
			GenerateTranscript(TranscriptOpts{
				MessageCount:    2000,
				AvgMessageBytes: 500,
			})
		}
	})

	b.Run("WithToolUse", func(b *testing.B) {
		files := []string{"src/main.go", "src/util.go", "src/handler.go"}
		for b.Loop() {
			GenerateTranscript(TranscriptOpts{
				MessageCount:    200,
				AvgMessageBytes: 500,
				IncludeToolUse:  true,
				FilesTouched:    files,
			})
		}
	})
}

func BenchmarkCreateSessionState(b *testing.B) {
	repo := NewBenchRepo(b, RepoOpts{FileCount: 10})

	b.ResetTimer()
	for b.Loop() {
		repo.CreateSessionState(b, SessionOpts{
			Phase:        session.PhaseActive,
			StepCount:    5,
			FilesTouched: []string{"src/file_000.go", "src/file_001.go", "src/file_002.go"},
		})
	}
}

@evisdren
Copy link
Contributor Author

PR Summary

Medium Risk Touches git-path resolution and branch detection used by status; filesystem parsing and caching could mis-handle unusual worktree/HEAD formats or stale cwd-based cache in edge cases.

Overview Improves entire status performance by caching git rev-parse --git-common-dir in session.getGitCommonDir() and by replacing resolveWorktreeBranch’s git rev-parse call with direct reads of .git/HEAD (including worktree .git file support).

Adds a new benchutil fixture package plus benchmarks for status and checkpoint hot paths (WriteTemporary, WriteCommitted, tree helpers), and introduces mise tasks (bench, bench:cpu, bench:mem) to run/profiling benchmarks. Updates tests to cover the new caching and HEAD-parsing behavior, and updates the Discord invite link in CONTRIBUTING.md.

Written by Cursor Bugbot for commit f413d34. Configure here.

Mainly for posterity:"

  Risk 1: Unusual HEAD/worktree formats in resolveWorktreeBranch
                                                                                                    
  Known HEAD file formats:                                                                        
                                                                                                  
  Format: Branch ref                              
  Example: ref: refs/heads/main
  Our handling: Returns "main"
  git rev-parse handling: Returns "main"
  ────────────────────────────────────────
  Format: Detached (hash)
  Example: abc123...
  Our handling: Returns "HEAD"
  git rev-parse handling: Returns "HEAD"
  ────────────────────────────────────────
  Format: Tag ref
  Example: ref: refs/tags/v1.0
  Our handling: Returns "HEAD"
  git rev-parse handling: Returns "tags/v1.0"
  ────────────────────────────────────────
  Format: Remote ref
  Example: ref: refs/remotes/origin/main
  Our handling: Returns "HEAD"
  git rev-parse handling: Returns "remotes/origin/main"

  Known .git formats:

  ┌──────────────────────┬──────────────────────────┬──────────────────────────────────────────────┐
  │        Format        │         Scenario         │                 Our handling                 │
  ├──────────────────────┼──────────────────────────┼──────────────────────────────────────────────┤
  │ Directory            │ Regular repo             │ Read .git/HEAD directly                      │
  ├──────────────────────┼──────────────────────────┼──────────────────────────────────────────────┤
  │ File (gitdir: <abs>) │ Worktree (absolute path) │ Follow path, read HEAD                       │
  ├──────────────────────┼──────────────────────────┼──────────────────────────────────────────────┤
  │ File (gitdir: <rel>) │ Worktree (relative path) │ Resolve relative to worktree dir, read HEAD  │
  ├──────────────────────┼──────────────────────────┼──────────────────────────────────────────────┤
  │ Symlink              │ Some container setups    │ os.Stat follows symlinks, works as directory │
  └──────────────────────┴──────────────────────────┴──────────────────────────────────────────────┘

  Behavioral difference: tag/remote refs — we return "HEAD" while the git subprocess returns the
  ref's short name. In practice this doesn't matter because HEAD only points to non-branch refs in
  extremely unusual manual operations, and the output is display-only (it shows in entire status as
  the branch label next to a worktree path). An incorrect label has zero functional impact — no data
   is written, no decisions are made from it.

  Unreachable cases: Bare repos never reach this function because runStatus exits at line 42
  (paths.RepoRoot() fails for bare repos). Submodules work correctly — their .git is a file with
  gitdir: just like worktrees.

  Degradation behavior: Every error path returns "", same as the old implementation on failure. The
  caller at line 248 just skips showing the branch name. No crash, no wrong action taken.

  Test coverage: 6 tests covering regular repo, detached HEAD, worktree (absolute path), worktree
  (relative path), non-existent path, not-a-repo.

  Risk level: Low. The formats are stable (unchanged for 15+ years), we handle all common cases
  identically, the one edge case difference (tag/remote refs) is display-only and functionally
  irrelevant, and all errors degrade gracefully.

  ---
  Risk 2: Stale cwd-based cache in getGitCommonDir

  How the cache works: Stores the result keyed by os.Getwd(). On subsequent calls, if cwd matches
  the cached key, returns the cached value. If cwd changed, falls through to the git subprocess.

  What could make it stale:
  1. cwd changes mid-process — Cache invalidates automatically (different key). Safe.
  2. Git repo structure changes while process runs (e.g., converting repo to worktree) — Cache would
   be stale. But this can't happen during a single entire status invocation (milliseconds), and even
   if it did, the git common dir doesn't change unless you restructure the .git directory itself.
  3. os.Getwd() fails — We use "" as the key. If two calls both fail, we'd get a stale hit. But if
  Getwd() fails, we're in a broken environment anyway.

  This is the exact same pattern as paths.RepoRoot() (lines 60-92 in paths.go), which has been in
  production. The cache is strictly additive — on a miss, it falls through to the real subprocess.
  The worst case is a redundant subprocess call, never a wrong answer (since the subprocess is the
  canonical source).

  Test coverage: 5 tests covering valid return, cache hit, cache clear, cwd change invalidation,
  error outside repo.

  Risk level: Very low. The pattern is proven (RepoRoot uses it), CLI processes are short-lived
  (cache barely matters per-process — it mainly helps when getGitCommonDir is called multiple times
  within one invocation), and the fallback is always the real subprocess.

  ---
  Summary

  ┌─────────────────┬──────────────────────┬───────────────────────────┬──────────────────────────┐
  │      Risk       │       Severity       │        Likelihood         │        Mitigation        │
  ├─────────────────┼──────────────────────┼───────────────────────────┼──────────────────────────┤
  │ Unusual HEAD    │ Display-only         │ Very rare                 │ Graceful degradation to  │
  │ format          │ cosmetic diff        │                           │ ""                       │
  ├─────────────────┼──────────────────────┼───────────────────────────┼──────────────────────────┤
  │ Stale cwd-based │ Returns old          │ Nearly impossible in      │ Same proven pattern as   │
  │  cache          │ git-common-dir       │ short-lived CLI           │ RepoRoot()               │
  ├─────────────────┼──────────────────────┼───────────────────────────┼──────────────────────────┤
  │ Corrupt .git    │ Would return ""      │ External problem          │ Same behavior as         │
  │ file            │                      │                           │ subprocess failure       │
  └─────────────────┴──────────────────────┴───────────────────────────┴──────────────────────────┘

  The risk assessment is accurate in identifying the right areas but overstates the practical
  impact. Both changes are display/performance-only with no write-path consequences, and every
  failure mode degrades to the same behavior as the old code on error.
  

ref := strings.TrimSpace(string(data))

// Symbolic ref: "ref: refs/heads/<branch>"
if strings.HasPrefix(ref, "ref: refs/heads/") {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the Git repository is using refStorage=reftable, HEAD will point to ref: refs/heads/.invalid, instead of the correct branch. In that specific case, it may be worth shelling out to git.

Refs:

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good call out - updated to support this case

@evisdren evisdren requested a review from pjbgf February 20, 2026 17:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

2 participants

Comments