Semantic Memory Infrastructure for AI Coding Assistants
Stop wasting tokens. Give your AI agents Git history context through vector embeddings and semantic search.
AI coding assistants waste tokens and miss context:
- 15-20K tokens to answer simple questions about codebase history
- No understanding of "why" code evolved the way it did
- Hallucinations from missing historical context
- Expensive token costs for enterprise teams
GitForAI transforms Git history into queryable semantic memory:
from gitforai import GitForAI
# Initialize with your repo
git_memory = GitForAI("/path/to/repo")
# Ask questions in natural language
results = git_memory.query("How does authentication work?")
# Returns relevant commits with 85% fewer tokens
# Track file evolution
history = git_memory.track_file("auth.py")
# See how a file changed over time
# Find similar changes
similar = git_memory.find_similar(commit_hash)
# Discover related work
- 85% token reduction - Semantic search returns only relevant context
- Privacy-first - Local embeddings, no API keys required
- Fast - Sub-second semantic search with ChromaDB
- Self-hostable - Docker setup included
- Pluggable - Extensible architecture for custom integrations
# Install from PyPI
pip install gitforai
# Or install from source
git clone https://github.com/git-for-ai/gitforai.git
cd gitforai
pip install -e .
# Index repository with local embeddings (zero cost, offline)
gitforai index /path/to/repo
# Search your codebase semantically
gitforai search "authentication bug fixes"
# Get detailed commit info
gitforai analyze abc123 --diffs
# Pull and run
docker pull gitforai/core
docker run -v $(pwd):/repo gitforai/core index /repo
# Or use docker-compose
docker-compose up
- Extract - Parse Git commits, diffs, and file changes
- Embed - Generate semantic embeddings (local, no API cost)
- Index - Store in ChromaDB vector database
- Query - Natural language search returns relevant context
Result: AI agents get exactly the context they need, without wasting tokens on irrelevant code.
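The embed, index, and query steps above can be illustrated with a small, dependency-free sketch. This is not GitForAI's implementation: the word-count "embedding" and the in-memory `ToyIndex` are hypothetical stand-ins for sentence-transformers and ChromaDB, and the commit hashes and messages are made up for illustration.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (stand-in for a real sentence embedding)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyIndex:
    """In-memory stand-in for a vector database (hypothetical, for illustration)."""

    def __init__(self):
        self._items = []  # list of (commit_hash, message, vector)

    def add(self, commit_hash: str, message: str) -> None:
        self._items.append((commit_hash, message, embed(message)))

    def query(self, question: str, top_k: int = 2):
        """Return up to top_k (hash, message) pairs, most similar first."""
        q = embed(question)
        scored = [(cosine(q, vec), h, msg) for h, msg, vec in self._items]
        scored.sort(key=lambda s: s[0], reverse=True)
        # Drop commits with no overlap so only relevant context is returned.
        return [(h, msg) for score, h, msg in scored[:top_k] if score > 0]


# Made-up commits standing in for the output of the extraction step.
commits = [
    ("a1b2c3", "Fix authentication token expiry bug"),
    ("d4e5f6", "Add dark mode to settings page"),
    ("0718ab", "Refactor authentication middleware"),
]

index = ToyIndex()
for commit_hash, message in commits:
    index.add(commit_hash, message)

results = index.query("authentication bug")
for commit_hash, message in results:
    print(commit_hash, message)
```

The point of the sketch is the shape of the flow: only the commits that score above zero against the question are handed back, which is where the token savings come from. A real embedding model would also match paraphrases that share no words with the query.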
Platform-specific adapters are available separately.
- Understand unfamiliar codebases quickly
- Find relevant commits when debugging
- Learn from code evolution patterns
- Reduce AI assistant token costs
- Onboard new developers faster
- Share institutional knowledge automatically
- Improve code review quality
- Standardize AI context across the team
- Reduce token costs by 85%
- Self-host for data privacy
- Integrate with existing AI tools
- SOC2/GDPR compliant
- User Guide - Complete usage guide
- API Reference - Python API documentation
- Docker Guide - Self-hosting with Docker
- Contributing - How to contribute
- Extraction - GitPython for repository parsing
- Embeddings - sentence-transformers (local, free) or OpenAI (optional)
- Vector DB - ChromaDB for semantic search
- CLI - Typer for command-line interface
See CLAUDE.md for detailed architecture.
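To give a feel for what the extraction step produces, here is a minimal sketch that parses `git log` output into commit records. It deliberately uses plain `git log` with a custom `--pretty=format:` string rather than GitPython, so it is a swapped-in technique and not the project's code; `CommitRecord`, `parse_git_log`, and `read_git_log` are hypothetical names.

```python
import subprocess
from dataclasses import dataclass
from typing import List

# Separators that are very unlikely to appear inside commit metadata.
FIELD_SEP = "\x1f"   # ASCII unit separator
RECORD_SEP = "\x1e"  # ASCII record separator

# %H = hash, %an = author name, %aI = author date (strict ISO 8601), %s = subject.
GIT_LOG_FORMAT = "%H%x1f%an%x1f%aI%x1f%s%x1e"


@dataclass
class CommitRecord:
    commit_hash: str
    author: str
    date: str
    subject: str


def read_git_log(repo_path: str) -> str:
    """Run git log in repo_path and return its raw output (requires git on PATH)."""
    completed = subprocess.run(
        ["git", "-C", repo_path, "log", f"--pretty=format:{GIT_LOG_FORMAT}"],
        capture_output=True, text=True, check=True,
    )
    return completed.stdout


def parse_git_log(raw: str) -> List[CommitRecord]:
    """Split raw git log output into CommitRecord objects, skipping malformed chunks."""
    records = []
    for chunk in raw.split(RECORD_SEP):
        # Strip only newlines: str.strip() with no argument would also eat "\x1f".
        chunk = chunk.strip("\n")
        if not chunk:
            continue
        parts = chunk.split(FIELD_SEP)
        if len(parts) != 4:
            continue
        records.append(CommitRecord(*parts))
    return records


# Made-up sample in the same shape git would emit with GIT_LOG_FORMAT.
sample = (
    "abc123\x1fAda\x1f2024-01-05T10:00:00+00:00\x1fFix login redirect\x1e\n"
    "def456\x1fLin\x1f2024-01-06T09:30:00+00:00\x1fAdd password reset\x1e"
)
parsed = parse_git_log(sample)
```

On a real repository, `parse_git_log(read_git_log("/path/to/repo"))` would yield one record per commit, ready to be embedded.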
# Clone and setup
git clone https://github.com/git-for-ai/gitforai.git
cd gitforai
python -m venv venv
source venv/bin/activate
pip install -e ".[dev]"
# Run tests
pytest
# Run with coverage
pytest --cov=src/gitforai --cov-report=html
# Format code
black src/ tests/
# Lint
ruff check src/ tests/
Create a .env file:
# Embedding provider (default: local, zero cost)
EMBEDDING_PROVIDER=local # or "openai" for maximum quality
EMBEDDING_MODEL=all-MiniLM-L6-v2 # 384 dims, 80MB, free
# Vector database
VECTORDB_PROVIDER=chroma
VECTORDB_PERSIST_DIR=~/.gitforai/vectordb
# Optional: OpenAI API key (only if using OpenAI embeddings)
OPENAI_API_KEY=your-key-here
Default settings use local embeddings:
- Works offline
- Zero API cost
- No API keys required
- ~88% of OpenAI quality
- Auto-downloads ~80MB model on first use
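As an illustration of how the variables above could be resolved, the following sketch reads them from the environment and falls back to the documented defaults. The `Settings` class and `load_settings` function are hypothetical and not part of GitForAI's API. The check that `openai` requires a key is an assumption based on the comment in the `.env` example.

```python
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    embedding_provider: str
    embedding_model: str
    vectordb_provider: str
    vectordb_persist_dir: Path
    openai_api_key: Optional[str]


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from an environment mapping, using the README defaults."""
    provider = env.get("EMBEDDING_PROVIDER", "local")
    api_key = env.get("OPENAI_API_KEY") or None
    # Assumption: the OpenAI provider cannot work without a key.
    if provider == "openai" and not api_key:
        raise ValueError("EMBEDDING_PROVIDER=openai requires OPENAI_API_KEY")
    return Settings(
        embedding_provider=provider,
        embedding_model=env.get("EMBEDDING_MODEL", "all-MiniLM-L6-v2"),
        vectordb_provider=env.get("VECTORDB_PROVIDER", "chroma"),
        vectordb_persist_dir=Path(
            env.get("VECTORDB_PERSIST_DIR", "~/.gitforai/vectordb")
        ).expanduser(),
        openai_api_key=api_key,
    )


# With an empty environment, everything falls back to the local, zero-cost setup.
defaults = load_settings({})
print(defaults.embedding_provider, defaults.embedding_model)
```

Passing the environment in as a mapping keeps the function easy to test without touching the real process environment.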
We believe semantic Git history should be accessible to all developers. The core extraction, embedding, and indexing logic is open source (MIT license).
Commercial offerings:
- Managed cloud hosting (Pro/Team tiers)
- Platform-specific integrations
- Enterprise features (SSO, RBAC, audit logs)
- Professional support
See gitforai.com for commercial options.
- Indexing: 1000 commits in <5 minutes
- Query: Semantic search in <500ms
- Accuracy: Relevant results in top 5 for 90% of queries
- Cost: $0.00 with local embeddings (vs $0.10/1000 commits with OpenAI)
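For a rough sense of scale, the figures quoted in this README can be turned into simple arithmetic. The helper names below are hypothetical, and the numbers are the README's own claims ($0.10 per 1000 commits with OpenAI embeddings, 85% token reduction, 15-20K baseline tokens), not independent measurements.

```python
def openai_embedding_cost(num_commits: int, usd_per_1000_commits: float = 0.10) -> float:
    """Estimated one-time OpenAI embedding cost in USD for indexing num_commits."""
    return num_commits / 1000 * usd_per_1000_commits


def tokens_after_reduction(baseline_tokens: int, reduction: float = 0.85) -> int:
    """Tokens left per question after applying the claimed reduction."""
    return round(baseline_tokens * (1 - reduction))


# A 50,000-commit repository indexed with OpenAI embeddings.
cost = openai_embedding_cost(50_000)

# The 15-20K token baseline quoted at the top of this README.
low = tokens_after_reduction(15_000)
high = tokens_after_reduction(20_000)

print(f"OpenAI indexing cost: ${cost:.2f}")
print(f"Tokens per question after reduction: {low}-{high}")
```

With local embeddings the indexing cost stays at $0.00 regardless of repository size, so only the token estimate applies.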
MIT License - See LICENSE file for details.
We welcome contributions! See CONTRIBUTING.md for guidelines.
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Email: support@gitforai.com
- Docs: docs.gitforai.com
- Core Git extraction and indexing
- Local embeddings (sentence-transformers)
- Vector database storage (ChromaDB)
- Incremental updates (2-6x faster)
- CLI interface
- Docker support
- Multi-repo support
- Real-time updates
- Advanced query optimization
- Star this repo if you find it useful
- Report bugs via GitHub Issues
- Request features via GitHub Discussions
- Contribute - see CONTRIBUTING.md
Built with ❤️ by the GitForAI team
Semantic memory for the AI coding revolution