Automatically generate AGENTS.md for any GitHub or local repository. Long context is enabled using dspy.RLM, aka Recursive Language Models.
GenerateAgents.md analyzes local or GitHub repositories using Recursive Language Models (dspy.RLM) to produce optimized AGENTS.md files. It features deep codebase exploration, Git history-based anti-pattern deduction, and multiple output styles (Strict vs. Comprehensive), supporting Gemini, Anthropic (Claude), and OpenAI models out of the box.
```shell
git clone https://github.com/originalankur/GenerateAgents.md
cd GenerateAgents.md
uv sync --extra dev  # installs all deps + dev tools in one step
```

💡 Don't have uv? Install it with `curl -LsSf https://astral.sh/uv/install.sh | sh` or see the uv docs.
Copy the sample env file and fill in the key for your chosen provider:
```shell
cp .env.sample .env
```

Make sure the `.env` file sits directly in the root directory of the project, i.e. `GenerateAgents.md/.env`.
You only need one provider key, whichever model you select:
| Provider | Env Variable | Get a key |
|---|---|---|
| Gemini | `GEMINI_API_KEY` | Google AI Studio |
| Anthropic | `ANTHROPIC_API_KEY` | Anthropic Console |
| OpenAI | `OPENAI_API_KEY` | OpenAI Platform |
```shell
# Default: generates AGENTS.md for a local repository (Gemini 2.5 Pro)
uv run autogenerateagentsmd /path/to/local/repo

# Analyze a public GitHub repository using the flag
uv run autogenerateagentsmd --github-repository https://github.com/pallets/flask

# Choose a specific model
uv run autogenerateagentsmd /path/to/local/repo --model anthropic/claude-sonnet-4.6
uv run autogenerateagentsmd --github-repository https://github.com/pallets/flask --model openai/gpt-5.2

# Pass just the provider name to use its default model
uv run autogenerateagentsmd /path/to/local/repo --model anthropic

# List all supported models
uv run autogenerateagentsmd --list-models

# Interactive prompt (just run without arguments)
uv run autogenerateagentsmd

# Strict style: focus purely on strict code constraints, past failures, and repo quirks
uv run autogenerateagentsmd --github-repository https://github.com/pallets/flask --style strict

# Analyze Git history: automatically deduce anti-patterns from recently reverted commits
uv run autogenerateagentsmd /path/to/local/repo --analyze-git-history
uv run autogenerateagentsmd --github-repository https://github.com/pallets/flask --style strict --analyze-git-history
```

The generated file will be saved under the `projects/` directory using the repository name.
| Output | Location |
|---|---|
| `AGENTS.md` | `./projects/<repo-name>/AGENTS.md` |
GenerateAgents supports two distinct styles for AGENTS.md, each tailored to different AI agent setups. You can toggle between them using the `--style` flag.
Here are two examples generated for the flask repository:
- Strict Style Example (`--style strict`): focuses purely on coding constraints, anti-patterns, and repository quirks.
- Comprehensive Style Example (`--style comprehensive`): includes high-level architectural overviews and explanations alongside constraints.
The comprehensive style builds a detailed, expansive guide. It extracts high-level abstractions like project architecture, directory mappings, data flow principles, and agent personas. Great for giving a brand-new AI agent a complete tour of the repository.
Output Format:
# AGENTS.md – <repo-name>
## Project Overview
## Agent Persona
## Tech Stack
## Architecture
## Code Style
## Anti-Patterns & Restrictions
## Database & State Management
## Error Handling & Logging
## Testing Commands
## Testing Guidelines
## Security & Compliance
## Dependencies & Environment
## PR & Git Rules
## Documentation Standards
## Common Patterns
## Agent Workflow / SOP
## Few-Shot Examples

Research suggests that broad, descriptive codebase summaries can sometimes distract LLMs and drive up token costs. The strict style combats this by giving the agent only what it can't easily grep for itself: strict constraints, undocumented quirks, and things it must never do.
Output Format:
# AGENTS.md – <repo-name>
## Code Style & Strict Rules
## Anti-Patterns & Restrictions
## Security & Compliance
## Lessons Learned (Past Failures)
## Repository Quirks & Gotchas
## Execution Commands

```
                  GenerateAgents Pipeline

  GitHub Repo URL
        │
        ▼
  ┌──────────┐   ┌──────────────────────────────────────────┐
  │  Clone   │──▶│  Load Source Tree (nested dict)          │
  │  (git)   │   └──────┬───────────────────────────────────┘
  └──────────┘          │
                        ▼
  ┌──────────────────────────────────────────┐
  │       CodebaseConventionExtractor        │
  │                                          │
  │  ┌────────────────────────────────────┐  │
  │  │   ExtractCodebaseInfo (RLM Pass)   │  │
  │  └──────────────────┬─────────────────┘  │
  │                     ▼                    │
  │  ┌────────────────────────────────────┐  │
  │  │  CompileConventionsMarkdown (CoT)  │  │
  │  └──────────────────┬─────────────────┘  │
  └─────────────────────┼────────────────────┘
                        ▼
  ┌──────────────────────────────────────────┐
  │             AgentsMdCreator              │
  │                                          │
  │  ┌────────────────────────────────────┐  │
  │  │    ExtractAgentsSections (CoT)     │  │
  │  │   (Extracts 17 specific sections)  │  │
  │  └──────────────────┬─────────────────┘  │
  │                     ▼                    │
  │  ┌────────────────────────────────────┐  │
  │  │    compile_agents_md() (Python)    │  │
  │  │  (Template matching into markdown) │  │
  │  └──────────────────┬─────────────────┘  │
  └─────────────────────┼────────────────────┘
                        ▼
          projects/<repo-name>/AGENTS.md
```
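The stages in the pipeline diagram above can be sketched in plain Python. This is a hedged illustration only: the two DSPy stages are stubbed out with placeholder bodies, and the function names (`extract_conventions`, `create_agents_md`, `run_pipeline`) are hypothetical, not the project's actual API.

```python
from pathlib import Path


def extract_conventions(source_tree: dict) -> str:
    """Stub for CodebaseConventionExtractor (RLM pass + CoT compile)."""
    return "# Conventions\n(placeholder)"


def create_agents_md(conventions: str, repo_name: str) -> str:
    """Stub for AgentsMdCreator (17-section extraction + templating)."""
    return f"# AGENTS.md – {repo_name}\n\n{conventions}"


def run_pipeline(repo_name: str, source_tree: dict, out_root: str = "projects") -> Path:
    """Wire the stages together: extract -> compile -> compute output path."""
    conventions = extract_conventions(source_tree)
    agents_md = create_agents_md(conventions, repo_name)
    # In the real tool save_agents_to_disk() writes the file; here we only
    # compute where it would land, mirroring the diagram's final step.
    return Path(out_root) / repo_name / "AGENTS.md"
```

The key point the sketch captures is the strict staging: conventions are extracted first, and AGENTS.md generation only ever sees the compiled conventions, not the raw tree.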
```
GenerateAgents/
├── src/
│   └── autogenerateagentsmd/            # Core package directory
│       ├── cli.py                       # CLI entry point: orchestrates the analysis pipeline
│       ├── model_config.py              # Provider registry, model catalog, and CLI argument parsing
│       ├── signatures.py                # DSPy Signatures (LM task definitions)
│       │   ├── ExtractCodebaseInfo          # RLM: extracts comprehensive codebase properties
│       │   ├── CompileConventionsMarkdown   # CoT: compiles RLM output into markdown
│       │   └── ExtractAgentsSections        # CoT: translates conventions -> 17 AGENTS.md fields
│       ├── modules.py                   # DSPy Modules (pipeline components)
│       │   ├── CodebaseConventionExtractor  # Performs RLM extraction & markdown compilation
│       │   └── AgentsMdCreator              # Splits info & formats final AGENTS.md text
│       └── utils.py                     # Utility functions
│           ├── clone_repo()             # Shallow git clone
│           ├── load_source_tree()       # Recursively map directories to a nested dict
│           ├── compile_agents_md()      # Combines the 17 extracted fields into AGENTS.md
│           └── save_agents_to_disk()    # Saves output to projects/<repo_name>/
├── tests/
│   └── ...                              # Pytest test suite, executing end-to-end tests
├── pyproject.toml                       # Project metadata, dependencies & tool config
├── uv.lock                              # Reproducible dependency lock file
├── .env.sample                          # Template for API keys
└── .env                                 # Your API keys (not committed)
```
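As a rough illustration of what `load_source_tree()` does ("recursively map directories to a nested dict"), a stdlib-only sketch might look like the following. This is an assumption about the shape of the tree (subdirectories as nested dicts, files as their text), not the project's actual implementation.

```python
from pathlib import Path


def load_source_tree(root: str) -> dict:
    """Recursively map a directory into a nested dict:
    subdirectories become nested dicts, files map to their text."""
    tree = {}
    for entry in sorted(Path(root).iterdir()):
        if entry.is_dir():
            tree[entry.name] = load_source_tree(str(entry))
        else:
            try:
                tree[entry.name] = entry.read_text(encoding="utf-8")
            except UnicodeDecodeError:
                # Binary files can't be sent to an LM as text.
                tree[entry.name] = "<binary file skipped>"
    return tree
```

A structure like this is convenient for an RLM pass, since the model can descend into subtrees without ever holding the whole repository in context at once.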
| Variable | Required | Description |
|---|---|---|
| `GEMINI_API_KEY` | For Gemini | Google Gemini API key |
| `GOOGLE_API_KEY` | For Gemini | Alternative Gemini key name |
| `ANTHROPIC_API_KEY` | For Anthropic | Anthropic Claude API key |
| `OPENAI_API_KEY` | For OpenAI | OpenAI API key |
| `AUTOSKILL_MODEL` | No | Default model string (avoids the `--model` flag) |
| `GITHUB_REPO_URL` | No | Target repository URL (skips prompt) |
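Resolving a key from the table above could work roughly like this sketch. The variable names come from the table; the resolution order (e.g. `GEMINI_API_KEY` before `GOOGLE_API_KEY`) and the function name are assumptions for illustration.

```python
import os

# Provider -> accepted environment variables, per the table above.
PROVIDER_KEYS = {
    "gemini": ["GEMINI_API_KEY", "GOOGLE_API_KEY"],
    "anthropic": ["ANTHROPIC_API_KEY"],
    "openai": ["OPENAI_API_KEY"],
}


def resolve_api_key(provider: str) -> str:
    """Return the first configured key for the provider, or raise."""
    for var in PROVIDER_KEYS.get(provider, []):
        value = os.environ.get(var)
        if value:
            return value
    raise RuntimeError(f"No API key set for provider {provider!r}; see .env.sample")
```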
Each provider has a primary model (used for main generation tasks) and a mini model (used as a sub-LM for faster RLM exploration):
| Provider | Primary (default) | Mini (sub-LM) |
|---|---|---|
| Gemini | `gemini/gemini-2.5-pro` | `gemini/gemini-2.5-flash` |
| Anthropic | `anthropic/claude-sonnet-4.6` | `anthropic/claude-haiku-3-20250519` |
| OpenAI | `openai/gpt-5.2` | `openai/gpt-5.2-instant` |

Run `uv run autogenerateagentsmd --list-models` for the full catalog of exact model versions supported.
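The `--model` shorthand described earlier (a bare provider name falling back to that provider's primary model) could be resolved roughly as below. The default model strings are taken from the table above; the resolution function itself is an illustrative sketch, not the project's `model_config.py`.

```python
# Defaults per the provider table above; logic is an assumption.
DEFAULT_MODELS = {
    "gemini": "gemini/gemini-2.5-pro",
    "anthropic": "anthropic/claude-sonnet-4.6",
    "openai": "openai/gpt-5.2",
}


def resolve_model(model_arg: str) -> str:
    """Accept either a full 'provider/model' string or a bare provider name."""
    if "/" in model_arg:
        return model_arg
    if model_arg in DEFAULT_MODELS:
        return DEFAULT_MODELS[model_arg]
    raise ValueError(f"Unknown provider or model: {model_arg!r}")
```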
The project includes an end-to-end test suite that typically runs the full pipeline against smaller codebases.
```shell
# Run all tests (uses AUTOSKILL_MODEL or defaults to Gemini)
uv run pytest tests/ -v -s

# Run only E2E tests
uv run pytest tests/ -v -s -m e2e

# Test with a specific provider
AUTOSKILL_MODEL=openai/gpt-5.2 uv run pytest tests/ -v -s -m e2e

# Run tests involving the generic clone function
uv run pytest tests/ -v -s -k "test_clone"
```
⚠️ Note: Full pipeline tests make real LLM API calls and may take a few minutes. Outputs generated by passing tests may be written into the output directories.
- Support Local Repositories
- Test an approach that exposes tools (`read_file`, `list_files`, `cat`, `grep`) to the LM, moving away from sending the entire codebase to the LLM.
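The tool-based exploration idea in the roadmap could start from a minimal read-only interface like this sketch. Everything here is hypothetical and stdlib-only; nothing in this shape exists in the repository yet.

```python
import re
from pathlib import Path


class RepoTools:
    """Minimal read-only tools an LM could call instead of
    receiving the whole codebase in its context window."""

    def __init__(self, root: str):
        self.root = Path(root)

    def list_files(self, subdir: str = ".") -> list[str]:
        """List all files under subdir, relative to the repo root."""
        base = self.root / subdir
        return sorted(str(p.relative_to(self.root)) for p in base.rglob("*") if p.is_file())

    def read_file(self, rel_path: str) -> str:
        """Return the text of a single file."""
        return (self.root / rel_path).read_text(encoding="utf-8")

    def grep(self, pattern: str) -> list[tuple[str, int, str]]:
        """Return (path, line_number, line) for every matching line."""
        rx = re.compile(pattern)
        hits = []
        for rel in self.list_files():
            for i, line in enumerate(self.read_file(rel).splitlines(), start=1):
                if rx.search(line):
                    hits.append((rel, i, line))
        return hits
```

With an interface like this, the LM only pays context tokens for the files it actually chooses to open, rather than for the whole source tree.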