Stop paying for AI. This project indexes every provider that lets you use large language models at zero cost — whether through permanent free tiers, trial credits, or local execution on your own hardware.
The LLM landscape changes weekly. New providers launch free tiers, others sunset theirs, and rate limits shift overnight. Keeping track manually is painful. Free-LLM solves this by maintaining a single source of truth covering 45+ providers, continuously updated by the community.
Detailed information for each provider — including models, pricing, code examples, and setup steps — is available at free-llm.com.
These providers offer ongoing free access with rate-limited quotas that never expire.
| Provider | Rate Limit | Daily Limit | Token Limit | Monthly Limit | Key Models |
|---|---|---|---|---|---|
| Google AI Studio | 2–15 RPM | 1,500 RPD (Flash) / 50 RPD (Pro) | 1M TPM (Flash) / 32K TPM (Pro) | Free of charge | Gemini 2.0 Flash, 1.5 Pro, 1.5 Flash |
| Groq | 30 RPM | 14,400 RPD | 40K TPM (varies) | Free forever | Llama 4 Maverick/Scout, Llama 3.3 70B, Qwen3 32B, Whisper |
| Cerebras | 30 RPM | 1,000,000 tokens/day | 60K–100K TPM | Free forever | Llama 3.1 8B, Llama 3.1 70B |
| HuggingFace Inference | 300 req/hour | Dependent on load | Max context of model | Free forever (rate-limited) | Llama 3.2 11B, Qwen 2.5 72B, Gemma 2 9B, Flux.1 |
| Cloudflare Workers AI | Varies by model | 10,000 neurons/day | Included in neuron budget | ~300K neurons/month | Llama 3.1 8B, Mistral 7B, Qwen 1.5 7B, DeepSeek Coder 6.7B, Phi-2 |
| Cohere | 20 RPM | — | — | 1,000 req/month | Command R+, Command R, Command R7B |
| Mistral (La Plateforme) | 1 req/s | — | 500K TPM / 1B tokens/month | Free (Experiment plan) | Mistral 7B, Mixtral 8x7B, Mistral Small, Mistral Nemo |
| OVH AI Endpoints | 2 RPM (anon) / 400 RPM (auth) | Unspecified | Unspecified | Beta access | Qwen3Guard 0.6B/8B, Stable Diffusion XL, TTS models |
| Chutes.ai | Varies (community) | Subject to availability | Free (community-powered) | No hard cap | DeepSeek-R1, Llama 3.1 70B, Qwen 2.5 72B |
| Inference.net | Varies | Fair use | Free for listed models | Fair use policy | DeepSeek-R1, Llama 3.1 8B/70B |
| Kluster.ai | Batch-based (async) | Generous batch quotas | Free for batch API | Subject to fair use | Llama 3.1 405B, DeepSeek-R1, Qwen 2.5 72B |
| Glhf.chat | Standard | Generous for personal use | Free tier included | Unlimited for free models | Llama 3.1 70B, Mixtral 8x7B, Phi-3 Mini |
| Coze | Varies by model | Token-based daily limits | Free daily tokens | Resets daily | GPT-4o (via Coze), Gemini 1.5 Pro (via Coze) |
| NVIDIA NIM | 40 RPM | — | — | — | Various open-source models (phone verification required) |
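Several of the free-tier providers above expose OpenAI-compatible endpoints, so one client can target any of them by swapping the base URL. A minimal sketch of a lookup table (these URLs reflect each provider's documented OpenAI-compatible endpoint at the time of writing; verify against the provider's docs before relying on them):

```python
# Provider name -> OpenAI-compatible base URL.
# Endpoints can change; check each provider's documentation.
OPENAI_COMPATIBLE_BASE_URLS = {
    "groq": "https://api.groq.com/openai/v1",
    "cerebras": "https://api.cerebras.ai/v1",
    "mistral": "https://api.mistral.ai/v1",
}

def base_url_for(provider: str) -> str:
    """Look up the OpenAI-compatible endpoint for a known provider."""
    try:
        return OPENAI_COMPATIBLE_BASE_URLS[provider]
    except KeyError:
        raise ValueError(f"no known OpenAI-compatible endpoint for {provider!r}")

print(base_url_for("groq"))  # -> https://api.groq.com/openai/v1
```

Pass the returned URL as `base_url` to any OpenAI-SDK client, as shown in the code example at the end of this document.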
These providers give you credits that renew periodically.
| Provider | Rate Limit | Free Offer | Token Limit | Monthly Limit | Key Models |
|---|---|---|---|---|---|
| Grok / xAI | Varies (low for free tier) | Credit-based daily | $25/month renewing credits | $25/month (resets monthly) | Grok-2, Grok-2 Mini, Grok-2 Vision |
| OpenRouter | 20 RPM | 50 RPD (up to 1K w/ $10 topup) | Shared quota | — | Gemini 2.0, Llama 3.3 70B, DeepSeek R1, Phi-3 (20+ free models) |
| GitHub Models | Varies by Copilot tier | Low | Restrictive | — | GPT-4o, Llama 3.3 70B, Phi-4, Mistral Large, AI21 Jamba 1.5 |
| Venice.ai | Daily limits for free tier | Basic usage allowed | Limits without Pro | Resets daily | Llama 3.1 405B, Dolphin Mixtral, Stable Diffusion 3 |
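OpenRouter routes one API key across many models, including its free-tier ones. A hedged sketch of building the chat-completions request by hand with only the standard library (the model id `meta-llama/llama-3.3-70b-instruct:free` is illustrative; check OpenRouter's model list for current free ids):

```python
import json

OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"

def build_chat_request(api_key: str, model: str, prompt: str):
    """Return (url, headers, body_bytes) for an OpenRouter chat completion."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return OPENROUTER_URL, headers, body

# To actually send it (requires a real API key):
#   import urllib.request
#   req = urllib.request.Request(url, data=body, headers=headers)
#   with urllib.request.urlopen(req) as resp:
#       print(json.load(resp)["choices"][0]["message"]["content"])
url, headers, body = build_chat_request(
    "YOUR_API_KEY", "meta-llama/llama-3.3-70b-instruct:free", "Hello"
)
```

The request shape is the standard OpenAI chat-completions payload, which is why the OpenAI SDK also works against OpenRouter unchanged.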
These providers grant one-time trial credits: sign up once and use them until they run out.
| Provider | Rate Limit | Credit Amount | Token Equivalent | Expiry | Key Models |
|---|---|---|---|---|---|
| Together.AI | Subject to availability | Free research models | Free (Apriel series) | Free forever (research) | Apriel 1.6/1.5 15B Thinker |
| DeepSeek | Standard | 10M free tokens | 10,000,000 tokens | One-time | DeepSeek-R1, DeepSeek-V3 |
| DeepInfra | 60 RPM | $5 credit | ~5M tokens (varies) | One-time | 40+ open-source models |
| SambaNova | Varies by model | $5 credit | ~30M Llama 8B tokens | One-time | Llama 3.1 405B/70B/8B, Qwen 2.5 72B |
| Cerebrium | Pay-per-second | $30 credit | Credit-based | One-time | Any deployable model |
| AI21 Labs | Standard | $10 credit | Credit-based | 3 months | Jamba models |
| Fireworks AI | Shared | $1 credit | One-time credit | One-time trial | Various open-source models |
| Friendli AI | Standard | $10 credit | Varies by model | One-time | Popular open-source models |
| Lepton AI | Varies | $10 credit | Credit-based | One-time trial | Llama, Mistral, Stable Diffusion |
| Hyperbolic | Standard | $1 credit | Credit-based | One-time trial | Llama 3.1 405B, DeepSeek V3 |
| Nebius | Standard | $1 credit | Credit-based | One-time trial | Various open-source models |
| Novita AI | Standard | $0.50 credit | Credit-based | One-time trial | Llama, Mistral |
| Replicate | Varies | Small trial credit | Credit-based | One-time trial | 1000+ models (LLMs, image, audio) |
| Upstage | Standard | $10 credit | Credit-based | 3 months | Solar Pro LLM |
| Qwen / Alibaba | Standard | 1M tokens/model (trial) | 1M tokens per model | One-time per model | Qwen family |
| Scaleway | Standard | 1M free tokens (trial) | 1M tokens | One-time trial | Mistral, Llama, Qwen (EU-hosted) |
| Yi AI | Standard | Initial trial credits | Credit-based | One-time trial | Yi-Large (200K context) |
| Requesty | Standard | Free monthly credits | Free monthly credits | Free tier included | Multi-provider routing |
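A dollar credit translates into a token budget via the provider's per-million-token price. The helper below is a rough estimator, not provider data; the example price is an illustrative assumption, so check each provider's pricing page for real rates.

```python
def tokens_for_credit(credit_usd: float, price_per_million_usd: float) -> int:
    """Estimate how many tokens a dollar credit buys at a given rate."""
    if price_per_million_usd <= 0:
        raise ValueError("price per million tokens must be positive")
    return int(credit_usd / price_per_million_usd * 1_000_000)

# Illustrative only: a $5 credit at a hypothetical $1.00 per 1M tokens
print(tokens_for_credit(5.0, 1.00))  # -> 5000000
```

This is why the "Token Equivalent" column varies so widely: the same $5 stretches much further on a small model than on a 405B one.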
Run on your own hardware — zero cost, zero rate limits, complete privacy.
| Tool | Rate Limit | Daily Limit | Token Limit | Monthly Limit | Highlights |
|---|---|---|---|---|---|
| Ollama | Hardware limited | Unlimited | Unlimited | Free | CLI-first, 100+ models, GPU accel, OpenAI-compatible API |
| LM Studio | Hardware limited | Unlimited | Unlimited | Free | Desktop GUI, any GGUF model, built-in model browser |
| GPT4All | Hardware dependent | Unlimited | Unlimited | Free open source | CPU-only chatbot, no GPU required |
| llama.cpp | Hardware dependent | Unlimited | Unlimited | Free open source | C/C++ engine, any GGUF model |
| Jan.ai | Hardware dependent | Unlimited | Unlimited | Free forever (open source) | Privacy-focused ChatGPT alternative, 100% offline |
| KoboldCpp | Hardware dependent | Unlimited | Unlimited | Free open source | Single-file GGUF engine for creative writing |
| llamafile | Hardware dependent | Unlimited | Unlimited | Free open source | Single executable, runs anywhere (Mozilla) |
| Text Gen WebUI | Hardware dependent | Unlimited | Unlimited | Free open source | Gradio interface for advanced local experimentation |
| BentoML | Hardware dependent | Unlimited | Unlimited | Free open source | Inference platform for deploying models anywhere |
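For the local tools above, the binding constraint is memory, not rate limits. A common rule of thumb: weight footprint ≈ parameter count × bits per weight ÷ 8, plus headroom for the KV cache and activations. A hedged sketch (the 20% overhead factor is an assumption, not a measured value):

```python
def est_model_memory_gb(params_billion: float, bits_per_weight: int = 4,
                        overhead: float = 0.2) -> float:
    """Rough memory estimate for a quantized local model, in GB."""
    weight_bytes = params_billion * 1e9 * bits_per_weight / 8
    return weight_bytes * (1 + overhead) / 1e9

# An 8B model at 4-bit quantization: roughly 4.8 GB
print(round(est_model_memory_gb(8, 4), 1))
```

By this estimate an 8B model at 4-bit fits comfortably on a 8 GB GPU or in system RAM, while a 70B model at the same quantization needs on the order of 40 GB.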
Published at free-llm.com/guides:
- Best Free LLM APIs in 2026 — side-by-side comparison of top picks
- Gemini vs ChatGPT (Free Tier) — what you actually get for $0
- How to Use OpenRouter — setup walkthrough with code
- OpenRouter Alternatives — other aggregators worth trying
- Local LLMs with Ollama — get started in under 5 minutes
- Ultimate Free LLM API Guide — the comprehensive deep-dive
Free-LLM is community-driven. The website at free-llm.com lets visitors:
- Vote on providers to surface the most useful ones
- Submit new providers and models
- Propose edits to existing provider data (admin-reviewed)
- Earn recognition on the Hall of Fame leaderboard
Data syncs back to this repository automatically.
Most providers listed here support the OpenAI SDK, meaning you can switch between them by changing two lines:

```python
# Works with Groq, Cerebras, Grok, Together, DeepSeek, SambaNova...
# Just swap the base_url and api_key.
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.groq.com/openai/v1",  # or any OpenAI-compatible endpoint
)

response = client.chat.completions.create(
    model="llama-3.3-70b-versatile",
    messages=[{"role": "user", "content": "What makes LPU inference fast?"}],
)
print(response.choices[0].message.content)
```
- Add a provider — use the submit form on the website or open a PR.
- Vote & discuss — help the community surface the best options at free-llm.com.
MIT — see LICENSE for details.