middLLMware is a middleware layer that sits between your applications and the Ollama API, enabling request and response processing through a flexible plugin system.
Think of middLLMware as a transparent proxy that sits between your app and Ollama, allowing you to modify, enhance, or transform requests and responses on the fly.
User Application → middLLMware → Ollama → middLLMware → User Application
Instead of sending requests directly to Ollama, you send them to middLLMware, which processes them through your custom plugins before forwarding to Ollama and returning the results.
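If you prefer code over curl, the switch looks like this in Go. This is a minimal sketch only; it assumes middLLMware is listening on port 8080 and forwarding Ollama's /api/generate endpoint unchanged, as shown in the quick start further down.

package main

import (
    "bytes"
    "fmt"
    "io"
    "net/http"
)

func main() {
    // Same payload as the curl example in the quick start.
    body := []byte(`{"model": "llama2", "prompt": "Hello"}`)

    // Before: http://localhost:11434/api/generate (Ollama directly)
    // After:  http://localhost:8080/api/generate  (through middLLMware)
    resp, err := http.Post("http://localhost:8080/api/generate",
        "application/json", bytes.NewReader(body))
    if err != nil {
        panic(err)
    }
    defer resp.Body.Close()

    // Ollama streams NDJSON by default; this sketch just dumps the raw body.
    out, _ := io.ReadAll(resp.Body)
    fmt.Println(string(out))
}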
- System Prompts: Add consistent system prompts to all requests without modifying client code
- Auto-Translation: Automatically translate requests to English for better model performance, then translate responses back
- Request Logging: Log all interactions for debugging or analytics
- Content Filtering: Filter or modify sensitive content in requests/responses
- Model Routing: Route different request types to different models automatically
- Token Management: Track and limit token usage across your organization
- Caching: Cache common requests to reduce latency and API calls
// Without middLLMware
userRequest := "How do I center a div?"
// Ollama receives exactly this
// With middLLMware + SystemPromptPlugin
userRequest := "How do I center a div?"
// Ollama receives:
// "You are a senior web developer. Always provide modern best practices.
// User question: How do I center a div?"

// User sends request in Russian
userRequest := "Как центрировать div?"
// middLLMware plugin translates to English
translatedRequest := "How do I center a div?"
// Ollama processes in English (better quality)
ollamaResponse := "You can center a div using flexbox..."
// middLLMware translates response back to Russian
finalResponse := "Вы можете центрировать div используя flexbox..."// Every request automatically logged
[2024-11-17 10:30:45] User: alice@example.com
Request: "Explain quantum computing"
Model: llama2
Tokens: 1250
Response Time: 2.3s

git clone https://github.com/yourusername/middLLMware.git
cd middLLMware
go build -o middLLMware ./cmd/api

- Start Ollama on its default port (11434)
- Start middLLMware:
./middLLMware --port 8080 --ollama-url http://localhost:11434

- Point your application to middLLMware instead of Ollama:
# Before
curl http://localhost:11434/api/generate -d '{"model": "llama2", "prompt": "Hello"}'
# After
curl http://localhost:8080/api/generate -d '{"model": "llama2", "prompt": "Hello"}'

middLLMware uses a simple plugin interface that allows you to process requests and responses:
type Plugin interface {
    // Called before sending request to Ollama
    ProcessRequest(req *Request) (*Request, error)

    // Called after receiving response from Ollama
    ProcessResponse(resp *Response) (*Response, error)
}

Create your own plugins by implementing this interface and registering them with the middleware.
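As a rough sketch, a plugin that prepends a fixed system prompt could look like the following. The Prompt field on Request is an assumption for illustration; check the actual Request/Response definitions, and register the plugin through whatever registration mechanism the middleware exposes.

type SystemPromptPlugin struct {
    SystemPrompt string
}

// ProcessRequest prepends the configured system prompt before the request
// is forwarded to Ollama. (req.Prompt is an assumed field name.)
func (p *SystemPromptPlugin) ProcessRequest(req *Request) (*Request, error) {
    req.Prompt = p.SystemPrompt + "\n\nUser question: " + req.Prompt
    return req, nil
}

// ProcessResponse passes the response through unchanged.
func (p *SystemPromptPlugin) ProcessResponse(resp *Response) (*Response, error) {
    return resp, nil
}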
# config.yaml
server:
  port: 8080

ollama:
  url: http://localhost:11434
  timeout: 30s

plugins:
  - name: system-prompt
    enabled: true
    config:
      prompt: "You are a helpful assistant."

  - name: translator
    enabled: true
    config:
      source_lang: "ru"
      target_lang: "en"
      translate_back: true

┌─────────────┐
│   Client    │
└──────┬──────┘
       │
       ▼
┌─────────────────────────┐
│   middLLMware Server    │
│  ┌──────────────────┐   │
│  │  Request Chain   │   │
│  │  Plugin 1 → 2 → 3│   │
│  └──────────────────┘   │
│           │             │
│           ▼             │
│  ┌──────────────────┐   │
│  │  Ollama Client   │   │
│  └──────────────────┘   │
│           │             │
│           ▼             │
│  ┌──────────────────┐   │
│  │  Response Chain  │   │
│  │  Plugin 3 → 2 → 1│   │
│  └──────────────────┘   │
└────────┬────────────────┘
         │
         ▼
  ┌─────────────┐
  │   Ollama    │
  └─────────────┘
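In code terms, the request chain runs the plugins in order on the way in, and the response chain runs the same plugins in reverse on the way out. The sketch below is illustrative only; the actual function and type names inside middLLMware may differ.

// Illustrative only: how the request/response chains in the diagram execute.
func runChain(plugins []Plugin, req *Request, callOllama func(*Request) (*Response, error)) (*Response, error) {
    var err error

    // Request chain: Plugin 1 → 2 → 3
    for _, p := range plugins {
        if req, err = p.ProcessRequest(req); err != nil {
            return nil, err
        }
    }

    // Forward the (possibly modified) request to Ollama.
    resp, err := callOllama(req)
    if err != nil {
        return nil, err
    }

    // Response chain: Plugin 3 → 2 → 1
    for i := len(plugins) - 1; i >= 0; i-- {
        if resp, err = plugins[i].ProcessResponse(resp); err != nil {
            return nil, err
        }
    }
    return resp, nil
}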
Contributions are welcome! Feel free to:
- Add new plugins
- Improve documentation
- Report bugs
- Suggest features
MIT License - see LICENSE file for details