An interactive, multi-backend LLM runtime with intelligent KV-cache eviction and persistent retrieval-augmented memory.
Topics: cli, azure, mcp, gemini, openai, keyvault, tensorrt, smcp, rag, managed-identity, kv-cache, llm, anthropic, llama-cpp, ollama, tool-calling, smart-evictions, openai-server, unlimited-context
Updated Jan 31, 2026 · C++
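The `kv-cache`, `smart-evictions`, and `unlimited-context` topics suggest a scored eviction policy over cached attention state. Below is a minimal C++ sketch of one such policy, assuming a block-structured KV cache where each block tracks how much attention it received and when it was last touched; `KvBlock`, `eviction_score`, and `evict` are hypothetical names for illustration, not this project's actual API.

```cpp
// Hypothetical sketch of score-based KV-cache eviction.
// Names and fields are illustrative, not the project's real interface.
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <iostream>
#include <vector>

struct KvBlock {
    std::uint64_t id;             // block identifier
    double attention_weight;      // cumulative attention mass received
    std::uint64_t last_used_step; // decode step of last access
};

// Lower score == better eviction candidate: rarely attended and stale.
static double eviction_score(const KvBlock& b, std::uint64_t now) {
    double staleness = static_cast<double>(now - b.last_used_step);
    return b.attention_weight / (1.0 + staleness);
}

// Evict the lowest-scoring blocks until the cache fits `capacity` blocks;
// returns the ids of the evicted blocks.
std::vector<std::uint64_t> evict(std::vector<KvBlock>& cache,
                                 std::size_t capacity, std::uint64_t now) {
    std::vector<std::uint64_t> evicted;
    if (cache.size() <= capacity) return evicted;
    std::sort(cache.begin(), cache.end(),
              [now](const KvBlock& a, const KvBlock& b) {
                  return eviction_score(a, now) < eviction_score(b, now);
              });
    std::size_t n = cache.size() - capacity;
    for (std::size_t i = 0; i < n; ++i) evicted.push_back(cache[i].id);
    cache.erase(cache.begin(), cache.begin() + static_cast<std::ptrdiff_t>(n));
    return evicted;
}

int main() {
    std::vector<KvBlock> cache = {
        {1, 0.90, 100}, {2, 0.10, 10}, {3, 0.50, 90}, {4, 0.05, 95},
    };
    for (std::uint64_t id : evict(cache, /*capacity=*/2, /*now=*/101))
        std::cout << "evicted block " << id << "\n";
}
```

Scoring by attention mass discounted by staleness keeps heavily attended context resident while reclaiming blocks the model has stopped looking at, which is one common way a runtime can bound KV-cache growth while advertising effectively unlimited context.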