37bytes/llm-service

LLM Service

Standalone LLM inference service built on Ollama and LiteLLM Proxy.

Architecture

┌─────────────────────────────────────────────┐
│              LLM Service                    │
├─────────────────────────────────────────────┤
│  litellm (port 4000)                        │
│    └── OpenAI-compatible API                │
│    └── API key authorization                │
│                 │                           │
│                 ▼                           │
│  ollama (internal)                          │
│    └── qwen2.5:7b (or other models)         │
└─────────────────────────────────────────────┘
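The mapping from the public model name (qwen2.5-7b) to the Ollama backend is defined in the LiteLLM Proxy config. A minimal sketch of what such a config typically looks like (the file name, internal hostname, and settings are assumptions, not the repository's actual file):

```yaml
model_list:
  - model_name: qwen2.5-7b            # name clients send in API requests
    litellm_params:
      model: ollama/qwen2.5:7b        # Ollama model behind the proxy
      api_base: http://ollama:11434   # internal Ollama endpoint

general_settings:
  master_key: os.environ/LITELLM_MASTER_KEY  # master key used for /key/generate
```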

API

  • Endpoint: http://localhost:4000/v1/chat/completions
  • Authorization: Authorization: Bearer <api_key> header

Example request

curl http://localhost:4000/v1/chat/completions \
  -H "Authorization: Bearer sk-your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen2.5-7b",
    "messages": [{"role": "user", "content": "Hello!"}],
    "temperature": 0.7
  }'
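The same request can be made from Python using only the standard library. A minimal sketch; the endpoint, API key, and model name are the placeholders from the curl example above:

```python
import json
import urllib.request

PROXY_URL = "http://localhost:4000/v1/chat/completions"  # LiteLLM endpoint
API_KEY = "sk-your-api-key"                              # placeholder key

def build_chat_request(model: str, prompt: str, temperature: float = 0.7) -> dict:
    """Build the JSON body expected by the OpenAI-compatible endpoint."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
    }

def chat(prompt: str, model: str = "qwen2.5-7b") -> str:
    """Send one user message and return the assistant's reply text."""
    req = urllib.request.Request(
        PROXY_URL,
        data=json.dumps(build_chat_request(model, prompt)).encode(),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

With the service running, `print(chat("Hello!"))` prints the model's answer.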

Creating API keys

# Requires the master key
curl -X POST "http://localhost:4000/key/generate" \
  -H "Authorization: Bearer sk-master-key" \
  -H "Content-Type: application/json" \
  -d '{"key_alias": "my-client"}'

Running locally

cp .env.example .env
# Edit .env and set the required variables

docker compose up -d
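The compose file wires the two services from the architecture diagram together. A sketch of what such a setup typically looks like (image tags, file paths, and volume names are assumptions, not the repository's actual file):

```yaml
services:
  ollama:
    image: ollama/ollama
    volumes:
      - ollama_data:/root/.ollama      # persist downloaded models
    # no published ports: only litellm reaches it on the internal network

  litellm:
    image: ghcr.io/berriai/litellm:main-latest
    ports:
      - "4000:4000"                    # the only externally exposed port
    environment:
      LITELLM_MASTER_KEY: ${LITELLM_MASTER_KEY}
    volumes:
      - ./litellm-config.yaml:/app/config.yaml
    command: ["--config", "/app/config.yaml"]
    depends_on:
      - ollama

volumes:
  ollama_data:
```

Keeping Ollama unpublished means all clients must go through LiteLLM's API-key check.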

Deployment with Ansible

cd ansible
ansible-vault edit inventory/group_vars/vault.yml  # Set vault_litellm_master_key
ansible-playbook playbooks/deploy.yml -i inventory/hosts --ask-vault-pass

Health check

curl http://localhost:4000/health
