Open Model Engine (OME): a Kubernetes operator for LLM serving, GPU scheduling, and model lifecycle management. Works with SGLang, vLLM, TensorRT-LLM, and Triton.
k8s llama oracle-cloud model-serving model-as-a-service multi-node-kubernetes llm vllm llm-inference qwen deepseek sglang kimi-k2 pd-disaggregation
Updated Jan 26, 2026 - Go