diff --git a/Makefile b/Makefile
index 9c9e86990..71fba51a1 100644
--- a/Makefile
+++ b/Makefile
@@ -42,16 +42,13 @@ schema:	## Generate OpenAPI schema file
 	uv run scripts/generate_openapi_schema.py docs/openapi.json
 
 openapi-doc:	docs/openapi.json scripts/fix_openapi_doc.py	## Generate OpenAPI documentation
-	openapi-to-markdown --input_file docs/openapi.json --output_file output.md
-	python3 scripts/fix_openapi_doc.py <  output.md > docs/output.md
+	uv run openapi-to-markdown --input_file docs/openapi.json --output_file output.md
+	uv run python3 scripts/fix_openapi_doc.py < output.md > docs/openapi.md
 	rm output.md
 
 generate-documentation:	## Generate documentation
 	scripts/gen_doc.py
 
-openapi-doc:	docs/openapi.json	## Generate OpenAPI documentation
-	openapi-to-markdown --input_file docs/openapi.json --output_file docs/output.md
-
 # TODO uv migration
 requirements.txt:	pyproject.toml pdm.lock ## Generate requirements.txt file containing hashes for all non-devel packages
 	pdm export --prod --format requirements --output requirements.txt --no-extras --without evaluation
diff --git a/docs/conversations_api.md b/docs/conversations_api.md
index e7496be16..df09ac4cf 100644
--- a/docs/conversations_api.md
+++ b/docs/conversations_api.md
@@ -21,10 +21,10 @@ This document explains how the Conversations API works with the Responses API in
    * [Continuing Existing Conversations](#continuing-existing-conversations)
    * [Conversation Storage](#conversation-storage)
 * [API Endpoints](#api-endpoints)
-   * [Query Endpoint (v2)](#query-endpoint-v2)
-   * [Streaming Query Endpoint (v2)](#streaming-query-endpoint-v2)
-   * [Conversations List Endpoint (v3)](#conversations-list-endpoint-v3)
-   * [Conversation Detail Endpoint (v3)](#conversation-detail-endpoint-v3)
+   * [Query Endpoint (v1)](#query-endpoint-v1)
+   * [Streaming Query Endpoint (v1)](#streaming-query-endpoint-v1)
+   * [Conversations List Endpoint (v1)](#conversations-list-endpoint-v1)
+   * [Conversation Detail Endpoint (v1)](#conversation-detail-endpoint-v1)
 * [Testing with curl](#testing-with-curl)
 * [Database Schema](#database-schema)
 * [Troubleshooting](#troubleshooting)
@@ -191,9 +191,9 @@ Stores user-specific metadata:
 
 ## API Endpoints
 
-### Query Endpoint (v2)
+### Query Endpoint (v1)
 
-**Endpoint:** `POST /v2/query`
+**Endpoint:** `POST /v1/query`
 
 **Request:**
 ```json
@@ -223,11 +223,11 @@ Stores user-specific metadata:
 > [!NOTE]
 > If `conversation_id` is omitted, a new conversation is automatically created and the new ID is returned in the response.
 
-### Streaming Query Endpoint (v2)
+### Streaming Query Endpoint (v1)
 
-**Endpoint:** `POST /v2/streaming_query`
+**Endpoint:** `POST /v1/streaming_query`
 
-**Request:** Same as `/v2/query`
+**Request:** Same as `/v1/query`
 
 **Response:** Server-Sent Events (SSE) stream
 
@@ -243,9 +243,9 @@ data: {"event": "turn_complete", "data": {"id": 10, "token": "The OpenShift Assi
 data: {"event": "end", "data": {"referenced_documents": [], "input_tokens": 150, "output_tokens": 200}}
 ```
 
-### Conversations List Endpoint (v3)
+### Conversations List Endpoint (v1)
 
-**Endpoint:** `GET /v3/conversations`
+**Endpoint:** `GET /v1/conversations`
 
 **Response:**
 ```json
@@ -264,9 +264,9 @@ data: {"event": "end", "data": {"referenced_documents": [], "input_tokens": 150,
 }
 ```
 
-### Conversation Detail Endpoint (v3)
+### Conversation Detail Endpoint (v1)
 
-**Endpoint:** `GET /v3/conversations/{conversation_id}`
+**Endpoint:** `GET /v1/conversations/{conversation_id}`
 
 **Response:**
 ```json
@@ -308,7 +308,7 @@ export TOKEN="<your-token>"
 To start a new conversation, omit the `conversation_id` field:
 
 ```bash
-curl -X POST http://localhost:8090/v2/query \
+curl -X POST http://localhost:8090/v1/query \
   -H "Content-Type: application/json" \
   -H "Authorization: Bearer $TOKEN" \
   -d '{
@@ -338,7 +338,7 @@ curl -X POST http://localhost:8090/v2/query \
 To continue an existing conversation, include the `conversation_id` from a previous response:
 
 ```bash
-curl -X POST http://localhost:8090/v2/query \
+curl -X POST http://localhost:8090/v1/query \
   -H "Content-Type: application/json" \
   -H "Authorization: Bearer $TOKEN" \
   -d '{
@@ -351,10 +351,10 @@ curl -X POST http://localhost:8090/v2/query \
 
 ### Streaming Query (New Conversation)
 
-For streaming responses, use the `/v2/streaming_query` endpoint. The response is returned as Server-Sent Events (SSE):
+For streaming responses, use the `/v1/streaming_query` endpoint. The response is returned as Server-Sent Events (SSE):
 
 ```bash
-curl -X POST http://localhost:8090/v2/streaming_query \
+curl -X POST http://localhost:8090/v1/streaming_query \
   -H "Content-Type: application/json" \
   -H "Authorization: Bearer $TOKEN" \
   -H "Accept: text/event-stream" \
@@ -381,7 +381,7 @@ data: {"event": "end", "data": {"referenced_documents": [], "input_tokens": 150,
 ### Streaming Query (Continue Conversation)
 
 ```bash
-curl -X POST http://localhost:8090/v2/streaming_query \
+curl -X POST http://localhost:8090/v1/streaming_query \
   -H "Content-Type: application/json" \
   -H "Authorization: Bearer $TOKEN" \
   -H "Accept: text/event-stream" \
@@ -396,7 +396,7 @@ curl -X POST http://localhost:8090/v2/streaming_query \
 ### List Conversations
 
 ```bash
-curl -X GET http://localhost:8090/v3/conversations \
+curl -X GET http://localhost:8090/v1/conversations \
   -H "Content-Type: application/json" \
   -H "Authorization: Bearer $TOKEN"
 ```
@@ -404,7 +404,7 @@ curl -X GET http://localhost:8090/v3/conversations \
 ### Get Conversation Details
 
 ```bash
-curl -X GET http://localhost:8090/v3/conversations/0d21ba731f21f798dc9680125d5d6f493e4a7ab79f25670e \
+curl -X GET http://localhost:8090/v1/conversations/0d21ba731f21f798dc9680125d5d6f493e4a7ab79f25670e \
   -H "Content-Type: application/json" \
   -H "Authorization: Bearer $TOKEN"
 ```
@@ -492,7 +492,7 @@ This is expected behavior. The Responses API v2 allows you to change the model/p
 ### Empty Conversation History
 
 **Symptom:**
-Calling `/v3/conversations/{conversation_id}` returns empty `chat_history`.
+Calling `/v1/conversations/{conversation_id}` returns empty `chat_history`.
 
 **Possible Causes:**
 1. The conversation was just created and has no messages yet
diff --git a/docs/openapi.json b/docs/openapi.json
index 9f2e9c83f..34ff97841 100644
--- a/docs/openapi.json
+++ b/docs/openapi.json
@@ -2941,6 +2941,594 @@
                 }
             }
         },
+        "/v2/query": {
+            "post": {
+                "tags": [
+                    "query_v1"
+                ],
+                "summary": "Query Endpoint Handler V1",
+                "description": "Handle request to the /query endpoint using Responses API.\n\nThis is a wrapper around query_endpoint_handler_base that provides\nthe Responses API specific retrieve_response and get_topic_summary functions.\n\nReturns:\n    QueryResponse: Contains the conversation ID and the LLM-generated response.",
+                "operationId": "query_endpoint_handler_v2_v2_query_post",
+                "requestBody": {
+                    "content": {
+                        "application/json": {
+                            "schema": {
+                                "$ref": "#/components/schemas/QueryRequest"
+                            }
+                        }
+                    },
+                    "required": true
+                },
+                "responses": {
+                    "200": {
+                        "description": "Successful response",
+                        "content": {
+                            "application/json": {
+                                "schema": {
+                                    "$ref": "#/components/schemas/QueryResponse"
+                                },
+                                "example": {
+                                    "available_quotas": {
+                                        "ClusterQuotaLimiter": 998911,
+                                        "UserQuotaLimiter": 998911
+                                    },
+                                    "conversation_id": "123e4567-e89b-12d3-a456-426614174000",
+                                    "input_tokens": 123,
+                                    "output_tokens": 456,
+                                    "referenced_documents": [
+                                        {
+                                            "doc_title": "Operator Lifecycle Manager concepts and resources",
+                                            "doc_url": "https://docs.openshift.com/container-platform/4.15/operators/understanding/olm/olm-understanding-olm.html"
+                                        }
+                                    ],
+                                    "response": "Operator Lifecycle Manager (OLM) helps users install...",
+                                    "tool_calls": [
+                                        {
+                                            "args": {},
+                                            "id": "1",
+                                            "name": "tool1",
+                                            "type": "tool_call"
+                                        }
+                                    ],
+                                    "tool_results": [
+                                        {
+                                            "content": "bla",
+                                            "id": "1",
+                                            "round": 1,
+                                            "status": "success",
+                                            "type": "tool_result"
+                                        }
+                                    ],
+                                    "truncated": false
+                                }
+                            }
+                        }
+                    },
+                    "401": {
+                        "description": "Unauthorized",
+                        "content": {
+                            "application/json": {
+                                "schema": {
+                                    "$ref": "#/components/schemas/UnauthorizedResponse"
+                                },
+                                "examples": {
+                                    "missing header": {
+                                        "value": {
+                                            "detail": {
+                                                "cause": "No Authorization header found",
+                                                "response": "Missing or invalid credentials provided by client"
+                                            }
+                                        }
+                                    },
+                                    "missing token": {
+                                        "value": {
+                                            "detail": {
+                                                "cause": "No token found in Authorization header",
+                                                "response": "Missing or invalid credentials provided by client"
+                                            }
+                                        }
+                                    }
+                                }
+                            }
+                        }
+                    },
+                    "403": {
+                        "description": "Permission denied",
+                        "content": {
+                            "application/json": {
+                                "schema": {
+                                    "$ref": "#/components/schemas/ForbiddenResponse"
+                                },
+                                "examples": {
+                                    "conversation read": {
+                                        "value": {
+                                            "detail": {
+                                                "cause": "User 6789 does not have permission to read conversation with ID 123e4567-e89b-12d3-a456-426614174000",
+                                                "response": "User does not have permission to perform this action"
+                                            }
+                                        }
+                                    },
+                                    "endpoint": {
+                                        "value": {
+                                            "detail": {
+                                                "cause": "User 6789 is not authorized to access this endpoint.",
+                                                "response": "User does not have permission to access this endpoint"
+                                            }
+                                        }
+                                    },
+                                    "model override": {
+                                        "value": {
+                                            "detail": {
+                                                "cause": "User lacks model_override permission required to override model/provider.",
+                                                "response": "This instance does not permit overriding model/provider in the query request (missing permission: MODEL_OVERRIDE). Please remove the model and provider fields from your request."
+                                            }
+                                        }
+                                    }
+                                }
+                            }
+                        }
+                    },
+                    "404": {
+                        "description": "Resource not found",
+                        "content": {
+                            "application/json": {
+                                "schema": {
+                                    "$ref": "#/components/schemas/NotFoundResponse"
+                                },
+                                "examples": {
+                                    "conversation": {
+                                        "value": {
+                                            "detail": {
+                                                "cause": "Conversation with ID 123e4567-e89b-12d3-a456-426614174000 does not exist",
+                                                "response": "Conversation not found"
+                                            }
+                                        }
+                                    },
+                                    "provider": {
+                                        "value": {
+                                            "detail": {
+                                                "cause": "Provider with ID openai does not exist",
+                                                "response": "Provider not found"
+                                            }
+                                        }
+                                    },
+                                    "model": {
+                                        "value": {
+                                            "detail": {
+                                                "cause": "Model with ID gpt-4-turbo is not configured",
+                                                "response": "Model not found"
+                                            }
+                                        }
+                                    }
+                                }
+                            }
+                        }
+                    },
+                    "422": {
+                        "description": "Request validation failed",
+                        "content": {
+                            "application/json": {
+                                "schema": {
+                                    "$ref": "#/components/schemas/UnprocessableEntityResponse"
+                                },
+                                "examples": {
+                                    "invalid format": {
+                                        "value": {
+                                            "detail": {
+                                                "cause": "Invalid request format. The request body could not be parsed.",
+                                                "response": "Invalid request format"
+                                            }
+                                        }
+                                    },
+                                    "missing attributes": {
+                                        "value": {
+                                            "detail": {
+                                                "cause": "Missing required attributes: ['query', 'model', 'provider']",
+                                                "response": "Missing required attributes"
+                                            }
+                                        }
+                                    },
+                                    "invalid value": {
+                                        "value": {
+                                            "detail": {
+                                                "cause": "Invalid attatchment type: must be one of ['text/plain', 'application/json', 'application/yaml', 'application/xml']",
+                                                "response": "Invalid attribute value"
+                                            }
+                                        }
+                                    }
+                                }
+                            }
+                        }
+                    },
+                    "429": {
+                        "description": "Quota limit exceeded",
+                        "content": {
+                            "application/json": {
+                                "schema": {
+                                    "$ref": "#/components/schemas/QuotaExceededResponse"
+                                },
+                                "examples": {
+                                    "model": {
+                                        "value": {
+                                            "detail": {
+                                                "cause": "The token quota for model gpt-4-turbo has been exceeded.",
+                                                "response": "The model quota has been exceeded"
+                                            }
+                                        }
+                                    },
+                                    "user none": {
+                                        "value": {
+                                            "detail": {
+                                                "cause": "User 123 has no available tokens.",
+                                                "response": "The quota has been exceeded"
+                                            }
+                                        }
+                                    },
+                                    "cluster none": {
+                                        "value": {
+                                            "detail": {
+                                                "cause": "Cluster has no available tokens.",
+                                                "response": "The quota has been exceeded"
+                                            }
+                                        }
+                                    },
+                                    "subject none": {
+                                        "value": {
+                                            "detail": {
+                                                "cause": "Unknown subject 999 has no available tokens.",
+                                                "response": "The quota has been exceeded"
+                                            }
+                                        }
+                                    },
+                                    "user insufficient": {
+                                        "value": {
+                                            "detail": {
+                                                "cause": "User 123 has 5 tokens, but 10 tokens are needed.",
+                                                "response": "The quota has been exceeded"
+                                            }
+                                        }
+                                    },
+                                    "cluster insufficient": {
+                                        "value": {
+                                            "detail": {
+                                                "cause": "Cluster has 500 tokens, but 900 tokens are needed.",
+                                                "response": "The quota has been exceeded"
+                                            }
+                                        }
+                                    },
+                                    "subject insufficient": {
+                                        "value": {
+                                            "detail": {
+                                                "cause": "Unknown subject 999 has 3 tokens, but 6 tokens are needed.",
+                                                "response": "The quota has been exceeded"
+                                            }
+                                        }
+                                    }
+                                }
+                            }
+                        }
+                    },
+                    "500": {
+                        "description": "Internal server error",
+                        "content": {
+                            "application/json": {
+                                "schema": {
+                                    "$ref": "#/components/schemas/InternalServerErrorResponse"
+                                },
+                                "examples": {
+                                    "configuration": {
+                                        "value": {
+                                            "detail": {
+                                                "cause": "Lightspeed Stack configuration has not been initialized.",
+                                                "response": "Configuration is not loaded"
+                                            }
+                                        }
+                                    }
+                                }
+                            }
+                        }
+                    },
+                    "503": {
+                        "description": "Service unavailable",
+                        "content": {
+                            "application/json": {
+                                "schema": {
+                                    "$ref": "#/components/schemas/ServiceUnavailableResponse"
+                                },
+                                "examples": {
+                                    "llama stack": {
+                                        "value": {
+                                            "detail": {
+                                                "cause": "Connection error while trying to reach backend service.",
+                                                "response": "Unable to connect to Llama Stack"
+                                            }
+                                        }
+                                    }
+                                }
+                            }
+                        }
+                    }
+                }
+            }
+        },
+        "/v2/streaming_query": {
+            "post": {
+                "tags": [
+                    "streaming_query_v1"
+                ],
+                "summary": "Streaming Query Endpoint Handler V1",
+                "description": "Handle request to the /streaming_query endpoint using Responses API.\n\nReturns a streaming response using Server-Sent Events (SSE) format with\ncontent type text/event-stream.\n\nReturns:\n    StreamingResponse: An HTTP streaming response yielding\n    SSE-formatted events for the query lifecycle with content type\n    text/event-stream.\n\nRaises:\n    HTTPException:\n        - 401: Unauthorized - Missing or invalid credentials\n        - 403: Forbidden - Insufficient permissions or model override not allowed\n        - 404: Not Found - Conversation, model, or provider not found\n        - 422: Unprocessable Entity - Request validation failed\n        - 429: Too Many Requests - Quota limit exceeded\n        - 500: Internal Server Error - Configuration not loaded or other server errors\n        - 503: Service Unavailable - Unable to connect to Llama Stack backend",
+                "operationId": "streaming_query_endpoint_handler_v2_v2_streaming_query_post",
+                "requestBody": {
+                    "content": {
+                        "application/json": {
+                            "schema": {
+                                "$ref": "#/components/schemas/QueryRequest"
+                            }
+                        }
+                    },
+                    "required": true
+                },
+                "responses": {
+                    "200": {
+                        "description": "Successful response",
+                        "content": {
+                            "text/event-stream": {
+                                "schema": {
+                                    "type": "string",
+                                    "format": "text/event-stream"
+                                },
+                                "example": "data: {\"event\": \"start\", \"data\": {\"conversation_id\": \"123e4567-e89b-12d3-a456-426614174000\"}}\n\ndata: {\"event\": \"token\", \"data\": {\"id\": 0, \"token\": \"No Violation\"}}\n\ndata: {\"event\": \"token\", \"data\": {\"id\": 1, \"token\": \"\"}}\n\ndata: {\"event\": \"token\", \"data\": {\"id\": 2, \"token\": \"Hello\"}}\n\ndata: {\"event\": \"token\", \"data\": {\"id\": 3, \"token\": \"!\"}}\n\ndata: {\"event\": \"token\", \"data\": {\"id\": 4, \"token\": \" How\"}}\n\ndata: {\"event\": \"token\", \"data\": {\"id\": 5, \"token\": \" can\"}}\n\ndata: {\"event\": \"token\", \"data\": {\"id\": 6, \"token\": \" I\"}}\n\ndata: {\"event\": \"token\", \"data\": {\"id\": 7, \"token\": \" assist\"}}\n\ndata: {\"event\": \"token\", \"data\": {\"id\": 8, \"token\": \" you\"}}\n\ndata: {\"event\": \"token\", \"data\": {\"id\": 9, \"token\": \" today\"}}\n\ndata: {\"event\": \"token\", \"data\": {\"id\": 10, \"token\": \"?\"}}\n\ndata: {\"event\": \"turn_complete\", \"data\": {\"token\": \"Hello! How can I assist you today?\"}}\n\ndata: {\"event\": \"end\", \"data\": {\"referenced_documents\": [], \"truncated\": null, \"input_tokens\": 11, \"output_tokens\": 19}, \"available_quotas\": {}}\n\n"
+                            }
+                        }
+                    },
+                    "401": {
+                        "description": "Unauthorized",
+                        "content": {
+                            "application/json": {
+                                "schema": {
+                                    "$ref": "#/components/schemas/UnauthorizedResponse"
+                                },
+                                "examples": {
+                                    "missing header": {
+                                        "value": {
+                                            "detail": {
+                                                "cause": "No Authorization header found",
+                                                "response": "Missing or invalid credentials provided by client"
+                                            }
+                                        }
+                                    },
+                                    "missing token": {
+                                        "value": {
+                                            "detail": {
+                                                "cause": "No token found in Authorization header",
+                                                "response": "Missing or invalid credentials provided by client"
+                                            }
+                                        }
+                                    }
+                                }
+                            }
+                        }
+                    },
+                    "403": {
+                        "description": "Permission denied",
+                        "content": {
+                            "application/json": {
+                                "schema": {
+                                    "$ref": "#/components/schemas/ForbiddenResponse"
+                                },
+                                "examples": {
+                                    "conversation read": {
+                                        "value": {
+                                            "detail": {
+                                                "cause": "User 6789 does not have permission to read conversation with ID 123e4567-e89b-12d3-a456-426614174000",
+                                                "response": "User does not have permission to perform this action"
+                                            }
+                                        }
+                                    },
+                                    "endpoint": {
+                                        "value": {
+                                            "detail": {
+                                                "cause": "User 6789 is not authorized to access this endpoint.",
+                                                "response": "User does not have permission to access this endpoint"
+                                            }
+                                        }
+                                    },
+                                    "model override": {
+                                        "value": {
+                                            "detail": {
+                                                "cause": "User lacks model_override permission required to override model/provider.",
+                                                "response": "This instance does not permit overriding model/provider in the query request (missing permission: MODEL_OVERRIDE). Please remove the model and provider fields from your request."
+                                            }
+                                        }
+                                    }
+                                }
+                            }
+                        }
+                    },
+                    "404": {
+                        "description": "Resource not found",
+                        "content": {
+                            "application/json": {
+                                "schema": {
+                                    "$ref": "#/components/schemas/NotFoundResponse"
+                                },
+                                "examples": {
+                                    "conversation": {
+                                        "value": {
+                                            "detail": {
+                                                "cause": "Conversation with ID 123e4567-e89b-12d3-a456-426614174000 does not exist",
+                                                "response": "Conversation not found"
+                                            }
+                                        }
+                                    },
+                                    "provider": {
+                                        "value": {
+                                            "detail": {
+                                                "cause": "Provider with ID openai does not exist",
+                                                "response": "Provider not found"
+                                            }
+                                        }
+                                    },
+                                    "model": {
+                                        "value": {
+                                            "detail": {
+                                                "cause": "Model with ID gpt-4-turbo is not configured",
+                                                "response": "Model not found"
+                                            }
+                                        }
+                                    }
+                                }
+                            }
+                        }
+                    },
+                    "422": {
+                        "description": "Request validation failed",
+                        "content": {
+                            "application/json": {
+                                "schema": {
+                                    "$ref": "#/components/schemas/UnprocessableEntityResponse"
+                                },
+                                "examples": {
+                                    "invalid format": {
+                                        "value": {
+                                            "detail": {
+                                                "cause": "Invalid request format. The request body could not be parsed.",
+                                                "response": "Invalid request format"
+                                            }
+                                        }
+                                    },
+                                    "missing attributes": {
+                                        "value": {
+                                            "detail": {
+                                                "cause": "Missing required attributes: ['query', 'model', 'provider']",
+                                                "response": "Missing required attributes"
+                                            }
+                                        }
+                                    },
+                                    "invalid value": {
+                                        "value": {
+                                            "detail": {
+                                                "cause": "Invalid attatchment type: must be one of ['text/plain', 'application/json', 'application/yaml', 'application/xml']",
+                                                "response": "Invalid attribute value"
+                                            }
+                                        }
+                                    }
+                                }
+                            }
+                        }
+                    },
+                    "429": {
+                        "description": "Quota limit exceeded",
+                        "content": {
+                            "application/json": {
+                                "schema": {
+                                    "$ref": "#/components/schemas/QuotaExceededResponse"
+                                },
+                                "examples": {
+                                    "model": {
+                                        "value": {
+                                            "detail": {
+                                                "cause": "The token quota for model gpt-4-turbo has been exceeded.",
+                                                "response": "The model quota has been exceeded"
+                                            }
+                                        }
+                                    },
+                                    "user none": {
+                                        "value": {
+                                            "detail": {
+                                                "cause": "User 123 has no available tokens.",
+                                                "response": "The quota has been exceeded"
+                                            }
+                                        }
+                                    },
+                                    "cluster none": {
+                                        "value": {
+                                            "detail": {
+                                                "cause": "Cluster has no available tokens.",
+                                                "response": "The quota has been exceeded"
+                                            }
+                                        }
+                                    },
+                                    "subject none": {
+                                        "value": {
+                                            "detail": {
+                                                "cause": "Unknown subject 999 has no available tokens.",
+                                                "response": "The quota has been exceeded"
+                                            }
+                                        }
+                                    },
+                                    "user insufficient": {
+                                        "value": {
+                                            "detail": {
+                                                "cause": "User 123 has 5 tokens, but 10 tokens are needed.",
+                                                "response": "The quota has been exceeded"
+                                            }
+                                        }
+                                    },
+                                    "cluster insufficient": {
+                                        "value": {
+                                            "detail": {
+                                                "cause": "Cluster has 500 tokens, but 900 tokens are needed.",
+                                                "response": "The quota has been exceeded"
+                                            }
+                                        }
+                                    },
+                                    "subject insufficient": {
+                                        "value": {
+                                            "detail": {
+                                                "cause": "Unknown subject 999 has 3 tokens, but 6 tokens are needed.",
+                                                "response": "The quota has been exceeded"
+                                            }
+                                        }
+                                    }
+                                }
+                            }
+                        }
+                    },
+                    "500": {
+                        "description": "Internal server error",
+                        "content": {
+                            "application/json": {
+                                "schema": {
+                                    "$ref": "#/components/schemas/InternalServerErrorResponse"
+                                },
+                                "examples": {
+                                    "configuration": {
+                                        "value": {
+                                            "detail": {
+                                                "cause": "Lightspeed Stack configuration has not been initialized.",
+                                                "response": "Configuration is not loaded"
+                                            }
+                                        }
+                                    }
+                                }
+                            }
+                        }
+                    },
+                    "503": {
+                        "description": "Service unavailable",
+                        "content": {
+                            "application/json": {
+                                "schema": {
+                                    "$ref": "#/components/schemas/ServiceUnavailableResponse"
+                                },
+                                "examples": {
+                                    "llama stack": {
+                                        "value": {
+                                            "detail": {
+                                                "cause": "Connection error while trying to reach backend service.",
+                                                "response": "Unable to connect to Llama Stack"
+                                            }
+                                        }
+                                    }
+                                }
+                            }
+                        }
+                    }
+                }
+            }
+        },
         "/v2/conversations": {
             "get": {
                 "tags": [
diff --git a/docs/openapi.md b/docs/openapi.md
index f32545e3a..edeb906c2 100644
--- a/docs/openapi.md
+++ b/docs/openapi.md
@@ -1020,12 +1020,12 @@ Examples
 | 422 | Validation Error | [HTTPValidationError](#httpvalidationerror) |
 ## POST `/v1/query`
 
-> **Query Endpoint Handler**
+> **Query Endpoint Handler V1**
 
-Handle request to the /query endpoint using Agent API.
+Handle request to the /query endpoint using Responses API.
 
 This is a wrapper around query_endpoint_handler_base that provides
-the Agent API specific retrieve_response and get_topic_summary functions.
+the Responses API specific retrieve_response and get_topic_summary functions.
 
 Returns:
     QueryResponse: Contains the conversation ID and the LLM-generated response.
@@ -1320,20 +1320,27 @@ Examples
  |
 ## POST `/v1/streaming_query`
 
-> **Streaming Query Endpoint Handler**
+> **Streaming Query Endpoint Handler V1**
 
-Handle request to the /streaming_query endpoint using Agent API.
+Handle request to the /streaming_query endpoint using Responses API.
 
-This is a wrapper around streaming_query_endpoint_handler_base that provides
-the Agent API specific retrieve_response and response generator functions.
+Returns a streaming response using Server-Sent Events (SSE) format with
+content type text/event-stream.
 
 Returns:
     StreamingResponse: An HTTP streaming response yielding
-    SSE-formatted events for the query lifecycle.
+    SSE-formatted events for the query lifecycle with content type
+    text/event-stream.
 
 Raises:
-    HTTPException: Returns HTTP 500 if unable to connect to the
-    Llama Stack server.
+    HTTPException:
+        - 401: Unauthorized - Missing or invalid credentials
+        - 403: Forbidden - Insufficient permissions or model override not allowed
+        - 404: Not Found - Conversation, model, or provider not found
+        - 422: Unprocessable Entity - Request validation failed
+        - 429: Too Many Requests - Quota limit exceeded
+        - 500: Internal Server Error - Configuration not loaded or other server errors
+        - 503: Service Unavailable - Unable to connect to Llama Stack backend
 
 
 
@@ -1347,7 +1354,7 @@ Raises:
 
 | Status Code | Description | Component |
 |-------------|-------------|-----------|
-| 200 | Streaming response (Server-Sent Events) | ...string |
+| 200 | Successful response | string |
 | 401 | Unauthorized | [UnauthorizedResponse](#unauthorizedresponse)
 
 Examples
@@ -1960,7 +1967,7 @@ Examples
 | 422 | Validation Error | [HTTPValidationError](#httpvalidationerror) |
 ## GET `/v1/conversations`
 
-> **Get Conversations List Endpoint Handler**
+> **Conversations List Endpoint Handler V1**
 
 Handle request to retrieve all conversations for the authenticated user.
 
@@ -2067,23 +2074,25 @@ Examples
  |
 ## GET `/v1/conversations/{conversation_id}`
 
-> **Get Conversation Endpoint Handler**
+> **Conversation Get Endpoint Handler V1**
 
-Handle request to retrieve a conversation by ID.
+Handle request to retrieve a conversation by ID using Conversations API.
 
-Retrieve a conversation's chat history by its ID. Then fetches
-the conversation session from the Llama Stack backend,
-simplifies the session data to essential chat history, and
-returns it in a structured response. Raises HTTP 400 for
-invalid IDs, 404 if not found, 503 if the backend is
-unavailable, and 500 for unexpected errors.
+Retrieve a conversation's chat history by its ID using the LlamaStack
+Conversations API. This endpoint fetches the conversation items from
+the backend, simplifies them to essential chat history, and returns
+them in a structured response. Raises HTTP 400 for invalid IDs, 404
+if not found, 503 if the backend is unavailable, and 500 for
+unexpected errors.
 
-Parameters:
-    conversation_id (str): Unique identifier of the conversation to retrieve.
+Args:
+    request: The FastAPI request object
+    conversation_id: Unique identifier of the conversation to retrieve
+    auth: Authentication tuple from dependency
 
 Returns:
     ConversationResponse: Structured response containing the conversation
-    ID and simplified chat history.
+    ID and simplified chat history
 
 
 
@@ -2240,17 +2249,22 @@ Examples
 | 422 | Validation Error | [HTTPValidationError](#httpvalidationerror) |
 ## DELETE `/v1/conversations/{conversation_id}`
 
-> **Delete Conversation Endpoint Handler**
+> **Conversation Delete Endpoint Handler V1**
 
-Handle request to delete a conversation by ID.
+Handle request to delete a conversation by ID using Conversations API.
 
 Validates the conversation ID format and attempts to delete the
-corresponding session from the Llama Stack backend. Raises HTTP
-errors for invalid IDs, not found conversations, connection
+conversation from the Llama Stack backend using the Conversations API.
+Raises HTTP errors for invalid IDs, not found conversations, connection
 issues, or unexpected failures.
 
+Args:
+    request: The FastAPI request object
+    conversation_id: Unique identifier of the conversation to delete
+    auth: Authentication tuple from dependency
+
 Returns:
-    ConversationDeleteResponse: Response indicating the result of the deletion operation.
+    ConversationDeleteResponse: Response indicating the result of the deletion operation
 
 
 
@@ -2365,23 +2379,6 @@ Examples
     "response": "User does not have permission to access this endpoint"
   }
 }
-```
- |
-| 404 | Resource not found | [NotFoundResponse](#notfoundresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Conversation with ID 123e4567-e89b-12d3-a456-426614174000 does not exist",
-    "response": "Conversation not found"
-  }
-}
 ```
  |
 | 500 | Internal server error | [InternalServerErrorResponse](#internalservererrorresponse)
@@ -2431,101 +2428,23 @@ Examples
 ```
  |
 | 422 | Validation Error | [HTTPValidationError](#httpvalidationerror) |
-## GET `/v2/conversations`
-
-> **Get Conversations List Endpoint Handler**
-
-Handle request to retrieve all conversations for the authenticated user.
-
-
-
-
-
-### ✅ Responses
-
-| Status Code | Description | Component |
-|-------------|-------------|-----------|
-| 200 | Successful response | [ConversationsListResponseV2](#conversationslistresponsev2) |
-| 401 | Unauthorized | [UnauthorizedResponse](#unauthorizedresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "No Authorization header found",
-    "response": "Missing or invalid credentials provided by client"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "No token found in Authorization header",
-    "response": "Missing or invalid credentials provided by client"
-  }
-}
-```
- |
-| 403 | Permission denied | [ForbiddenResponse](#forbiddenresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "User 6789 is not authorized to access this endpoint.",
-    "response": "User does not have permission to access this endpoint"
-  }
-}
-```
- |
-| 500 | Internal server error | [InternalServerErrorResponse](#internalservererrorresponse)
-
-Examples
-
-
+## PUT `/v1/conversations/{conversation_id}`
 
+> **Conversation Update Endpoint Handler V1**
 
+Handle request to update a conversation metadata using Conversations API.
 
-```json
-{
-  "detail": {
-    "cause": "Lightspeed Stack configuration has not been initialized.",
-    "response": "Configuration is not loaded"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Conversation cache is not configured or unavailable.",
-    "response": "Conversation cache not configured"
-  }
-}
-```
- |
-## GET `/v2/conversations/{conversation_id}`
+Updates the conversation metadata (including topic summary) in both the
+LlamaStack backend using the Conversations API and the local database.
 
-> **Get Conversation Endpoint Handler**
+Args:
+    request: The FastAPI request object
+    conversation_id: Unique identifier of the conversation to update
+    update_request: Request containing the topic summary to update
+    auth: Authentication tuple from dependency
 
-Handle request to retrieve a conversation by ID.
+Returns:
+    ConversationUpdateResponse: Response indicating the result of the update operation
 
 
 
@@ -2536,11 +2455,15 @@ Handle request to retrieve a conversation by ID.
 | conversation_id | string | True |  |
 
 
+### 📦 Request Body 
+
+[ConversationUpdateRequest](#conversationupdaterequest)
+
 ### ✅ Responses
 
 | Status Code | Description | Component |
 |-------------|-------------|-----------|
-| 200 | Successful response | [ConversationResponse](#conversationresponse) |
+| 200 | Successful response | [ConversationUpdateResponse](#conversationupdateresponse) |
 | 400 | Invalid request format | [BadRequestResponse](#badrequestresponse)
 
 Examples
@@ -2644,33 +2567,56 @@ Examples
 ```json
 {
   "detail": {
-    "cause": "Conversation cache is not configured or unavailable.",
-    "response": "Conversation cache not configured"
+    "cause": "Failed to query the database",
+    "response": "Database query failed"
+  }
+}
+```
+ |
+| 503 | Service unavailable | [ServiceUnavailableResponse](#serviceunavailableresponse)
+
+Examples
+
+
+
+
+
+```json
+{
+  "detail": {
+    "cause": "Connection error while trying to reach backend service.",
+    "response": "Unable to connect to Llama Stack"
   }
 }
 ```
  |
 | 422 | Validation Error | [HTTPValidationError](#httpvalidationerror) |
-## DELETE `/v2/conversations/{conversation_id}`
+## POST `/v2/query`
 
-> **Delete Conversation Endpoint Handler**
+> **Query Endpoint Handler V1**
 
-Handle request to delete a conversation by ID.
+Handle request to the /query endpoint using Responses API.
 
+This is a wrapper around query_endpoint_handler_base that provides
+the Responses API specific retrieve_response and get_topic_summary functions.
+
+Returns:
+    QueryResponse: Contains the conversation ID and the LLM-generated response.
 
 
-### 🔗 Parameters
 
-| Name | Type | Required | Description |
-|------|------|----------|-------------|
-| conversation_id | string | True |  |
 
 
+### 📦 Request Body 
+
+[QueryRequest](#queryrequest)
+
 ### ✅ Responses
 
 | Status Code | Description | Component |
 |-------------|-------------|-----------|
-| 200 | Successful response | [ConversationDeleteResponse](#conversationdeleteresponse)
+| 200 | Successful response | [QueryResponse](#queryresponse) |
+| 401 | Unauthorized | [UnauthorizedResponse](#unauthorizedresponse)
 
 Examples
 
@@ -2680,9 +2626,10 @@ Examples
 
 ```json
 {
-  "conversation_id": "123e4567-e89b-12d3-a456-426614174000",
-  "response": "Conversation deleted successfully",
-  "success": true
+  "detail": {
+    "cause": "No Authorization header found",
+    "response": "Missing or invalid credentials provided by client"
+  }
 }
 ```
 
@@ -2691,13 +2638,14 @@ Examples
 
 ```json
 {
-  "conversation_id": "123e4567-e89b-12d3-a456-426614174000",
-  "response": "Conversation can not be deleted",
-  "success": true
+  "detail": {
+    "cause": "No token found in Authorization header",
+    "response": "Missing or invalid credentials provided by client"
+  }
 }
 ```
  |
-| 400 | Invalid request format | [BadRequestResponse](#badrequestresponse)
+| 403 | Permission denied | [ForbiddenResponse](#forbiddenresponse)
 
 Examples
 
@@ -2708,16 +2656,11 @@ Examples
 ```json
 {
   "detail": {
-    "cause": "The conversation ID 123e4567-e89b-12d3-a456-426614174000 has invalid format.",
-    "response": "Invalid conversation ID format"
+    "cause": "User 6789 does not have permission to read conversation with ID 123e4567-e89b-12d3-a456-426614174000",
+    "response": "User does not have permission to perform this action"
   }
 }
 ```
- |
-| 401 | Unauthorized | [UnauthorizedResponse](#unauthorizedresponse)
-
-Examples
-
 
 
 
@@ -2725,8 +2668,8 @@ Examples
 ```json
 {
   "detail": {
-    "cause": "No Authorization header found",
-    "response": "Missing or invalid credentials provided by client"
+    "cause": "User 6789 is not authorized to access this endpoint.",
+    "response": "User does not have permission to access this endpoint"
   }
 }
 ```
@@ -2737,13 +2680,13 @@ Examples
 ```json
 {
   "detail": {
-    "cause": "No token found in Authorization header",
-    "response": "Missing or invalid credentials provided by client"
+    "cause": "User lacks model_override permission required to override model/provider.",
+    "response": "This instance does not permit overriding model/provider in the query request (missing permission: MODEL_OVERRIDE). Please remove the model and provider fields from your request."
   }
 }
 ```
  |
-| 403 | Permission denied | [ForbiddenResponse](#forbiddenresponse)
+| 404 | Resource not found | [NotFoundResponse](#notfoundresponse)
 
 Examples
 
@@ -2754,30 +2697,37 @@ Examples
 ```json
 {
   "detail": {
-    "cause": "User 6789 is not authorized to access this endpoint.",
-    "response": "User does not have permission to access this endpoint"
+    "cause": "Conversation with ID 123e4567-e89b-12d3-a456-426614174000 does not exist",
+    "response": "Conversation not found"
   }
 }
 ```
- |
-| 404 | Resource not found | [NotFoundResponse](#notfoundresponse)
 
-Examples
 
 
 
+```json
+{
+  "detail": {
+    "cause": "Provider with ID openai does not exist",
+    "response": "Provider not found"
+  }
+}
+```
+
+
 
 
 ```json
 {
   "detail": {
-    "cause": "Conversation with ID 123e4567-e89b-12d3-a456-426614174000 does not exist",
-    "response": "Conversation not found"
+    "cause": "Model with ID gpt-4-turbo is not configured",
+    "response": "Model not found"
   }
 }
 ```
  |
-| 500 | Internal server error | [InternalServerErrorResponse](#internalservererrorresponse)
+| 422 | Request validation failed | [UnprocessableEntityResponse](#unprocessableentityresponse)
 
 Examples
 
@@ -2788,8 +2738,8 @@ Examples
 ```json
 {
   "detail": {
-    "cause": "Lightspeed Stack configuration has not been initialized.",
-    "response": "Configuration is not loaded"
+    "cause": "Invalid request format. The request body could not be parsed.",
+    "response": "Invalid request format"
   }
 }
 ```
@@ -2800,41 +2750,52 @@ Examples
 ```json
 {
   "detail": {
-    "cause": "Conversation cache is not configured or unavailable.",
-    "response": "Conversation cache not configured"
+    "cause": "Missing required attributes: ['query', 'model', 'provider']",
+    "response": "Missing required attributes"
   }
 }
 ```
- |
-| 422 | Validation Error | [HTTPValidationError](#httpvalidationerror) |
-## PUT `/v2/conversations/{conversation_id}`
 
-> **Update Conversation Endpoint Handler**
 
-Handle request to update a conversation topic summary by ID.
 
 
+```json
+{
+  "detail": {
+    "cause": "Invalid attatchment type: must be one of ['text/plain', 'application/json', 'application/yaml', 'application/xml']",
+    "response": "Invalid attribute value"
+  }
+}
+```
+ |
+| 429 | Quota limit exceeded | [QuotaExceededResponse](#quotaexceededresponse)
 
-### 🔗 Parameters
+Examples
 
-| Name | Type | Required | Description |
-|------|------|----------|-------------|
-| conversation_id | string | True |  |
 
 
-### 📦 Request Body 
 
-[ConversationUpdateRequest](#conversationupdaterequest)
 
-### ✅ Responses
+```json
+{
+  "detail": {
+    "cause": "The token quota for model gpt-4-turbo has been exceeded.",
+    "response": "The model quota has been exceeded"
+  }
+}
+```
+
 
-| Status Code | Description | Component |
-|-------------|-------------|-----------|
-| 200 | Successful response | [ConversationUpdateResponse](#conversationupdateresponse) |
-| 400 | Invalid request format | [BadRequestResponse](#badrequestresponse)
 
-Examples
 
+```json
+{
+  "detail": {
+    "cause": "User 123 has no available tokens.",
+    "response": "The quota has been exceeded"
+  }
+}
+```
 
 
 
@@ -2842,16 +2803,11 @@ Examples
 ```json
 {
   "detail": {
-    "cause": "The conversation ID 123e4567-e89b-12d3-a456-426614174000 has invalid format.",
-    "response": "Invalid conversation ID format"
+    "cause": "Cluster has no available tokens.",
+    "response": "The quota has been exceeded"
   }
 }
 ```
- |
-| 401 | Unauthorized | [UnauthorizedResponse](#unauthorizedresponse)
-
-Examples
-
 
 
 
@@ -2859,8 +2815,8 @@ Examples
 ```json
 {
   "detail": {
-    "cause": "No Authorization header found",
-    "response": "Missing or invalid credentials provided by client"
+    "cause": "Unknown subject 999 has no available tokens.",
+    "response": "The quota has been exceeded"
   }
 }
 ```
@@ -2871,16 +2827,11 @@ Examples
 ```json
 {
   "detail": {
-    "cause": "No token found in Authorization header",
-    "response": "Missing or invalid credentials provided by client"
+    "cause": "User 123 has 5 tokens, but 10 tokens are needed.",
+    "response": "The quota has been exceeded"
   }
 }
 ```
- |
-| 403 | Permission denied | [ForbiddenResponse](#forbiddenresponse)
-
-Examples
-
 
 
 
@@ -2888,16 +2839,11 @@ Examples
 ```json
 {
   "detail": {
-    "cause": "User 6789 is not authorized to access this endpoint.",
-    "response": "User does not have permission to access this endpoint"
+    "cause": "Cluster has 500 tokens, but 900 tokens are needed.",
+    "response": "The quota has been exceeded"
   }
 }
 ```
- |
-| 404 | Resource not found | [NotFoundResponse](#notfoundresponse)
-
-Examples
-
 
 
 
@@ -2905,8 +2851,8 @@ Examples
 ```json
 {
   "detail": {
-    "cause": "Conversation with ID 123e4567-e89b-12d3-a456-426614174000 does not exist",
-    "response": "Conversation not found"
+    "cause": "Unknown subject 999 has 3 tokens, but 6 tokens are needed.",
+    "response": "The quota has been exceeded"
   }
 }
 ```
@@ -2927,6 +2873,11 @@ Examples
   }
 }
 ```
+ |
+| 503 | Service unavailable | [ServiceUnavailableResponse](#serviceunavailableresponse)
+
+Examples
+
 
 
 
@@ -2934,24 +2885,35 @@ Examples
 ```json
 {
   "detail": {
-    "cause": "Conversation cache is not configured or unavailable.",
-    "response": "Conversation cache not configured"
+    "cause": "Connection error while trying to reach backend service.",
+    "response": "Unable to connect to Llama Stack"
   }
 }
 ```
  |
-| 422 | Validation Error | [HTTPValidationError](#httpvalidationerror) |
-## POST `/v2/query`
+## POST `/v2/streaming_query`
 
-> **Query Endpoint Handler V2**
+> **Streaming Query Endpoint Handler V1**
 
-Handle request to the /query endpoint using Responses API.
+Handle request to the /streaming_query endpoint using Responses API.
 
-This is a wrapper around query_endpoint_handler_base that provides
-the Responses API specific retrieve_response and get_topic_summary functions.
+Returns a streaming response using Server-Sent Events (SSE) format with
+content type text/event-stream.
 
 Returns:
-    QueryResponse: Contains the conversation ID and the LLM-generated response.
+    StreamingResponse: An HTTP streaming response yielding
+    SSE-formatted events for the query lifecycle with content type
+    text/event-stream.
+
+Raises:
+    HTTPException:
+        - 401: Unauthorized - Missing or invalid credentials
+        - 403: Forbidden - Insufficient permissions or model override not allowed
+        - 404: Not Found - Conversation, model, or provider not found
+        - 422: Unprocessable Entity - Request validation failed
+        - 429: Too Many Requests - Quota limit exceeded
+        - 500: Internal Server Error - Configuration not loaded or other server errors
+        - 503: Service Unavailable - Unable to connect to Llama Stack backend
 
 
 
@@ -2965,7 +2927,7 @@ Returns:
 
 | Status Code | Description | Component |
 |-------------|-------------|-----------|
-| 200 | Successful response | [QueryResponse](#queryresponse) |
+| 200 | Successful response | string |
 | 401 | Unauthorized | [UnauthorizedResponse](#unauthorizedresponse)
 
 Examples
@@ -3201,8 +3163,103 @@ Examples
 ```json
 {
   "detail": {
-    "cause": "Unknown subject 999 has 3 tokens, but 6 tokens are needed.",
-    "response": "The quota has been exceeded"
+    "cause": "Unknown subject 999 has 3 tokens, but 6 tokens are needed.",
+    "response": "The quota has been exceeded"
+  }
+}
+```
+ |
+| 500 | Internal server error | [InternalServerErrorResponse](#internalservererrorresponse)
+
+Examples
+
+
+
+
+
+```json
+{
+  "detail": {
+    "cause": "Lightspeed Stack configuration has not been initialized.",
+    "response": "Configuration is not loaded"
+  }
+}
+```
+ |
+| 503 | Service unavailable | [ServiceUnavailableResponse](#serviceunavailableresponse)
+
+Examples
+
+
+
+
+
+```json
+{
+  "detail": {
+    "cause": "Connection error while trying to reach backend service.",
+    "response": "Unable to connect to Llama Stack"
+  }
+}
+```
+ |
+## GET `/v2/conversations`
+
+> **Get Conversations List Endpoint Handler**
+
+Handle request to retrieve all conversations for the authenticated user.
+
+
+
+
+
+### ✅ Responses
+
+| Status Code | Description | Component |
+|-------------|-------------|-----------|
+| 200 | Successful response | [ConversationsListResponseV2](#conversationslistresponsev2) |
+| 401 | Unauthorized | [UnauthorizedResponse](#unauthorizedresponse)
+
+Examples
+
+
+
+
+
+```json
+{
+  "detail": {
+    "cause": "No Authorization header found",
+    "response": "Missing or invalid credentials provided by client"
+  }
+}
+```
+
+
+
+
+```json
+{
+  "detail": {
+    "cause": "No token found in Authorization header",
+    "response": "Missing or invalid credentials provided by client"
+  }
+}
+```
+ |
+| 403 | Permission denied | [ForbiddenResponse](#forbiddenresponse)
+
+Examples
+
+
+
+
+
+```json
+{
+  "detail": {
+    "cause": "User 6789 is not authorized to access this endpoint.",
+    "response": "User does not have permission to access this endpoint"
   }
 }
 ```
@@ -3223,11 +3280,6 @@ Examples
   }
 }
 ```
- |
-| 503 | Service unavailable | [ServiceUnavailableResponse](#serviceunavailableresponse)
-
-Examples
-
 
 
 
@@ -3235,43 +3287,49 @@ Examples
 ```json
 {
   "detail": {
-    "cause": "Connection error while trying to reach backend service.",
-    "response": "Unable to connect to Llama Stack"
+    "cause": "Conversation cache is not configured or unavailable.",
+    "response": "Conversation cache not configured"
   }
 }
 ```
  |
-## POST `/v2/streaming_query`
+## GET `/v2/conversations/{conversation_id}`
 
-> **Streaming Query Endpoint Handler V2**
+> **Get Conversation Endpoint Handler**
 
-Handle request to the /streaming_query endpoint using Responses API.
+Handle request to retrieve a conversation by ID.
 
-This is a wrapper around streaming_query_endpoint_handler_base that provides
-the Responses API specific retrieve_response and response generator functions.
 
-Returns:
-    StreamingResponse: An HTTP streaming response yielding
-    SSE-formatted events for the query lifecycle.
 
-Raises:
-    HTTPException: Returns HTTP 500 if unable to connect to the
-    Llama Stack server.
+### 🔗 Parameters
 
+| Name | Type | Required | Description |
+|------|------|----------|-------------|
+| conversation_id | string | True |  |
 
 
+### ✅ Responses
 
+| Status Code | Description | Component |
+|-------------|-------------|-----------|
+| 200 | Successful response | [ConversationResponse](#conversationresponse) |
+| 400 | Invalid request format | [BadRequestResponse](#badrequestresponse)
 
-### 📦 Request Body 
+Examples
 
-[QueryRequest](#queryrequest)
 
-### ✅ Responses
 
-| Status Code | Description | Component |
-|-------------|-------------|-----------|
-| 200 | Streaming response with Server-Sent Events | string
-string |
+
+
+```json
+{
+  "detail": {
+    "cause": "The conversation ID 123e4567-e89b-12d3-a456-426614174000 has invalid format.",
+    "response": "Invalid conversation ID format"
+  }
+}
+```
+ |
 | 401 | Unauthorized | [UnauthorizedResponse](#unauthorizedresponse)
 
 Examples
@@ -3312,11 +3370,16 @@ Examples
 ```json
 {
   "detail": {
-    "cause": "User 6789 does not have permission to read conversation with ID 123e4567-e89b-12d3-a456-426614174000",
-    "response": "User does not have permission to perform this action"
+    "cause": "User 6789 is not authorized to access this endpoint.",
+    "response": "User does not have permission to access this endpoint"
   }
 }
 ```
+ |
+| 404 | Resource not found | [NotFoundResponse](#notfoundresponse)
+
+Examples
+
 
 
 
@@ -3324,8 +3387,25 @@ Examples
 ```json
 {
   "detail": {
-    "cause": "User 6789 is not authorized to access this endpoint.",
-    "response": "User does not have permission to access this endpoint"
+    "cause": "Conversation with ID 123e4567-e89b-12d3-a456-426614174000 does not exist",
+    "response": "Conversation not found"
+  }
+}
+```
+ |
+| 500 | Internal server error | [InternalServerErrorResponse](#internalservererrorresponse)
+
+Examples
+
+
+
+
+
+```json
+{
+  "detail": {
+    "cause": "Lightspeed Stack configuration has not been initialized.",
+    "response": "Configuration is not loaded"
   }
 }
 ```
@@ -3336,13 +3416,33 @@ Examples
 ```json
 {
   "detail": {
-    "cause": "User lacks model_override permission required to override model/provider.",
-    "response": "This instance does not permit overriding model/provider in the query request (missing permission: MODEL_OVERRIDE). Please remove the model and provider fields from your request."
+    "cause": "Conversation cache is not configured or unavailable.",
+    "response": "Conversation cache not configured"
   }
 }
 ```
  |
-| 404 | Resource not found | [NotFoundResponse](#notfoundresponse)
+| 422 | Validation Error | [HTTPValidationError](#httpvalidationerror) |
+## DELETE `/v2/conversations/{conversation_id}`
+
+> **Delete Conversation Endpoint Handler**
+
+Handle request to delete a conversation by ID.
+
+
+
+### 🔗 Parameters
+
+| Name | Type | Required | Description |
+|------|------|----------|-------------|
+| conversation_id | string | True |  |
+
+
+### ✅ Responses
+
+| Status Code | Description | Component |
+|-------------|-------------|-----------|
+| 200 | Successful response | [ConversationDeleteResponse](#conversationdeleteresponse)
 
 Examples
 
@@ -3352,10 +3452,9 @@ Examples
 
 ```json
 {
-  "detail": {
-    "cause": "Conversation with ID 123e4567-e89b-12d3-a456-426614174000 does not exist",
-    "response": "Conversation not found"
-  }
+  "conversation_id": "123e4567-e89b-12d3-a456-426614174000",
+  "response": "Conversation deleted successfully",
+  "success": true
 }
 ```
 
@@ -3364,12 +3463,16 @@ Examples
 
 ```json
 {
-  "detail": {
-    "cause": "Provider with ID openai does not exist",
-    "response": "Provider not found"
-  }
+  "conversation_id": "123e4567-e89b-12d3-a456-426614174000",
+  "response": "Conversation can not be deleted",
+  "success": true
 }
 ```
+ |
+| 400 | Invalid request format | [BadRequestResponse](#badrequestresponse)
+
+Examples
+
 
 
 
@@ -3377,13 +3480,13 @@ Examples
 ```json
 {
   "detail": {
-    "cause": "Model with ID gpt-4-turbo is not configured",
-    "response": "Model not found"
+    "cause": "The conversation ID 123e4567-e89b-12d3-a456-426614174000 has invalid format.",
+    "response": "Invalid conversation ID format"
   }
 }
 ```
  |
-| 422 | Request validation failed | [UnprocessableEntityResponse](#unprocessableentityresponse)
+| 401 | Unauthorized | [UnauthorizedResponse](#unauthorizedresponse)
 
 Examples
 
@@ -3394,8 +3497,8 @@ Examples
 ```json
 {
   "detail": {
-    "cause": "Invalid request format. The request body could not be parsed.",
-    "response": "Invalid request format"
+    "cause": "No Authorization header found",
+    "response": "Missing or invalid credentials provided by client"
   }
 }
 ```
@@ -3406,11 +3509,16 @@ Examples
 ```json
 {
   "detail": {
-    "cause": "Missing required attributes: ['query', 'model', 'provider']",
-    "response": "Missing required attributes"
+    "cause": "No token found in Authorization header",
+    "response": "Missing or invalid credentials provided by client"
   }
 }
 ```
+ |
+| 403 | Permission denied | [ForbiddenResponse](#forbiddenresponse)
+
+Examples
+
 
 
 
@@ -3418,13 +3526,13 @@ Examples
 ```json
 {
   "detail": {
-    "cause": "Invalid attatchment type: must be one of ['text/plain', 'application/json', 'application/yaml', 'application/xml']",
-    "response": "Invalid attribute value"
+    "cause": "User 6789 is not authorized to access this endpoint.",
+    "response": "User does not have permission to access this endpoint"
   }
 }
 ```
  |
-| 429 | Quota limit exceeded | [QuotaExceededResponse](#quotaexceededresponse)
+| 500 | Internal server error | [InternalServerErrorResponse](#internalservererrorresponse)
 
 Examples
 
@@ -3435,8 +3543,8 @@ Examples
 ```json
 {
   "detail": {
-    "cause": "The token quota for model gpt-4-turbo has been exceeded.",
-    "response": "The model quota has been exceeded"
+    "cause": "Lightspeed Stack configuration has not been initialized.",
+    "response": "Configuration is not loaded"
   }
 }
 ```
@@ -3447,11 +3555,41 @@ Examples
 ```json
 {
   "detail": {
-    "cause": "User 123 has no available tokens.",
-    "response": "The quota has been exceeded"
+    "cause": "Conversation cache is not configured or unavailable.",
+    "response": "Conversation cache not configured"
   }
 }
 ```
+ |
+| 422 | Validation Error | [HTTPValidationError](#httpvalidationerror) |
+## PUT `/v2/conversations/{conversation_id}`
+
+> **Update Conversation Endpoint Handler**
+
+Handle request to update a conversation topic summary by ID.
+
+
+
+### 🔗 Parameters
+
+| Name | Type | Required | Description |
+|------|------|----------|-------------|
+| conversation_id | string | True |  |
+
+
+### 📦 Request Body 
+
+[ConversationUpdateRequest](#conversationupdaterequest)
+
+### ✅ Responses
+
+| Status Code | Description | Component |
+|-------------|-------------|-----------|
+| 200 | Successful response | [ConversationUpdateResponse](#conversationupdateresponse) |
+| 400 | Invalid request format | [BadRequestResponse](#badrequestresponse)
+
+Examples
+
 
 
 
@@ -3459,11 +3597,16 @@ Examples
 ```json
 {
   "detail": {
-    "cause": "Cluster has no available tokens.",
-    "response": "The quota has been exceeded"
+    "cause": "The conversation ID 123e4567-e89b-12d3-a456-426614174000 has invalid format.",
+    "response": "Invalid conversation ID format"
   }
 }
 ```
+ |
+| 401 | Unauthorized | [UnauthorizedResponse](#unauthorizedresponse)
+
+Examples
+
 
 
 
@@ -3471,8 +3614,8 @@ Examples
 ```json
 {
   "detail": {
-    "cause": "Unknown subject 999 has no available tokens.",
-    "response": "The quota has been exceeded"
+    "cause": "No Authorization header found",
+    "response": "Missing or invalid credentials provided by client"
   }
 }
 ```
@@ -3483,11 +3626,16 @@ Examples
 ```json
 {
   "detail": {
-    "cause": "User 123 has 5 tokens, but 10 tokens are needed.",
-    "response": "The quota has been exceeded"
+    "cause": "No token found in Authorization header",
+    "response": "Missing or invalid credentials provided by client"
   }
 }
 ```
+ |
+| 403 | Permission denied | [ForbiddenResponse](#forbiddenresponse)
+
+Examples
+
 
 
 
@@ -3495,11 +3643,16 @@ Examples
 ```json
 {
   "detail": {
-    "cause": "Cluster has 500 tokens, but 900 tokens are needed.",
-    "response": "The quota has been exceeded"
+    "cause": "User 6789 is not authorized to access this endpoint.",
+    "response": "User does not have permission to access this endpoint"
   }
 }
 ```
+ |
+| 404 | Resource not found | [NotFoundResponse](#notfoundresponse)
+
+Examples
+
 
 
 
@@ -3507,8 +3660,8 @@ Examples
 ```json
 {
   "detail": {
-    "cause": "Unknown subject 999 has 3 tokens, but 6 tokens are needed.",
-    "response": "The quota has been exceeded"
+    "cause": "Conversation with ID 123e4567-e89b-12d3-a456-426614174000 does not exist",
+    "response": "Conversation not found"
   }
 }
 ```
@@ -3529,11 +3682,6 @@ Examples
   }
 }
 ```
- |
-| 503 | Service unavailable | [ServiceUnavailableResponse](#serviceunavailableresponse)
-
-Examples
-
 
 
 
@@ -3541,12 +3689,13 @@ Examples
 ```json
 {
   "detail": {
-    "cause": "Connection error while trying to reach backend service.",
-    "response": "Unable to connect to Llama Stack"
+    "cause": "Conversation cache is not configured or unavailable.",
+    "response": "Conversation cache not configured"
   }
 }
 ```
  |
+| 422 | Validation Error | [HTTPValidationError](#httpvalidationerror) |
 ## GET `/readiness`
 
 > **Readiness Probe Get Method**
@@ -3868,6 +4017,17 @@ Examples
 
 
 
+## APIKeyTokenConfiguration
+
+
+API Key Token configuration.
+
+
+| Field | Type | Description |
+|-------|------|-------------|
+| api_key | string |  |
+
+
 ## AccessRule
 
 
@@ -3931,6 +4091,7 @@ Authentication configuration.
 | k8s_cluster_api |  |  |
 | k8s_ca_cert_path |  |  |
 | jwk_config |  |  |
+| api_key_config |  |  |
 | rh_identity_config |  |  |
 
 
@@ -4546,11 +4707,11 @@ Useful resources:
 
 Model context protocol server configuration.
 
-MCP (Model Context Protocol) servers provide tools and
-capabilities to the AI agents. These are configured by this structure.
-Only MCP servers defined in the lightspeed-stack.yaml configuration are
-available to the agents. Tools configured in the llama-stack run.yaml
-are not accessible to lightspeed-core agents.
+MCP (Model Context Protocol) servers provide tools and capabilities to the
+AI agents. These are configured by this structure. Only MCP servers
+defined in the lightspeed-stack.yaml configuration are available to the
+agents. Tools configured in the llama-stack run.yaml are not accessible to
+lightspeed-core agents.
 
 Useful resources:
 
@@ -4594,9 +4755,9 @@ Model representing a response to models request.
 
 PostgreSQL database configuration.
 
-PostgreSQL database is used by Lightspeed Core Stack service for storing information about
-conversation IDs. It can also be leveraged to store conversation history and information
-about quota usage.
+PostgreSQL database is used by Lightspeed Core Stack service for storing
+information about conversation IDs. It can also be leveraged to store
+conversation history and information about quota usage.
 
 Useful resources:
 
@@ -4718,13 +4879,13 @@ Attributes:
 |-------|------|-------------|
 | conversation_id |  | The optional conversation ID (UUID) |
 | response | string | Response from LLM |
-| rag_chunks | array | List of RAG chunks used to generate the response |
-| tool_calls |  | List of tool calls made during response generation |
 | referenced_documents | array | List of documents referenced in generating the response |
 | truncated | boolean | Whether conversation history was truncated |
 | input_tokens | integer | Number of tokens sent to LLM |
 | output_tokens | integer | Number of tokens received from LLM |
 | available_quotas | object | Quota available as measured by all configured quota limiters |
+| tool_calls |  | List of tool calls made during response generation |
+| tool_results |  | List of tool results |
 
 
 ## QuotaExceededResponse
@@ -4803,19 +4964,6 @@ Quota scheduler configuration.
 | period | integer | Quota scheduler period specified in seconds |
 
 
-## RAGChunk
-
-
-Model representing a RAG chunk used in the response.
-
-
-| Field | Type | Description |
-|-------|------|-------------|
-| content | string | The content of the chunk |
-| source |  | Source document or URL |
-| score |  | Relevance score |
-
-
 ## RAGInfoResponse
 
 
@@ -4906,10 +5054,10 @@ SQLite database configuration.
 
 Service configuration.
 
-Lightspeed Core Stack is a REST API service that accepts requests
-on a specified hostname and port. It is also possible to enable
-authentication and specify the number of Uvicorn workers. When more
-workers are specified, the service can handle requests concurrently.
+Lightspeed Core Stack is a REST API service that accepts requests on a
+specified hostname and port. It is also possible to enable authentication
+and specify the number of Uvicorn workers. When more workers are specified,
+the service can handle requests concurrently.
 
 
 | Field | Type | Description |
@@ -4988,17 +5136,33 @@ Useful resources:
 | tls_key_password |  | Path to file containing the password to decrypt the SSL/TLS private key. |
 
 
-## ToolCall
+## ToolCallSummary
+
+
+Model representing a tool call made during response generation (for tool_calls list).
+
+
+| Field | Type | Description |
+|-------|------|-------------|
+| id | string | ID of the tool call |
+| name | string | Name of the tool called |
+| args | object | Arguments passed to the tool |
+| type | string | Type indicator for tool call |
+
+
+## ToolResultSummary
 
 
-Model representing a tool call made during response generation.
+Model representing a result from a tool call (for tool_results list).
 
 
 | Field | Type | Description |
 |-------|------|-------------|
-| tool_name | string | Name of the tool called |
-| arguments | object | Arguments passed to the tool |
-| result |  | Result from the tool |
+| id | string | ID of the tool call/result, matches the corresponding tool call 'id' |
+| status | string | Status of the tool execution (e.g., 'success') |
+| content |  | Content/result returned from the tool |
+| type | string | Type indicator for tool result |
+| round | integer | Round number or step of tool execution |
 
 
 ## ToolsResponse
diff --git a/docs/output.md b/docs/output.md
deleted file mode 100644
index 5b87f2ef3..000000000
--- a/docs/output.md
+++ /dev/null
@@ -1,5073 +0,0 @@
-# Lightspeed Core Service (LCS) service - OpenAPI
-
-Lightspeed Core Service (LCS) service API specification.
-
-## 🌍 Base URL
-
-
-| URL | Description |
-|-----|-------------|
-| http://localhost:8080/ | Locally running service |
-
-
-# 🛠️ APIs
-
-## GET `/`
-
-> **Root Endpoint Handler**
-
-Handle GET requests to the root ("/") endpoint and returns the static HTML index page.
-
-Returns:
-    HTMLResponse: The HTML content of the index page, including a heading,
-    embedded image with the service icon, and links to the API documentation
-    via Swagger UI and ReDoc.
-
-
-
-
-
-### ✅ Responses
-
-| Status Code | Description | Component |
-|-------------|-------------|-----------|
-| 200 | Successful Response | string |
-| 401 | Unauthorized | ...
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "No Authorization header found",
-    "response": "Missing or invalid credentials provided by client"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "No token found in Authorization header",
-    "response": "Missing or invalid credentials provided by client"
-  }
-}
-```
-
-[UnauthorizedResponse](#unauthorizedresponse) |
-| 403 | Permission denied | ...
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "User 6789 is not authorized to access this endpoint.",
-    "response": "User does not have permission to access this endpoint"
-  }
-}
-```
-
-[ForbiddenResponse](#forbiddenresponse) |
-## GET `/v1/info`
-
-> **Info Endpoint Handler**
-
-Handle request to the /info endpoint.
-
-Process GET requests to the /info endpoint, returning the
-service name, version and Llama-stack version.
-
-Returns:
-    InfoResponse: An object containing the service's name and version.
-
-
-
-
-
-### ✅ Responses
-
-| Status Code | Description | Component |
-|-------------|-------------|-----------|
-| 200 | Successful response | [InfoResponse](#inforesponse) |
-| 401 | Unauthorized | [UnauthorizedResponse](#unauthorizedresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "No Authorization header found",
-    "response": "Missing or invalid credentials provided by client"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "No token found in Authorization header",
-    "response": "Missing or invalid credentials provided by client"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Token has expired",
-    "response": "Missing or invalid credentials provided by client"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Invalid token signature",
-    "response": "Missing or invalid credentials provided by client"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Token signed by unknown key",
-    "response": "Missing or invalid credentials provided by client"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Token missing claim: user_id",
-    "response": "Missing or invalid credentials provided by client"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Invalid or expired Kubernetes token",
-    "response": "Missing or invalid credentials provided by client"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Authentication key server returned invalid data",
-    "response": "Missing or invalid credentials provided by client"
-  }
-}
-```
- |
-| 403 | Permission denied | [ForbiddenResponse](#forbiddenresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "User 6789 is not authorized to access this endpoint.",
-    "response": "User does not have permission to access this endpoint"
-  }
-}
-```
- |
-| 503 | Service unavailable | [ServiceUnavailableResponse](#serviceunavailableresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Connection error while trying to reach backend service.",
-    "response": "Unable to connect to Llama Stack"
-  }
-}
-```
- |
-## GET `/v1/models`
-
-> **Models Endpoint Handler**
-
-Handle requests to the /models endpoint.
-
-Process GET requests to the /models endpoint, returning a list of available
-models from the Llama Stack service.
-
-Raises:
-    HTTPException: If unable to connect to the Llama Stack server or if
-    model retrieval fails for any reason.
-
-Returns:
-    ModelsResponse: An object containing the list of available models.
-
-
-
-
-
-### ✅ Responses
-
-| Status Code | Description | Component |
-|-------------|-------------|-----------|
-| 200 | Successful response | [ModelsResponse](#modelsresponse) |
-| 401 | Unauthorized | [UnauthorizedResponse](#unauthorizedresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "No Authorization header found",
-    "response": "Missing or invalid credentials provided by client"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "No token found in Authorization header",
-    "response": "Missing or invalid credentials provided by client"
-  }
-}
-```
- |
-| 403 | Permission denied | [ForbiddenResponse](#forbiddenresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "User 6789 is not authorized to access this endpoint.",
-    "response": "User does not have permission to access this endpoint"
-  }
-}
-```
- |
-| 500 | Internal server error | [InternalServerErrorResponse](#internalservererrorresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Lightspeed Stack configuration has not been initialized.",
-    "response": "Configuration is not loaded"
-  }
-}
-```
- |
-| 503 | Service unavailable | [ServiceUnavailableResponse](#serviceunavailableresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Connection error while trying to reach backend service.",
-    "response": "Unable to connect to Llama Stack"
-  }
-}
-```
- |
-## GET `/v1/tools`
-
-> **Tools Endpoint Handler**
-
-Handle requests to the /tools endpoint.
-
-Process GET requests to the /tools endpoint, returning a consolidated list of
-available tools from all configured MCP servers.
-
-Raises:
-    HTTPException: If unable to connect to the Llama Stack server or if
-    tool retrieval fails for any reason.
-
-Returns:
-    ToolsResponse: An object containing the consolidated list of available tools
-    with metadata including tool name, description, parameters, and server source.
-
-
-
-
-
-### ✅ Responses
-
-| Status Code | Description | Component |
-|-------------|-------------|-----------|
-| 200 | Successful response | [ToolsResponse](#toolsresponse) |
-| 401 | Unauthorized | [UnauthorizedResponse](#unauthorizedresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "No Authorization header found",
-    "response": "Missing or invalid credentials provided by client"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "No token found in Authorization header",
-    "response": "Missing or invalid credentials provided by client"
-  }
-}
-```
- |
-| 403 | Permission denied | [ForbiddenResponse](#forbiddenresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "User 6789 is not authorized to access this endpoint.",
-    "response": "User does not have permission to access this endpoint"
-  }
-}
-```
- |
-| 500 | Internal server error | [InternalServerErrorResponse](#internalservererrorresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Lightspeed Stack configuration has not been initialized.",
-    "response": "Configuration is not loaded"
-  }
-}
-```
- |
-| 503 | Service unavailable | [ServiceUnavailableResponse](#serviceunavailableresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Connection error while trying to reach backend service.",
-    "response": "Unable to connect to Llama Stack"
-  }
-}
-```
- |
-## GET `/v1/shields`
-
-> **Shields Endpoint Handler**
-
-Handle requests to the /shields endpoint.
-
-Process GET requests to the /shields endpoint, returning a list of available
-shields from the Llama Stack service.
-
-Raises:
-    HTTPException: If unable to connect to the Llama Stack server or if
-    shield retrieval fails for any reason.
-
-Returns:
-    ShieldsResponse: An object containing the list of available shields.
-
-
-
-
-
-### ✅ Responses
-
-| Status Code | Description | Component |
-|-------------|-------------|-----------|
-| 200 | Successful response | [ShieldsResponse](#shieldsresponse) |
-| 401 | Unauthorized | [UnauthorizedResponse](#unauthorizedresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "No Authorization header found",
-    "response": "Missing or invalid credentials provided by client"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "No token found in Authorization header",
-    "response": "Missing or invalid credentials provided by client"
-  }
-}
-```
- |
-| 403 | Permission denied | [ForbiddenResponse](#forbiddenresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "User 6789 is not authorized to access this endpoint.",
-    "response": "User does not have permission to access this endpoint"
-  }
-}
-```
- |
-| 500 | Internal server error | [InternalServerErrorResponse](#internalservererrorresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Lightspeed Stack configuration has not been initialized.",
-    "response": "Configuration is not loaded"
-  }
-}
-```
- |
-| 503 | Service unavailable | [ServiceUnavailableResponse](#serviceunavailableresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Connection error while trying to reach backend service.",
-    "response": "Unable to connect to Llama Stack"
-  }
-}
-```
- |
-## GET `/v1/providers`
-
-> **Providers Endpoint Handler**
-
-List all available providers grouped by API type.
-
-Returns:
-    ProvidersListResponse: Mapping from API type to list of providers.
-
-Raises:
-    HTTPException:
-        - 401: Authentication failed
-        - 403: Authorization failed
-        - 500: Lightspeed Stack configuration not loaded
-        - 503: Unable to connect to Llama Stack
-
-
-
-
-
-### ✅ Responses
-
-| Status Code | Description | Component |
-|-------------|-------------|-----------|
-| 200 | Successful response | [ProvidersListResponse](#providerslistresponse) |
-| 401 | Unauthorized | [UnauthorizedResponse](#unauthorizedresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "No Authorization header found",
-    "response": "Missing or invalid credentials provided by client"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "No token found in Authorization header",
-    "response": "Missing or invalid credentials provided by client"
-  }
-}
-```
- |
-| 403 | Permission denied | [ForbiddenResponse](#forbiddenresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "User 6789 is not authorized to access this endpoint.",
-    "response": "User does not have permission to access this endpoint"
-  }
-}
-```
- |
-| 500 | Internal server error | [InternalServerErrorResponse](#internalservererrorresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Lightspeed Stack configuration has not been initialized.",
-    "response": "Configuration is not loaded"
-  }
-}
-```
- |
-| 503 | Service unavailable | [ServiceUnavailableResponse](#serviceunavailableresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Connection error while trying to reach backend service.",
-    "response": "Unable to connect to Llama Stack"
-  }
-}
-```
- |
-## GET `/v1/providers/{provider_id}`
-
-> **Get Provider Endpoint Handler**
-
-Retrieve a single provider by its unique ID.
-
-Returns:
-    ProviderResponse: Provider details.
-
-Raises:
-    HTTPException:
-        - 401: Authentication failed
-        - 403: Authorization failed
-        - 404: Provider not found
-        - 500: Lightspeed Stack configuration not loaded
-        - 503: Unable to connect to Llama Stack
-
-
-
-### 🔗 Parameters
-
-| Name | Type | Required | Description |
-|------|------|----------|-------------|
-| provider_id | string | True |  |
-
-
-### ✅ Responses
-
-| Status Code | Description | Component |
-|-------------|-------------|-----------|
-| 200 | Successful response | [ProviderResponse](#providerresponse) |
-| 401 | Unauthorized | [UnauthorizedResponse](#unauthorizedresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "No Authorization header found",
-    "response": "Missing or invalid credentials provided by client"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "No token found in Authorization header",
-    "response": "Missing or invalid credentials provided by client"
-  }
-}
-```
- |
-| 403 | Permission denied | [ForbiddenResponse](#forbiddenresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "User 6789 is not authorized to access this endpoint.",
-    "response": "User does not have permission to access this endpoint"
-  }
-}
-```
- |
-| 404 | Resource not found | [NotFoundResponse](#notfoundresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Provider with ID openai does not exist",
-    "response": "Provider not found"
-  }
-}
-```
- |
-| 500 | Internal server error | [InternalServerErrorResponse](#internalservererrorresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Lightspeed Stack configuration has not been initialized.",
-    "response": "Configuration is not loaded"
-  }
-}
-```
- |
-| 503 | Service unavailable | [ServiceUnavailableResponse](#serviceunavailableresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Connection error while trying to reach backend service.",
-    "response": "Unable to connect to Llama Stack"
-  }
-}
-```
- |
-| 422 | Validation Error | [HTTPValidationError](#httpvalidationerror) |
-## GET `/v1/rags`
-
-> **Rags Endpoint Handler**
-
-List all available RAGs.
-
-Returns:
-    RAGListResponse: List of RAG identifiers.
-
-Raises:
-    HTTPException:
-        - 401: Authentication failed
-        - 403: Authorization failed
-        - 500: Lightspeed Stack configuration not loaded
-        - 503: Unable to connect to Llama Stack
-
-
-
-
-
-### ✅ Responses
-
-| Status Code | Description | Component |
-|-------------|-------------|-----------|
-| 200 | Successful response | [RAGListResponse](#raglistresponse) |
-| 401 | Unauthorized | [UnauthorizedResponse](#unauthorizedresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "No Authorization header found",
-    "response": "Missing or invalid credentials provided by client"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "No token found in Authorization header",
-    "response": "Missing or invalid credentials provided by client"
-  }
-}
-```
- |
-| 403 | Permission denied | [ForbiddenResponse](#forbiddenresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "User 6789 is not authorized to access this endpoint.",
-    "response": "User does not have permission to access this endpoint"
-  }
-}
-```
- |
-| 500 | Internal server error | [InternalServerErrorResponse](#internalservererrorresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Lightspeed Stack configuration has not been initialized.",
-    "response": "Configuration is not loaded"
-  }
-}
-```
- |
-| 503 | Service unavailable | [ServiceUnavailableResponse](#serviceunavailableresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Connection error while trying to reach backend service.",
-    "response": "Unable to connect to Llama Stack"
-  }
-}
-```
- |
-## GET `/v1/rags/{rag_id}`
-
-> **Get Rag Endpoint Handler**
-
-Retrieve a single RAG by its unique ID.
-
-Returns:
-    RAGInfoResponse: A single RAG's details.
-
-Raises:
-    HTTPException:
-        - 401: Authentication failed
-        - 403: Authorization failed
-        - 404: RAG with the given ID not found
-        - 500: Lightspeed Stack configuration not loaded
-        - 503: Unable to connect to Llama Stack
-
-
-
-### 🔗 Parameters
-
-| Name | Type | Required | Description |
-|------|------|----------|-------------|
-| rag_id | string | True |  |
-
-
-### ✅ Responses
-
-| Status Code | Description | Component |
-|-------------|-------------|-----------|
-| 200 | Successful response | [RAGInfoResponse](#raginforesponse) |
-| 401 | Unauthorized | [UnauthorizedResponse](#unauthorizedresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "No Authorization header found",
-    "response": "Missing or invalid credentials provided by client"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "No token found in Authorization header",
-    "response": "Missing or invalid credentials provided by client"
-  }
-}
-```
- |
-| 403 | Permission denied | [ForbiddenResponse](#forbiddenresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "User 6789 is not authorized to access this endpoint.",
-    "response": "User does not have permission to access this endpoint"
-  }
-}
-```
- |
-| 404 | Resource not found | [NotFoundResponse](#notfoundresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Rag with ID vs_7b52a8cf-0fa3-489c-beab-27e061d102f3 does not exist",
-    "response": "Rag not found"
-  }
-}
-```
- |
-| 500 | Internal server error | [InternalServerErrorResponse](#internalservererrorresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Lightspeed Stack configuration has not been initialized.",
-    "response": "Configuration is not loaded"
-  }
-}
-```
- |
-| 503 | Service unavailable | [ServiceUnavailableResponse](#serviceunavailableresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Connection error while trying to reach backend service.",
-    "response": "Unable to connect to Llama Stack"
-  }
-}
-```
- |
-| 422 | Validation Error | [HTTPValidationError](#httpvalidationerror) |
-## POST `/v1/query`
-
-> **Query Endpoint Handler**
-
-Handle request to the /query endpoint using Agent API.
-
-This is a wrapper around query_endpoint_handler_base that provides
-the Agent API specific retrieve_response and get_topic_summary functions.
-
-Returns:
-    QueryResponse: Contains the conversation ID and the LLM-generated response.
-
-
-
-
-
-### 📦 Request Body 
-
-[QueryRequest](#queryrequest)
-
-### ✅ Responses
-
-| Status Code | Description | Component |
-|-------------|-------------|-----------|
-| 200 | Successful response | [QueryResponse](#queryresponse) |
-| 401 | Unauthorized | [UnauthorizedResponse](#unauthorizedresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "No Authorization header found",
-    "response": "Missing or invalid credentials provided by client"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "No token found in Authorization header",
-    "response": "Missing or invalid credentials provided by client"
-  }
-}
-```
- |
-| 403 | Permission denied | [ForbiddenResponse](#forbiddenresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "User 6789 does not have permission to read conversation with ID 123e4567-e89b-12d3-a456-426614174000",
-    "response": "User does not have permission to perform this action"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "User 6789 is not authorized to access this endpoint.",
-    "response": "User does not have permission to access this endpoint"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "User lacks model_override permission required to override model/provider.",
-    "response": "This instance does not permit overriding model/provider in the query request (missing permission: MODEL_OVERRIDE). Please remove the model and provider fields from your request."
-  }
-}
-```
- |
-| 404 | Resource not found | [NotFoundResponse](#notfoundresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Conversation with ID 123e4567-e89b-12d3-a456-426614174000 does not exist",
-    "response": "Conversation not found"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Provider with ID openai does not exist",
-    "response": "Provider not found"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Model with ID gpt-4-turbo is not configured",
-    "response": "Model not found"
-  }
-}
-```
- |
-| 422 | Request validation failed | [UnprocessableEntityResponse](#unprocessableentityresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Invalid request format. The request body could not be parsed.",
-    "response": "Invalid request format"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Missing required attributes: ['query', 'model', 'provider']",
-    "response": "Missing required attributes"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Invalid attatchment type: must be one of ['text/plain', 'application/json', 'application/yaml', 'application/xml']",
-    "response": "Invalid attribute value"
-  }
-}
-```
- |
-| 429 | Quota limit exceeded | [QuotaExceededResponse](#quotaexceededresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "The token quota for model gpt-4-turbo has been exceeded.",
-    "response": "The model quota has been exceeded"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "User 123 has no available tokens.",
-    "response": "The quota has been exceeded"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Cluster has no available tokens.",
-    "response": "The quota has been exceeded"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Unknown subject 999 has no available tokens.",
-    "response": "The quota has been exceeded"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "User 123 has 5 tokens, but 10 tokens are needed.",
-    "response": "The quota has been exceeded"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Cluster has 500 tokens, but 900 tokens are needed.",
-    "response": "The quota has been exceeded"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Unknown subject 999 has 3 tokens, but 6 tokens are needed.",
-    "response": "The quota has been exceeded"
-  }
-}
-```
- |
-| 500 | Internal server error | [InternalServerErrorResponse](#internalservererrorresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Lightspeed Stack configuration has not been initialized.",
-    "response": "Configuration is not loaded"
-  }
-}
-```
- |
-| 503 | Service unavailable | [ServiceUnavailableResponse](#serviceunavailableresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Connection error while trying to reach backend service.",
-    "response": "Unable to connect to Llama Stack"
-  }
-}
-```
- |
-## POST `/v1/streaming_query`
-
-> **Streaming Query Endpoint Handler**
-
-Handle request to the /streaming_query endpoint using Agent API.
-
-This is a wrapper around streaming_query_endpoint_handler_base that provides
-the Agent API specific retrieve_response and response generator functions.
-
-Returns:
-    StreamingResponse: An HTTP streaming response yielding
-    SSE-formatted events for the query lifecycle.
-
-Raises:
-    HTTPException: Returns HTTP 500 if unable to connect to the
-    Llama Stack server.
-
-
-
-
-
-### 📦 Request Body 
-
-[QueryRequest](#queryrequest)
-
-### ✅ Responses
-
-| Status Code | Description | Component |
-|-------------|-------------|-----------|
-| 200 | Streaming response (Server-Sent Events) | ...string |
-| 401 | Unauthorized | [UnauthorizedResponse](#unauthorizedresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "No Authorization header found",
-    "response": "Missing or invalid credentials provided by client"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "No token found in Authorization header",
-    "response": "Missing or invalid credentials provided by client"
-  }
-}
-```
- |
-| 403 | Permission denied | [ForbiddenResponse](#forbiddenresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "User 6789 does not have permission to read conversation with ID 123e4567-e89b-12d3-a456-426614174000",
-    "response": "User does not have permission to perform this action"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "User 6789 is not authorized to access this endpoint.",
-    "response": "User does not have permission to access this endpoint"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "User lacks model_override permission required to override model/provider.",
-    "response": "This instance does not permit overriding model/provider in the query request (missing permission: MODEL_OVERRIDE). Please remove the model and provider fields from your request."
-  }
-}
-```
- |
-| 404 | Resource not found | [NotFoundResponse](#notfoundresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Conversation with ID 123e4567-e89b-12d3-a456-426614174000 does not exist",
-    "response": "Conversation not found"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Provider with ID openai does not exist",
-    "response": "Provider not found"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Model with ID gpt-4-turbo is not configured",
-    "response": "Model not found"
-  }
-}
-```
- |
-| 422 | Request validation failed | [UnprocessableEntityResponse](#unprocessableentityresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Invalid request format. The request body could not be parsed.",
-    "response": "Invalid request format"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Missing required attributes: ['query', 'model', 'provider']",
-    "response": "Missing required attributes"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Invalid attatchment type: must be one of ['text/plain', 'application/json', 'application/yaml', 'application/xml']",
-    "response": "Invalid attribute value"
-  }
-}
-```
- |
-| 429 | Quota limit exceeded | [QuotaExceededResponse](#quotaexceededresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "The token quota for model gpt-4-turbo has been exceeded.",
-    "response": "The model quota has been exceeded"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "User 123 has no available tokens.",
-    "response": "The quota has been exceeded"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Cluster has no available tokens.",
-    "response": "The quota has been exceeded"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Unknown subject 999 has no available tokens.",
-    "response": "The quota has been exceeded"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "User 123 has 5 tokens, but 10 tokens are needed.",
-    "response": "The quota has been exceeded"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Cluster has 500 tokens, but 900 tokens are needed.",
-    "response": "The quota has been exceeded"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Unknown subject 999 has 3 tokens, but 6 tokens are needed.",
-    "response": "The quota has been exceeded"
-  }
-}
-```
- |
-| 500 | Internal server error | [InternalServerErrorResponse](#internalservererrorresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Lightspeed Stack configuration has not been initialized.",
-    "response": "Configuration is not loaded"
-  }
-}
-```
- |
-| 503 | Service unavailable | [ServiceUnavailableResponse](#serviceunavailableresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Connection error while trying to reach backend service.",
-    "response": "Unable to connect to Llama Stack"
-  }
-}
-```
- |
-## GET `/v1/config`
-
-> **Config Endpoint Handler**
-
-Handle requests to the /config endpoint.
-
-Process GET requests to the /config endpoint and returns the
-current service configuration.
-
-Returns:
-    ConfigurationResponse: The loaded service configuration response.
-
-
-
-
-
-### ✅ Responses
-
-| Status Code | Description | Component |
-|-------------|-------------|-----------|
-| 200 | Successful response | [ConfigurationResponse](#configurationresponse) |
-| 401 | Unauthorized | [UnauthorizedResponse](#unauthorizedresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "No Authorization header found",
-    "response": "Missing or invalid credentials provided by client"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "No token found in Authorization header",
-    "response": "Missing or invalid credentials provided by client"
-  }
-}
-```
- |
-| 403 | Permission denied | [ForbiddenResponse](#forbiddenresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "User 6789 is not authorized to access this endpoint.",
-    "response": "User does not have permission to access this endpoint"
-  }
-}
-```
- |
-| 500 | Internal server error | [InternalServerErrorResponse](#internalservererrorresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Lightspeed Stack configuration has not been initialized.",
-    "response": "Configuration is not loaded"
-  }
-}
-```
- |
-## POST `/v1/feedback`
-
-> **Feedback Endpoint Handler**
-
-Handle feedback requests.
-
-Processes a user feedback submission, storing the feedback and
-returning a confirmation response.
-
-Args:
-    feedback_request: The request containing feedback information.
-    ensure_feedback_enabled: The feedback handler (FastAPI Depends) that
-        will handle feedback status checks.
-    auth: The Authentication handler (FastAPI Depends) that will
-        handle authentication Logic.
-
-Returns:
-    Response indicating the status of the feedback storage request.
-
-Raises:
-    HTTPException: Returns HTTP 500 if feedback storage fails.
-
-
-
-
-
-### 📦 Request Body 
-
-[FeedbackRequest](#feedbackrequest)
-
-### ✅ Responses
-
-| Status Code | Description | Component |
-|-------------|-------------|-----------|
-| 200 | Successful response | [FeedbackResponse](#feedbackresponse) |
-| 401 | Unauthorized | [UnauthorizedResponse](#unauthorizedresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "No Authorization header found",
-    "response": "Missing or invalid credentials provided by client"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "No token found in Authorization header",
-    "response": "Missing or invalid credentials provided by client"
-  }
-}
-```
- |
-| 403 | Permission denied | [ForbiddenResponse](#forbiddenresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "User 6789 is not authorized to access this endpoint.",
-    "response": "User does not have permission to access this endpoint"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Storing feedback is disabled.",
-    "response": "Storing feedback is disabled"
-  }
-}
-```
- |
-| 404 | Resource not found | [NotFoundResponse](#notfoundresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Conversation with ID 123e4567-e89b-12d3-a456-426614174000 does not exist",
-    "response": "Conversation not found"
-  }
-}
-```
- |
-| 500 | Internal server error | [InternalServerErrorResponse](#internalservererrorresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Lightspeed Stack configuration has not been initialized.",
-    "response": "Configuration is not loaded"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Failed to store feedback at directory: /path/example",
-    "response": "Failed to store feedback"
-  }
-}
-```
- |
-| 422 | Validation Error | [HTTPValidationError](#httpvalidationerror) |
-## GET `/v1/feedback/status`
-
-> **Feedback Status**
-
-Handle feedback status requests.
-
-Return the current enabled status of the feedback
-functionality.
-
-Returns:
-    StatusResponse: Indicates whether feedback collection is enabled.
-
-
-
-
-
-### ✅ Responses
-
-| Status Code | Description | Component |
-|-------------|-------------|-----------|
-| 200 | Successful response | [StatusResponse](#statusresponse) |
-## PUT `/v1/feedback/status`
-
-> **Update Feedback Status**
-
-Handle feedback status update requests.
-
-Takes a request with the desired state of the feedback status.
-Returns the updated state of the feedback status based on the request's value.
-These changes are for the life of the service and are on a per-worker basis.
-
-Returns:
-    FeedbackStatusUpdateResponse: Indicates whether feedback is enabled.
-
-
-
-
-
-### 📦 Request Body 
-
-[FeedbackStatusUpdateRequest](#feedbackstatusupdaterequest)
-
-### ✅ Responses
-
-| Status Code | Description | Component |
-|-------------|-------------|-----------|
-| 200 | Successful response | [FeedbackStatusUpdateResponse](#feedbackstatusupdateresponse) |
-| 401 | Unauthorized | [UnauthorizedResponse](#unauthorizedresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "No Authorization header found",
-    "response": "Missing or invalid credentials provided by client"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "No token found in Authorization header",
-    "response": "Missing or invalid credentials provided by client"
-  }
-}
-```
- |
-| 403 | Permission denied | [ForbiddenResponse](#forbiddenresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "User 6789 is not authorized to access this endpoint.",
-    "response": "User does not have permission to access this endpoint"
-  }
-}
-```
- |
-| 500 | Internal server error | [InternalServerErrorResponse](#internalservererrorresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Lightspeed Stack configuration has not been initialized.",
-    "response": "Configuration is not loaded"
-  }
-}
-```
- |
-| 422 | Validation Error | [HTTPValidationError](#httpvalidationerror) |
-## GET `/v1/conversations`
-
-> **Get Conversations List Endpoint Handler**
-
-Handle request to retrieve all conversations for the authenticated user.
-
-
-
-
-
-### ✅ Responses
-
-| Status Code | Description | Component |
-|-------------|-------------|-----------|
-| 200 | Successful response | [ConversationsListResponse](#conversationslistresponse) |
-| 401 | Unauthorized | [UnauthorizedResponse](#unauthorizedresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "No Authorization header found",
-    "response": "Missing or invalid credentials provided by client"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "No token found in Authorization header",
-    "response": "Missing or invalid credentials provided by client"
-  }
-}
-```
- |
-| 403 | Permission denied | [ForbiddenResponse](#forbiddenresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "User 6789 is not authorized to access this endpoint.",
-    "response": "User does not have permission to access this endpoint"
-  }
-}
-```
- |
-| 500 | Internal server error | [InternalServerErrorResponse](#internalservererrorresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Lightspeed Stack configuration has not been initialized.",
-    "response": "Configuration is not loaded"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Failed to query the database",
-    "response": "Database query failed"
-  }
-}
-```
- |
-| 503 | Service unavailable | [ServiceUnavailableResponse](#serviceunavailableresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Connection error while trying to reach backend service.",
-    "response": "Unable to connect to Llama Stack"
-  }
-}
-```
- |
-## GET `/v1/conversations/{conversation_id}`
-
-> **Get Conversation Endpoint Handler**
-
-Handle request to retrieve a conversation by ID.
-
-Retrieve a conversation's chat history by its ID. Then fetches
-the conversation session from the Llama Stack backend,
-simplifies the session data to essential chat history, and
-returns it in a structured response. Raises HTTP 400 for
-invalid IDs, 404 if not found, 503 if the backend is
-unavailable, and 500 for unexpected errors.
-
-Parameters:
-    conversation_id (str): Unique identifier of the conversation to retrieve.
-
-Returns:
-    ConversationResponse: Structured response containing the conversation
-    ID and simplified chat history.
-
-
-
-### 🔗 Parameters
-
-| Name | Type | Required | Description |
-|------|------|----------|-------------|
-| conversation_id | string | True |  |
-
-
-### ✅ Responses
-
-| Status Code | Description | Component |
-|-------------|-------------|-----------|
-| 200 | Successful response | [ConversationResponse](#conversationresponse) |
-| 400 | Invalid request format | [BadRequestResponse](#badrequestresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "The conversation ID 123e4567-e89b-12d3-a456-426614174000 has invalid format.",
-    "response": "Invalid conversation ID format"
-  }
-}
-```
- |
-| 401 | Unauthorized | [UnauthorizedResponse](#unauthorizedresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "No Authorization header found",
-    "response": "Missing or invalid credentials provided by client"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "No token found in Authorization header",
-    "response": "Missing or invalid credentials provided by client"
-  }
-}
-```
- |
-| 403 | Permission denied | [ForbiddenResponse](#forbiddenresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "User 6789 does not have permission to read conversation with ID 123e4567-e89b-12d3-a456-426614174000",
-    "response": "User does not have permission to perform this action"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "User 6789 is not authorized to access this endpoint.",
-    "response": "User does not have permission to access this endpoint"
-  }
-}
-```
- |
-| 404 | Resource not found | [NotFoundResponse](#notfoundresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Conversation with ID 123e4567-e89b-12d3-a456-426614174000 does not exist",
-    "response": "Conversation not found"
-  }
-}
-```
- |
-| 500 | Internal server error | [InternalServerErrorResponse](#internalservererrorresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Lightspeed Stack configuration has not been initialized.",
-    "response": "Configuration is not loaded"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Failed to query the database",
-    "response": "Database query failed"
-  }
-}
-```
- |
-| 503 | Service unavailable | [ServiceUnavailableResponse](#serviceunavailableresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Connection error while trying to reach backend service.",
-    "response": "Unable to connect to Llama Stack"
-  }
-}
-```
- |
-| 422 | Validation Error | [HTTPValidationError](#httpvalidationerror) |
-## DELETE `/v1/conversations/{conversation_id}`
-
-> **Delete Conversation Endpoint Handler**
-
-Handle request to delete a conversation by ID.
-
-Validates the conversation ID format and attempts to delete the
-corresponding session from the Llama Stack backend. Raises HTTP
-errors for invalid IDs, not found conversations, connection
-issues, or unexpected failures.
-
-Returns:
-    ConversationDeleteResponse: Response indicating the result of the deletion operation.
-
-
-
-### 🔗 Parameters
-
-| Name | Type | Required | Description |
-|------|------|----------|-------------|
-| conversation_id | string | True |  |
-
-
-### ✅ Responses
-
-| Status Code | Description | Component |
-|-------------|-------------|-----------|
-| 200 | Successful response | [ConversationDeleteResponse](#conversationdeleteresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "conversation_id": "123e4567-e89b-12d3-a456-426614174000",
-  "response": "Conversation deleted successfully",
-  "success": true
-}
-```
-
-
-
-
-```json
-{
-  "conversation_id": "123e4567-e89b-12d3-a456-426614174000",
-  "response": "Conversation can not be deleted",
-  "success": true
-}
-```
- |
-| 400 | Invalid request format | [BadRequestResponse](#badrequestresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "The conversation ID 123e4567-e89b-12d3-a456-426614174000 has invalid format.",
-    "response": "Invalid conversation ID format"
-  }
-}
-```
- |
-| 401 | Unauthorized | [UnauthorizedResponse](#unauthorizedresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "No Authorization header found",
-    "response": "Missing or invalid credentials provided by client"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "No token found in Authorization header",
-    "response": "Missing or invalid credentials provided by client"
-  }
-}
-```
- |
-| 403 | Permission denied | [ForbiddenResponse](#forbiddenresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "User 6789 does not have permission to delete conversation with ID 123e4567-e89b-12d3-a456-426614174000",
-    "response": "User does not have permission to perform this action"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "User 6789 is not authorized to access this endpoint.",
-    "response": "User does not have permission to access this endpoint"
-  }
-}
-```
- |
-| 404 | Resource not found | [NotFoundResponse](#notfoundresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Conversation with ID 123e4567-e89b-12d3-a456-426614174000 does not exist",
-    "response": "Conversation not found"
-  }
-}
-```
- |
-| 500 | Internal server error | [InternalServerErrorResponse](#internalservererrorresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Lightspeed Stack configuration has not been initialized.",
-    "response": "Configuration is not loaded"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Failed to query the database",
-    "response": "Database query failed"
-  }
-}
-```
- |
-| 503 | Service unavailable | [ServiceUnavailableResponse](#serviceunavailableresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Connection error while trying to reach backend service.",
-    "response": "Unable to connect to Llama Stack"
-  }
-}
-```
- |
-| 422 | Validation Error | [HTTPValidationError](#httpvalidationerror) |
-## GET `/v2/conversations`
-
-> **Get Conversations List Endpoint Handler**
-
-Handle request to retrieve all conversations for the authenticated user.
-
-
-
-
-
-### ✅ Responses
-
-| Status Code | Description | Component |
-|-------------|-------------|-----------|
-| 200 | Successful response | [ConversationsListResponseV2](#conversationslistresponsev2) |
-| 401 | Unauthorized | [UnauthorizedResponse](#unauthorizedresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "No Authorization header found",
-    "response": "Missing or invalid credentials provided by client"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "No token found in Authorization header",
-    "response": "Missing or invalid credentials provided by client"
-  }
-}
-```
- |
-| 403 | Permission denied | [ForbiddenResponse](#forbiddenresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "User 6789 is not authorized to access this endpoint.",
-    "response": "User does not have permission to access this endpoint"
-  }
-}
-```
- |
-| 500 | Internal server error | [InternalServerErrorResponse](#internalservererrorresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Lightspeed Stack configuration has not been initialized.",
-    "response": "Configuration is not loaded"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Conversation cache is not configured or unavailable.",
-    "response": "Conversation cache not configured"
-  }
-}
-```
- |
-## GET `/v2/conversations/{conversation_id}`
-
-> **Get Conversation Endpoint Handler**
-
-Handle request to retrieve a conversation by ID.
-
-
-
-### 🔗 Parameters
-
-| Name | Type | Required | Description |
-|------|------|----------|-------------|
-| conversation_id | string | True |  |
-
-
-### ✅ Responses
-
-| Status Code | Description | Component |
-|-------------|-------------|-----------|
-| 200 | Successful response | [ConversationResponse](#conversationresponse) |
-| 400 | Invalid request format | [BadRequestResponse](#badrequestresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "The conversation ID 123e4567-e89b-12d3-a456-426614174000 has invalid format.",
-    "response": "Invalid conversation ID format"
-  }
-}
-```
- |
-| 401 | Unauthorized | [UnauthorizedResponse](#unauthorizedresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "No Authorization header found",
-    "response": "Missing or invalid credentials provided by client"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "No token found in Authorization header",
-    "response": "Missing or invalid credentials provided by client"
-  }
-}
-```
- |
-| 403 | Permission denied | [ForbiddenResponse](#forbiddenresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "User 6789 is not authorized to access this endpoint.",
-    "response": "User does not have permission to access this endpoint"
-  }
-}
-```
- |
-| 404 | Resource not found | [NotFoundResponse](#notfoundresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Conversation with ID 123e4567-e89b-12d3-a456-426614174000 does not exist",
-    "response": "Conversation not found"
-  }
-}
-```
- |
-| 500 | Internal server error | [InternalServerErrorResponse](#internalservererrorresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Lightspeed Stack configuration has not been initialized.",
-    "response": "Configuration is not loaded"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Conversation cache is not configured or unavailable.",
-    "response": "Conversation cache not configured"
-  }
-}
-```
- |
-| 422 | Validation Error | [HTTPValidationError](#httpvalidationerror) |
-## DELETE `/v2/conversations/{conversation_id}`
-
-> **Delete Conversation Endpoint Handler**
-
-Handle request to delete a conversation by ID.
-
-
-
-### 🔗 Parameters
-
-| Name | Type | Required | Description |
-|------|------|----------|-------------|
-| conversation_id | string | True |  |
-
-
-### ✅ Responses
-
-| Status Code | Description | Component |
-|-------------|-------------|-----------|
-| 200 | Successful response | [ConversationDeleteResponse](#conversationdeleteresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "conversation_id": "123e4567-e89b-12d3-a456-426614174000",
-  "response": "Conversation deleted successfully",
-  "success": true
-}
-```
-
-
-
-
-```json
-{
-  "conversation_id": "123e4567-e89b-12d3-a456-426614174000",
-  "response": "Conversation can not be deleted",
-  "success": true
-}
-```
- |
-| 400 | Invalid request format | [BadRequestResponse](#badrequestresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "The conversation ID 123e4567-e89b-12d3-a456-426614174000 has invalid format.",
-    "response": "Invalid conversation ID format"
-  }
-}
-```
- |
-| 401 | Unauthorized | [UnauthorizedResponse](#unauthorizedresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "No Authorization header found",
-    "response": "Missing or invalid credentials provided by client"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "No token found in Authorization header",
-    "response": "Missing or invalid credentials provided by client"
-  }
-}
-```
- |
-| 403 | Permission denied | [ForbiddenResponse](#forbiddenresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "User 6789 is not authorized to access this endpoint.",
-    "response": "User does not have permission to access this endpoint"
-  }
-}
-```
- |
-| 404 | Resource not found | [NotFoundResponse](#notfoundresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Conversation with ID 123e4567-e89b-12d3-a456-426614174000 does not exist",
-    "response": "Conversation not found"
-  }
-}
-```
- |
-| 500 | Internal server error | [InternalServerErrorResponse](#internalservererrorresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Lightspeed Stack configuration has not been initialized.",
-    "response": "Configuration is not loaded"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Conversation cache is not configured or unavailable.",
-    "response": "Conversation cache not configured"
-  }
-}
-```
- |
-| 422 | Validation Error | [HTTPValidationError](#httpvalidationerror) |
-## PUT `/v2/conversations/{conversation_id}`
-
-> **Update Conversation Endpoint Handler**
-
-Handle request to update a conversation topic summary by ID.
-
-
-
-### 🔗 Parameters
-
-| Name | Type | Required | Description |
-|------|------|----------|-------------|
-| conversation_id | string | True |  |
-
-
-### 📦 Request Body 
-
-[ConversationUpdateRequest](#conversationupdaterequest)
-
-### ✅ Responses
-
-| Status Code | Description | Component |
-|-------------|-------------|-----------|
-| 200 | Successful response | [ConversationUpdateResponse](#conversationupdateresponse) |
-| 400 | Invalid request format | [BadRequestResponse](#badrequestresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "The conversation ID 123e4567-e89b-12d3-a456-426614174000 has invalid format.",
-    "response": "Invalid conversation ID format"
-  }
-}
-```
- |
-| 401 | Unauthorized | [UnauthorizedResponse](#unauthorizedresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "No Authorization header found",
-    "response": "Missing or invalid credentials provided by client"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "No token found in Authorization header",
-    "response": "Missing or invalid credentials provided by client"
-  }
-}
-```
- |
-| 403 | Permission denied | [ForbiddenResponse](#forbiddenresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "User 6789 is not authorized to access this endpoint.",
-    "response": "User does not have permission to access this endpoint"
-  }
-}
-```
- |
-| 404 | Resource not found | [NotFoundResponse](#notfoundresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Conversation with ID 123e4567-e89b-12d3-a456-426614174000 does not exist",
-    "response": "Conversation not found"
-  }
-}
-```
- |
-| 500 | Internal server error | [InternalServerErrorResponse](#internalservererrorresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Lightspeed Stack configuration has not been initialized.",
-    "response": "Configuration is not loaded"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Conversation cache is not configured or unavailable.",
-    "response": "Conversation cache not configured"
-  }
-}
-```
- |
-| 422 | Validation Error | [HTTPValidationError](#httpvalidationerror) |
-## POST `/v2/query`
-
-> **Query Endpoint Handler V2**
-
-Handle request to the /query endpoint using Responses API.
-
-This is a wrapper around query_endpoint_handler_base that provides
-the Responses API specific retrieve_response and get_topic_summary functions.
-
-Returns:
-    QueryResponse: Contains the conversation ID and the LLM-generated response.
-
-
-
-
-
-### 📦 Request Body 
-
-[QueryRequest](#queryrequest)
-
-### ✅ Responses
-
-| Status Code | Description | Component |
-|-------------|-------------|-----------|
-| 200 | Successful response | [QueryResponse](#queryresponse) |
-| 401 | Unauthorized | [UnauthorizedResponse](#unauthorizedresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "No Authorization header found",
-    "response": "Missing or invalid credentials provided by client"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "No token found in Authorization header",
-    "response": "Missing or invalid credentials provided by client"
-  }
-}
-```
- |
-| 403 | Permission denied | [ForbiddenResponse](#forbiddenresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "User 6789 does not have permission to read conversation with ID 123e4567-e89b-12d3-a456-426614174000",
-    "response": "User does not have permission to perform this action"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "User 6789 is not authorized to access this endpoint.",
-    "response": "User does not have permission to access this endpoint"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "User lacks model_override permission required to override model/provider.",
-    "response": "This instance does not permit overriding model/provider in the query request (missing permission: MODEL_OVERRIDE). Please remove the model and provider fields from your request."
-  }
-}
-```
- |
-| 404 | Resource not found | [NotFoundResponse](#notfoundresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Conversation with ID 123e4567-e89b-12d3-a456-426614174000 does not exist",
-    "response": "Conversation not found"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Provider with ID openai does not exist",
-    "response": "Provider not found"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Model with ID gpt-4-turbo is not configured",
-    "response": "Model not found"
-  }
-}
-```
- |
-| 422 | Request validation failed | [UnprocessableEntityResponse](#unprocessableentityresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Invalid request format. The request body could not be parsed.",
-    "response": "Invalid request format"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Missing required attributes: ['query', 'model', 'provider']",
-    "response": "Missing required attributes"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Invalid attatchment type: must be one of ['text/plain', 'application/json', 'application/yaml', 'application/xml']",
-    "response": "Invalid attribute value"
-  }
-}
-```
- |
-| 429 | Quota limit exceeded | [QuotaExceededResponse](#quotaexceededresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "The token quota for model gpt-4-turbo has been exceeded.",
-    "response": "The model quota has been exceeded"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "User 123 has no available tokens.",
-    "response": "The quota has been exceeded"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Cluster has no available tokens.",
-    "response": "The quota has been exceeded"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Unknown subject 999 has no available tokens.",
-    "response": "The quota has been exceeded"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "User 123 has 5 tokens, but 10 tokens are needed.",
-    "response": "The quota has been exceeded"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Cluster has 500 tokens, but 900 tokens are needed.",
-    "response": "The quota has been exceeded"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Unknown subject 999 has 3 tokens, but 6 tokens are needed.",
-    "response": "The quota has been exceeded"
-  }
-}
-```
- |
-| 500 | Internal server error | [InternalServerErrorResponse](#internalservererrorresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Lightspeed Stack configuration has not been initialized.",
-    "response": "Configuration is not loaded"
-  }
-}
-```
- |
-| 503 | Service unavailable | [ServiceUnavailableResponse](#serviceunavailableresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Connection error while trying to reach backend service.",
-    "response": "Unable to connect to Llama Stack"
-  }
-}
-```
- |
-## POST `/v2/streaming_query`
-
-> **Streaming Query Endpoint Handler V2**
-
-Handle request to the /streaming_query endpoint using Responses API.
-
-This is a wrapper around streaming_query_endpoint_handler_base that provides
-the Responses API specific retrieve_response and response generator functions.
-
-Returns:
-    StreamingResponse: An HTTP streaming response yielding
-    SSE-formatted events for the query lifecycle.
-
-Raises:
-    HTTPException: Returns HTTP 500 if unable to connect to the
-    Llama Stack server.
-
-
-
-
-
-### 📦 Request Body 
-
-[QueryRequest](#queryrequest)
-
-### ✅ Responses
-
-| Status Code | Description | Component |
-|-------------|-------------|-----------|
-| 200 | Streaming response with Server-Sent Events | string
-string |
-| 401 | Unauthorized | [UnauthorizedResponse](#unauthorizedresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "No Authorization header found",
-    "response": "Missing or invalid credentials provided by client"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "No token found in Authorization header",
-    "response": "Missing or invalid credentials provided by client"
-  }
-}
-```
- |
-| 403 | Permission denied | [ForbiddenResponse](#forbiddenresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "User 6789 does not have permission to read conversation with ID 123e4567-e89b-12d3-a456-426614174000",
-    "response": "User does not have permission to perform this action"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "User 6789 is not authorized to access this endpoint.",
-    "response": "User does not have permission to access this endpoint"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "User lacks model_override permission required to override model/provider.",
-    "response": "This instance does not permit overriding model/provider in the query request (missing permission: MODEL_OVERRIDE). Please remove the model and provider fields from your request."
-  }
-}
-```
- |
-| 404 | Resource not found | [NotFoundResponse](#notfoundresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Conversation with ID 123e4567-e89b-12d3-a456-426614174000 does not exist",
-    "response": "Conversation not found"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Provider with ID openai does not exist",
-    "response": "Provider not found"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Model with ID gpt-4-turbo is not configured",
-    "response": "Model not found"
-  }
-}
-```
- |
-| 422 | Request validation failed | [UnprocessableEntityResponse](#unprocessableentityresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Invalid request format. The request body could not be parsed.",
-    "response": "Invalid request format"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Missing required attributes: ['query', 'model', 'provider']",
-    "response": "Missing required attributes"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Invalid attatchment type: must be one of ['text/plain', 'application/json', 'application/yaml', 'application/xml']",
-    "response": "Invalid attribute value"
-  }
-}
-```
- |
-| 429 | Quota limit exceeded | [QuotaExceededResponse](#quotaexceededresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "The token quota for model gpt-4-turbo has been exceeded.",
-    "response": "The model quota has been exceeded"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "User 123 has no available tokens.",
-    "response": "The quota has been exceeded"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Cluster has no available tokens.",
-    "response": "The quota has been exceeded"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Unknown subject 999 has no available tokens.",
-    "response": "The quota has been exceeded"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "User 123 has 5 tokens, but 10 tokens are needed.",
-    "response": "The quota has been exceeded"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Cluster has 500 tokens, but 900 tokens are needed.",
-    "response": "The quota has been exceeded"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Unknown subject 999 has 3 tokens, but 6 tokens are needed.",
-    "response": "The quota has been exceeded"
-  }
-}
-```
- |
-| 500 | Internal server error | [InternalServerErrorResponse](#internalservererrorresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Lightspeed Stack configuration has not been initialized.",
-    "response": "Configuration is not loaded"
-  }
-}
-```
- |
-| 503 | Service unavailable | [ServiceUnavailableResponse](#serviceunavailableresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Connection error while trying to reach backend service.",
-    "response": "Unable to connect to Llama Stack"
-  }
-}
-```
- |
-## GET `/readiness`
-
-> **Readiness Probe Get Method**
-
-Handle the readiness probe endpoint, returning service readiness.
-
-If any provider reports an error status, responds with HTTP 503
-and details of unhealthy providers; otherwise, indicates the
-service is ready.
-
-
-
-
-
-### ✅ Responses
-
-| Status Code | Description | Component |
-|-------------|-------------|-----------|
-| 200 | Successful response | [ReadinessResponse](#readinessresponse) |
-| 401 | Unauthorized | [UnauthorizedResponse](#unauthorizedresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "No Authorization header found",
-    "response": "Missing or invalid credentials provided by client"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "No token found in Authorization header",
-    "response": "Missing or invalid credentials provided by client"
-  }
-}
-```
- |
-| 403 | Permission denied | [ForbiddenResponse](#forbiddenresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "User 6789 is not authorized to access this endpoint.",
-    "response": "User does not have permission to access this endpoint"
-  }
-}
-```
- |
-| 503 | Service unavailable | [ServiceUnavailableResponse](#serviceunavailableresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Connection error while trying to reach backend service.",
-    "response": "Unable to connect to Llama Stack"
-  }
-}
-```
- |
-## GET `/liveness`
-
-> **Liveness Probe Get Method**
-
-Return the liveness status of the service.
-
-Returns:
-    LivenessResponse: Indicates that the service is alive.
-
-
-
-
-
-### ✅ Responses
-
-| Status Code | Description | Component |
-|-------------|-------------|-----------|
-| 200 | Successful response | [LivenessResponse](#livenessresponse) |
-| 401 | Unauthorized | [UnauthorizedResponse](#unauthorizedresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "No Authorization header found",
-    "response": "Missing or invalid credentials provided by client"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "No token found in Authorization header",
-    "response": "Missing or invalid credentials provided by client"
-  }
-}
-```
- |
-| 403 | Permission denied | [ForbiddenResponse](#forbiddenresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "User 6789 is not authorized to access this endpoint.",
-    "response": "User does not have permission to access this endpoint"
-  }
-}
-```
- |
-## POST `/authorized`
-
-> **Authorized Endpoint Handler**
-
-Handle request to the /authorized endpoint.
-
-Process POST requests to the /authorized endpoint, returning
-the authenticated user's ID and username.
-
-Returns:
-    AuthorizedResponse: Contains the user ID and username of the authenticated user.
-
-
-
-
-
-### ✅ Responses
-
-| Status Code | Description | Component |
-|-------------|-------------|-----------|
-| 200 | Successful response | [AuthorizedResponse](#authorizedresponse) |
-| 401 | Unauthorized | [UnauthorizedResponse](#unauthorizedresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "No Authorization header found",
-    "response": "Missing or invalid credentials provided by client"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "No token found in Authorization header",
-    "response": "Missing or invalid credentials provided by client"
-  }
-}
-```
- |
-| 403 | Permission denied | [ForbiddenResponse](#forbiddenresponse)
-
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "User 6789 is not authorized to access this endpoint.",
-    "response": "User does not have permission to access this endpoint"
-  }
-}
-```
- |
-## GET `/metrics`
-
-> **Metrics Endpoint Handler**
-
-Handle request to the /metrics endpoint.
-
-Process GET requests to the /metrics endpoint, returning the
-latest Prometheus metrics in form of a plain text.
-
-Initializes model metrics on the first request if not already
-set up, then responds with the current metrics snapshot in
-Prometheus format.
-
-
-
-
-
-### ✅ Responses
-
-| Status Code | Description | Component |
-|-------------|-------------|-----------|
-| 200 | Successful Response | string |
-| 401 | Unauthorized | ...
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "No Authorization header found",
-    "response": "Missing or invalid credentials provided by client"
-  }
-}
-```
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "No token found in Authorization header",
-    "response": "Missing or invalid credentials provided by client"
-  }
-}
-```
-
-[UnauthorizedResponse](#unauthorizedresponse) |
-| 403 | Permission denied | ...
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "User 6789 is not authorized to access this endpoint.",
-    "response": "User does not have permission to access this endpoint"
-  }
-}
-```
-
-[ForbiddenResponse](#forbiddenresponse) |
-| 500 | Internal server error | ...
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Lightspeed Stack configuration has not been initialized.",
-    "response": "Configuration is not loaded"
-  }
-}
-```
-
-[InternalServerErrorResponse](#internalservererrorresponse) |
-| 503 | Service unavailable | ...
-Examples
-
-
-
-
-
-```json
-{
-  "detail": {
-    "cause": "Connection error while trying to reach backend service.",
-    "response": "Unable to connect to Llama Stack"
-  }
-}
-```
-
-[ServiceUnavailableResponse](#serviceunavailableresponse) |
----
-
-# 📋 Components
-
-
-
-## APIKeyTokenConfiguration
-
-
-API Key Token configuration.
-
-
-| Field | Type | Description |
-|-------|------|-------------|
-| api_key | string |  |
-
-
-## AccessRule
-
-
-Rule defining what actions a role can perform.
-
-
-| Field | Type | Description |
-|-------|------|-------------|
-| role | string | Name of the role |
-| actions | array | Allowed actions for this role |
-
-
-## Action
-
-
-Available actions in the system.
-
-Note: this is not a real model, just an enumeration of all action names.
-
-
-
-
-## Attachment
-
-
-Model representing an attachment that can be send from the UI as part of query.
-
-A list of attachments can be an optional part of 'query' request.
-
-Attributes:
-    attachment_type: The attachment type, like "log", "configuration" etc.
-    content_type: The content type as defined in MIME standard
-    content: The actual attachment content
-
-YAML attachments with **kind** and **metadata/name** attributes will
-be handled as resources with the specified name:
-```
-kind: Pod
-metadata:
-    name: private-reg
-```
-
-
-| Field | Type | Description |
-|-------|------|-------------|
-| attachment_type | string | The attachment type, like 'log', 'configuration' etc. |
-| content_type | string | The content type as defined in MIME standard |
-| content | string | The actual attachment content |
-
-
-## AuthenticationConfiguration
-
-
-Authentication configuration.
-
-
-| Field | Type | Description |
-|-------|------|-------------|
-| module | string |  |
-| skip_tls_verification | boolean |  |
-| k8s_cluster_api |  |  |
-| k8s_ca_cert_path |  |  |
-| jwk_config |  |  |
-| api_key_config |  |  |
-| rh_identity_config |  |  |
-
-
-## AuthorizationConfiguration
-
-
-Authorization configuration.
-
-
-| Field | Type | Description |
-|-------|------|-------------|
-| access_rules | array | Rules for role-based access control |
-
-
-## AuthorizedResponse
-
-
-Model representing a response to an authorization request.
-
-Attributes:
-    user_id: The ID of the logged in user.
-    username: The name of the logged in user.
-    skip_userid_check: Whether to skip the user ID check.
-
-
-| Field | Type | Description |
-|-------|------|-------------|
-| user_id | string | User ID, for example UUID |
-| username | string | User name |
-| skip_userid_check | boolean | Whether to skip the user ID check |
-
-
-## BadRequestResponse
-
-
-400 Bad Request. Invalid resource identifier.
-
-
-| Field | Type | Description |
-|-------|------|-------------|
-| status_code | integer |  |
-| detail |  |  |
-
-
-## ByokRag
-
-
-BYOK (Bring Your Own Knowledge) RAG configuration.
-
-
-| Field | Type | Description |
-|-------|------|-------------|
-| rag_id | string | Unique RAG ID |
-| rag_type | string | Type of RAG database. |
-| embedding_model | string | Embedding model identification |
-| embedding_dimension | integer | Dimensionality of embedding vectors. |
-| vector_db_id | string | Vector DB identification. |
-| db_path | string | Path to RAG database. |
-
-
-## CORSConfiguration
-
-
-CORS configuration.
-
-CORS or 'Cross-Origin Resource Sharing' refers to the situations when a
-frontend running in a browser has JavaScript code that communicates with a
-backend, and the backend is in a different 'origin' than the frontend.
-
-Useful resources:
-
-  - [CORS in FastAPI](https://fastapi.tiangolo.com/tutorial/cors/)
-  - [Wikipedia article](https://en.wikipedia.org/wiki/Cross-origin_resource_sharing)
-  - [What is CORS?](https://dev.to/akshay_chauhan/what-is-cors-explained-8f1)
-
-
-| Field | Type | Description |
-|-------|------|-------------|
-| allow_origins | array | A list of origins allowed for cross-origin requests. An origin is the combination of protocol (http, https), domain (myapp.com, localhost, localhost.tiangolo.com), and port (80, 443, 8080). Use ['*'] to allow all origins. |
-| allow_credentials | boolean | Indicate that cookies should be supported for cross-origin requests |
-| allow_methods | array | A list of HTTP methods that should be allowed for cross-origin requests. You can use ['*'] to allow all standard methods. |
-| allow_headers | array | A list of HTTP request headers that should be supported for cross-origin requests. You can use ['*'] to allow all headers. The Accept, Accept-Language, Content-Language and Content-Type headers are always allowed for simple CORS requests. |
-
-
-## Configuration
-
-
-Global service configuration.
-
-
-| Field | Type | Description |
-|-------|------|-------------|
-| name | string | Name of the service. That value will be used in REST API endpoints. |
-| service |  | This section contains Lightspeed Core Stack service configuration. |
-| llama_stack |  | This section contains Llama Stack configuration. Lightspeed Core Stack service can call Llama Stack in library mode or in server mode. |
-| user_data_collection |  | This section contains configuration for subsystem that collects user data(transcription history and feedbacks). |
-| database |  | Configuration for database to store conversation IDs and other runtime data |
-| mcp_servers | array | MCP (Model Context Protocol) servers provide tools and capabilities to the AI agents. These are configured in this section. Only MCP servers defined in the lightspeed-stack.yaml configuration are available to the agents. Tools configured in the llama-stack run.yaml are not accessible to lightspeed-core agents. |
-| authentication |  | Authentication configuration |
-| authorization |  | Lightspeed Core Stack implements a modular authentication and authorization system with multiple authentication methods. Authorization is configurable through role-based access control. Authentication is handled through selectable modules configured via the module field in the authentication configuration. |
-| customization |  | It is possible to customize Lightspeed Core Stack via this section. System prompt can be customized and also different parts of the service can be replaced by custom Python modules. |
-| inference |  | One LLM provider and one its model might be selected as default ones. When no provider+model pair is specified in REST API calls (query endpoints), the default provider and model are used. |
-| conversation_cache |  |  |
-| byok_rag | array | BYOK RAG configuration. This configuration can be used to reconfigure Llama Stack through its run.yaml configuration file |
-| quota_handlers |  | Quota handlers configuration |
-
-
-## ConfigurationResponse
-
-
-Success response model for the config endpoint.
-
-
-| Field | Type | Description |
-|-------|------|-------------|
-| configuration |  |  |
-
-
-## ConversationData
-
-
-Model representing conversation data returned by cache list operations.
-
-Attributes:
-    conversation_id: The conversation ID
-    topic_summary: The topic summary for the conversation (can be None)
-    last_message_timestamp: The timestamp of the last message in the conversation
-
-
-| Field | Type | Description |
-|-------|------|-------------|
-| conversation_id | string |  |
-| topic_summary |  |  |
-| last_message_timestamp | number |  |
-
-
-## ConversationDeleteResponse
-
-
-Model representing a response for deleting a conversation.
-
-Attributes:
-    conversation_id: The conversation ID (UUID) that was deleted.
-    success: Whether the deletion was successful.
-    response: A message about the deletion result.
-
-
-| Field | Type | Description |
-|-------|------|-------------|
-| conversation_id | string | The conversation ID (UUID) that was deleted. |
-| success | boolean | Whether the deletion was successful. |
-| response | string | A message about the deletion result. |
-
-
-## ConversationDetails
-
-
-Model representing the details of a user conversation.
-
-Attributes:
-    conversation_id: The conversation ID (UUID).
-    created_at: When the conversation was created.
-    last_message_at: When the last message was sent.
-    message_count: Number of user messages in the conversation.
-    last_used_model: The last model used for the conversation.
-    last_used_provider: The provider of the last used model.
-    topic_summary: The topic summary for the conversation.
-
-
-| Field | Type | Description |
-|-------|------|-------------|
-| conversation_id | string | Conversation ID (UUID) |
-| created_at |  | When the conversation was created |
-| last_message_at |  | When the last message was sent |
-| message_count |  | Number of user messages in the conversation |
-| last_used_model |  | Identification of the last model used for the conversation |
-| last_used_provider |  | Identification of the last provider used for the conversation |
-| topic_summary |  | Topic summary for the conversation |
-
-
-## ConversationHistoryConfiguration
-
-
-Conversation history configuration.
-
-
-| Field | Type | Description |
-|-------|------|-------------|
-| type |  | Type of database where the conversation history is to be stored. |
-| memory |  | In-memory cache configuration |
-| sqlite |  | SQLite database configuration |
-| postgres |  | PostgreSQL database configuration |
-
-
-## ConversationResponse
-
-
-Model representing a response for retrieving a conversation.
-
-Attributes:
-    conversation_id: The conversation ID (UUID).
-    chat_history: The simplified chat history as a list of conversation turns.
-
-
-| Field | Type | Description |
-|-------|------|-------------|
-| conversation_id | string | Conversation ID (UUID) |
-| chat_history | array | The simplified chat history as a list of conversation turns |
-
-
-## ConversationUpdateRequest
-
-
-Model representing a request to update a conversation topic summary.
-
-Attributes:
-    topic_summary: The new topic summary for the conversation.
-
-Example:
-    ```python
-    update_request = ConversationUpdateRequest(
-        topic_summary="Discussion about machine learning algorithms"
-    )
-    ```
-
-
-| Field | Type | Description |
-|-------|------|-------------|
-| topic_summary | string | The new topic summary for the conversation |
-
-
-## ConversationUpdateResponse
-
-
-Model representing a response for updating a conversation topic summary.
-
-Attributes:
-    conversation_id: The conversation ID (UUID) that was updated.
-    success: Whether the update was successful.
-    message: A message about the update result.
-
-
-| Field | Type | Description |
-|-------|------|-------------|
-| conversation_id | string | The conversation ID (UUID) that was updated |
-| success | boolean | Whether the update was successful |
-| message | string | A message about the update result |
-
-
-## ConversationsListResponse
-
-
-Model representing a response for listing conversations of a user.
-
-Attributes:
-    conversations: List of conversation details associated with the user.
-
-
-| Field | Type | Description |
-|-------|------|-------------|
-| conversations | array |  |
-
-
-## ConversationsListResponseV2
-
-
-Model representing a response for listing conversations of a user.
-
-Attributes:
-    conversations: List of conversation data associated with the user.
-
-
-| Field | Type | Description |
-|-------|------|-------------|
-| conversations | array |  |
-
-
-## CustomProfile
-
-
-Custom profile customization for prompts and validation.
-
-
-| Field | Type | Description |
-|-------|------|-------------|
-| path | string | Path to Python modules containing custom profile. |
-| prompts | object | Dictionary containing map of system prompts |
-
-
-## Customization
-
-
-Service customization.
-
-
-| Field | Type | Description |
-|-------|------|-------------|
-| profile_path |  |  |
-| disable_query_system_prompt | boolean |  |
-| system_prompt_path |  |  |
-| system_prompt |  |  |
-| custom_profile |  |  |
-
-
-## DatabaseConfiguration
-
-
-Database configuration.
-
-
-| Field | Type | Description |
-|-------|------|-------------|
-| sqlite |  | SQLite database configuration |
-| postgres |  | PostgreSQL database configuration |
-
-
-## DetailModel
-
-
-Nested detail model for error responses.
-
-
-| Field | Type | Description |
-|-------|------|-------------|
-| response | string | Short summary of the error |
-| cause | string | Detailed explanation of what caused the error |
-
-
-## FeedbackCategory
-
-
-Enum representing predefined feedback categories for AI responses.
-
-These categories help provide structured feedback about AI inference quality
-when users provide negative feedback (thumbs down). Multiple categories can
-be selected to provide comprehensive feedback about response issues.
-
-
-
-
-## FeedbackRequest
-
-
-Model representing a feedback request.
-
-Attributes:
-    conversation_id: The required conversation ID (UUID).
-    user_question: The required user question.
-    llm_response: The required LLM response.
-    sentiment: The optional sentiment.
-    user_feedback: The optional user feedback.
-    categories: The optional list of feedback categories (multi-select for negative feedback).
-
-Example:
-    ```python
-    feedback_request = FeedbackRequest(
-        conversation_id="12345678-abcd-0000-0123-456789abcdef",
-        user_question="what are you doing?",
-        user_feedback="This response is not helpful",
-        llm_response="I don't know",
-        sentiment=-1,
-        categories=[FeedbackCategory.INCORRECT, FeedbackCategory.INCOMPLETE]
-    )
-    ```
-
-
-| Field | Type | Description |
-|-------|------|-------------|
-| conversation_id | string | The required conversation ID (UUID) |
-| user_question | string | User question (the query string) |
-| llm_response | string | Response from LLM |
-| sentiment |  | User sentiment, if provided must be -1 or 1 |
-| user_feedback |  | Feedback on the LLM response. |
-| categories |  | List of feedback categories that describe issues with the LLM response (for negative feedback). |
-
-
-## FeedbackResponse
-
-
-Model representing a response to a feedback request.
-
-Attributes:
-    response: The response of the feedback request.
-
-
-| Field | Type | Description |
-|-------|------|-------------|
-| response | string | The response of the feedback request. |
-
-
-## FeedbackStatusUpdateRequest
-
-
-Model representing a feedback status update request.
-
-Attributes:
-    status: Value of the desired feedback enabled state.
-
-Example:
-    ```python
-    feedback_request = FeedbackRequest(
-        status=false
-    )
-    ```
-
-
-| Field | Type | Description |
-|-------|------|-------------|
-| status | boolean | Desired state of feedback enablement, must be False or True |
-
-
-## FeedbackStatusUpdateResponse
-
-
-Model representing a response to a feedback status update request.
-
-Attributes:
-    status: The previous and current status of the service and who updated it.
-
-
-| Field | Type | Description |
-|-------|------|-------------|
-| status | object |  |
-
-
-## ForbiddenResponse
-
-
-403 Forbidden. Access denied.
-
-
-| Field | Type | Description |
-|-------|------|-------------|
-| status_code | integer |  |
-| detail |  |  |
-
-
-## HTTPValidationError
-
-
-
-| Field | Type | Description |
-|-------|------|-------------|
-| detail | array |  |
-
-
-## InMemoryCacheConfig
-
-
-In-memory cache configuration.
-
-
-| Field | Type | Description |
-|-------|------|-------------|
-| max_entries | integer | Maximum number of entries stored in the in-memory cache |
-
-
-## InferenceConfiguration
-
-
-Inference configuration.
-
-
-| Field | Type | Description |
-|-------|------|-------------|
-| default_model |  | Identification of default model used when no other model is specified. |
-| default_provider |  | Identification of default provider used when no other model is specified. |
-
-
-## InfoResponse
-
-
-Model representing a response to an info request.
-
-Attributes:
-    name: Service name.
-    service_version: Service version.
-    llama_stack_version: Llama Stack version.
-
-
-| Field | Type | Description |
-|-------|------|-------------|
-| name | string | Service name |
-| service_version | string | Service version |
-| llama_stack_version | string | Llama Stack version |
-
-
-## InternalServerErrorResponse
-
-
-500 Internal Server Error.
-
-
-| Field | Type | Description |
-|-------|------|-------------|
-| status_code | integer |  |
-| detail |  |  |
-
-
-## JsonPathOperator
-
-
-Supported operators for JSONPath evaluation.
-
-Note: this is not a real model, just an enumeration of all supported JSONPath operators.
-
-
-
-
-## JwkConfiguration
-
-
-JWK (JSON Web Key) configuration.
-
-A JSON Web Key (JWK) is a JavaScript Object Notation (JSON) data structure
-that represents a cryptographic key.
-
-Useful resources:
-
-  - [JSON Web Key](https://openid.net/specs/draft-jones-json-web-key-03.html)
-  - [RFC 7517](https://www.rfc-editor.org/rfc/rfc7517)
-
-
-| Field | Type | Description |
-|-------|------|-------------|
-| url | string | HTTPS URL of the JWK (JSON Web Key) set used to validate JWTs. |
-| jwt_configuration |  | JWT (JSON Web Token) configuration |
-
-
-## JwtConfiguration
-
-
-JWT (JSON Web Token) configuration.
-
-JSON Web Token (JWT) is a compact, URL-safe means of representing
-claims to be transferred between two parties.  The claims in a JWT
-are encoded as a JSON object that is used as the payload of a JSON
-Web Signature (JWS) structure or as the plaintext of a JSON Web
-Encryption (JWE) structure, enabling the claims to be digitally
-signed or integrity protected with a Message Authentication Code
-(MAC) and/or encrypted.
-
-Useful resources:
-
-  - [JSON Web Token](https://en.wikipedia.org/wiki/JSON_Web_Token)
-  - [RFC 7519](https://datatracker.ietf.org/doc/html/rfc7519)
-  - [JSON Web Tokens](https://auth0.com/docs/secure/tokens/json-web-tokens)
-
-
-| Field | Type | Description |
-|-------|------|-------------|
-| user_id_claim | string | JWT claim name that uniquely identifies the user (subject ID). |
-| username_claim | string | JWT claim name that provides the human-readable username. |
-| role_rules | array | Rules for extracting roles from JWT claims |
-
-
-## JwtRoleRule
-
-
-Rule for extracting roles from JWT claims.
-
-
-| Field | Type | Description |
-|-------|------|-------------|
-| jsonpath | string | JSONPath expression to evaluate against the JWT payload |
-| operator |  | JSON path comparison operator |
-| negate | boolean | If set to true, the meaning of the rule is negated |
-| value |  | Value to compare against |
-| roles | array | Roles to be assigned if the rule matches |
-
-
-## LivenessResponse
-
-
-Model representing a response to a liveness request.
-
-Attributes:
-    alive: If app is alive.
-
-
-| Field | Type | Description |
-|-------|------|-------------|
-| alive | boolean | Flag indicating that the app is alive |
-
-
-## LlamaStackConfiguration
-
-
-Llama stack configuration.
-
-Llama Stack is a comprehensive system that provides a uniform set of tools
-for building, scaling, and deploying generative AI applications, enabling
-developers to create, integrate, and orchestrate multiple AI services and
-capabilities into an adaptable setup.
-
-Useful resources:
-
-  - [Llama Stack](https://www.llama.com/products/llama-stack/)
-  - [Python Llama Stack client](https://github.com/llamastack/llama-stack-client-python)
-  - [Build AI Applications with Llama Stack](https://llamastack.github.io/)
-
-
-| Field | Type | Description |
-|-------|------|-------------|
-| url |  | URL to Llama Stack service; used when library mode is disabled |
-| api_key |  | API key to access Llama Stack service |
-| use_as_library_client |  | When set to true Llama Stack will be used in library mode, not in server mode (default) |
-| library_client_config_path |  | Path to configuration file used when Llama Stack is run in library mode |
-
-
-## ModelContextProtocolServer
-
-
-Model context protocol server configuration.
-
-MCP (Model Context Protocol) servers provide tools and
-capabilities to the AI agents. These are configured by this structure.
-Only MCP servers defined in the lightspeed-stack.yaml configuration are
-available to the agents. Tools configured in the llama-stack run.yaml
-are not accessible to lightspeed-core agents.
-
-Useful resources:
-
-- [Model Context Protocol](https://modelcontextprotocol.io/docs/getting-started/intro)
-- [MCP FAQs](https://modelcontextprotocol.io/faqs)
-- [Wikipedia article](https://en.wikipedia.org/wiki/Model_Context_Protocol)
-
-
-| Field | Type | Description |
-|-------|------|-------------|
-| name | string | MCP server name that must be unique |
-| provider_id | string | MCP provider identification |
-| url | string | URL of the MCP server |
-
-
-## ModelsResponse
-
-
-Model representing a response to models request.
-
-
-| Field | Type | Description |
-|-------|------|-------------|
-| models | array | List of models available |
-
-
-## NotFoundResponse
-
-
-404 Not Found - Resource does not exist.
-
-
-| Field | Type | Description |
-|-------|------|-------------|
-| status_code | integer |  |
-| detail |  |  |
-
-
-## PostgreSQLDatabaseConfiguration
-
-
-PostgreSQL database configuration.
-
-PostgreSQL database is used by Lightspeed Core Stack service for storing information about
-conversation IDs. It can also be leveraged to store conversation history and information
-about quota usage.
-
-Useful resources:
-
-- [Psycopg: connection classes](https://www.psycopg.org/psycopg3/docs/api/connections.html)
-- [PostgreSQL connection strings](https://www.connectionstrings.com/postgresql/)
-- [How to Use PostgreSQL in Python](https://www.freecodecamp.org/news/postgresql-in-python/)
-
-
-| Field | Type | Description |
-|-------|------|-------------|
-| host | string | Database server host or socket directory |
-| port | integer | Database server port |
-| db | string | Database name to connect to |
-| user | string | Database user name used to authenticate |
-| password | string | Password used to authenticate |
-| namespace |  | Database namespace |
-| ssl_mode | string | SSL mode |
-| gss_encmode | string | This option determines whether or with what priority a secure GSS TCP/IP connection will be negotiated with the server. |
-| ca_cert_path |  | Path to CA certificate |
-
-
-## ProviderHealthStatus
-
-
-Model representing the health status of a provider.
-
-Attributes:
-    provider_id: The ID of the provider.
-    status: The health status ('ok', 'unhealthy', 'not_implemented').
-    message: Optional message about the health status.
-
-
-| Field | Type | Description |
-|-------|------|-------------|
-| provider_id | string | The ID of the provider |
-| status | string | The health status |
-| message |  | Optional message about the health status |
-
-
-## ProviderResponse
-
-
-Model representing a response to get specific provider request.
-
-
-| Field | Type | Description |
-|-------|------|-------------|
-| api | string | The API this provider implements |
-| config | object | Provider configuration parameters |
-| health | object | Current health status of the provider |
-| provider_id | string | Unique provider identifier |
-| provider_type | string | Provider implementation type |
-
-
-## ProvidersListResponse
-
-
-Model representing a response to providers request.
-
-
-| Field | Type | Description |
-|-------|------|-------------|
-| providers | object | List of available API types and their corresponding providers |
-
-
-## QueryRequest
-
-
-Model representing a request for the LLM (Language Model).
-
-Attributes:
-    query: The query string.
-    conversation_id: The optional conversation ID (UUID).
-    provider: The optional provider.
-    model: The optional model.
-    system_prompt: The optional system prompt.
-    attachments: The optional attachments.
-    no_tools: Whether to bypass all tools and MCP servers (default: False).
-    generate_topic_summary: Whether to generate topic summary for new conversations.
-    media_type: The optional media type for response format (application/json or text/plain).
-
-Example:
-    ```python
-    query_request = QueryRequest(query="Tell me about Kubernetes")
-    ```
-
-
-| Field | Type | Description |
-|-------|------|-------------|
-| query | string | The query string |
-| conversation_id |  | The optional conversation ID (UUID) |
-| provider |  | The optional provider |
-| model |  | The optional model |
-| system_prompt |  | The optional system prompt. |
-| attachments |  | The optional list of attachments. |
-| no_tools |  | Whether to bypass all tools and MCP servers |
-| generate_topic_summary |  | Whether to generate topic summary for new conversations |
-| media_type |  | Media type for the response format |
-
-
-## QueryResponse
-
-
-Model representing LLM response to a query.
-
-Attributes:
-    conversation_id: The optional conversation ID (UUID).
-    response: The response.
-    rag_chunks: List of RAG chunks used to generate the response.
-    referenced_documents: The URLs and titles for the documents used to generate the response.
-    tool_calls: List of tool calls made during response generation.
-    truncated: Whether conversation history was truncated.
-    input_tokens: Number of tokens sent to LLM.
-    output_tokens: Number of tokens received from LLM.
-    available_quotas: Quota available as measured by all configured quota limiters.
-
-
-| Field | Type | Description |
-|-------|------|-------------|
-| conversation_id |  | The optional conversation ID (UUID) |
-| response | string | Response from LLM |
-| rag_chunks | array | List of RAG chunks used to generate the response |
-| tool_calls |  | List of tool calls made during response generation |
-| referenced_documents | array | List of documents referenced in generating the response |
-| truncated | boolean | Whether conversation history was truncated |
-| input_tokens | integer | Number of tokens sent to LLM |
-| output_tokens | integer | Number of tokens received from LLM |
-| available_quotas | object | Quota available as measured by all configured quota limiters |
-
-
-## QuotaExceededResponse
-
-
-429 Too Many Requests - Quota limit exceeded.
-
-
-| Field | Type | Description |
-|-------|------|-------------|
-| status_code | integer |  |
-| detail |  |  |
-
-
-## QuotaHandlersConfiguration
-
-
-Quota limiter configuration.
-
-It is possible to limit quota usage per user or per service or services
-(that typically run in one cluster). Each limit is configured as a separate
-_quota limiter_. It can be of type `user_limiter` or `cluster_limiter`
-(which is name that makes sense in OpenShift deployment).
-
-
-| Field | Type | Description |
-|-------|------|-------------|
-| sqlite |  | SQLite database configuration |
-| postgres |  | PostgreSQL database configuration |
-| limiters | array | Quota limiters configuration |
-| scheduler |  | Quota scheduler configuration |
-| enable_token_history | boolean | Enables storing information about token usage history |
-
-
-## QuotaLimiterConfiguration
-
-
-Configuration for one quota limiter.
-
-There are three configuration options for each limiter:
-
-1. ``period`` is specified in a human-readable form, see
-   https://www.postgresql.org/docs/current/datatype-datetime.html#DATATYPE-INTERVAL-INPUT
-   for all possible options. When the end of the period is reached, the
-   quota is reset or increased.
-2. ``initial_quota`` is the value set at the beginning of the period.
-3. ``quota_increase`` is the value (if specified) used to increase the
-   quota when the period is reached.
-
-There are two basic use cases:
-
-1. When the quota needs to be reset to a specific value periodically (for
-   example on a weekly or monthly basis), set ``initial_quota`` to the
-   required value.
-2. When the quota needs to be increased by a specific value periodically
-   (for example on a daily basis), set ``quota_increase``.
-
-
-| Field | Type | Description |
-|-------|------|-------------|
-| type | string | Quota limiter type, either user_limiter or cluster_limiter |
-| name | string | Human readable quota limiter name |
-| initial_quota | integer | Quota set at beginning of the period |
-| quota_increase | integer | Delta value used to increase quota when period is reached |
-| period | string | Period specified in human readable form |
-
-
-## QuotaSchedulerConfiguration
-
-
-Quota scheduler configuration.
-
-
-| Field | Type | Description |
-|-------|------|-------------|
-| period | integer | Quota scheduler period specified in seconds |
-
-
-## RAGChunk
-
-
-Model representing a RAG chunk used in the response.
-
-
-| Field | Type | Description |
-|-------|------|-------------|
-| content | string | The content of the chunk |
-| source |  | Source document or URL |
-| score |  | Relevance score |
-
-
-## RAGInfoResponse
-
-
-Model representing a response with information about RAG DB.
-
-
-| Field | Type | Description |
-|-------|------|-------------|
-| id | string | Vector DB unique ID |
-| name |  | Human readable vector DB name |
-| created_at | integer | When the vector store was created, represented as Unix time |
-| last_active_at |  | When the vector store was last active, represented as Unix time |
-| usage_bytes | integer | Storage byte(s) used by this vector DB |
-| expires_at |  | When the vector store expires, represented as Unix time |
-| object | string | Object type |
-| status | string | Vector DB status |
-
-
-## RAGListResponse
-
-
-Model representing a response to list RAGs request.
-
-
-| Field | Type | Description |
-|-------|------|-------------|
-| rags | array | List of RAG identifiers |
-
-
-## RHIdentityConfiguration
-
-
-Red Hat Identity authentication configuration.
-
-
-| Field | Type | Description |
-|-------|------|-------------|
-| required_entitlements |  | List of all required entitlements. |
-
-
-## ReadinessResponse
-
-
-Model representing response to a readiness request.
-
-Attributes:
-    ready: If service is ready.
-    reason: The reason for the readiness.
-    providers: List of unhealthy providers in case of readiness failure.
-
-
-| Field | Type | Description |
-|-------|------|-------------|
-| ready | boolean | Flag indicating if service is ready |
-| reason | string | The reason for the readiness |
-| providers | array | List of unhealthy providers in case of readiness failure. |
-
-
-## ReferencedDocument
-
-
-Model representing a document referenced in generating a response.
-
-Attributes:
-    doc_url: Url to the referenced doc.
-    doc_title: Title of the referenced doc.
-
-
-| Field | Type | Description |
-|-------|------|-------------|
-| doc_url |  | URL of the referenced document |
-| doc_title |  | Title of the referenced document |
-
-
-## SQLiteDatabaseConfiguration
-
-
-SQLite database configuration.
-
-
-| Field | Type | Description |
-|-------|------|-------------|
-| db_path | string | Path to file where SQLite database is stored |
-
-
-## ServiceConfiguration
-
-
-Service configuration.
-
-Lightspeed Core Stack is a REST API service that accepts requests
-on a specified hostname and port. It is also possible to enable
-authentication and specify the number of Uvicorn workers. When more
-workers are specified, the service can handle requests concurrently.
-
-
-| Field | Type | Description |
-|-------|------|-------------|
-| host | string | Service hostname |
-| port | integer | Service port |
-| auth_enabled | boolean | Enables the authentication subsystem |
-| workers | integer | Number of Uvicorn worker processes to start |
-| color_log | boolean | Enables colorized logging |
-| access_log | boolean | Enables logging of all access information |
-| tls_config |  | Transport Layer Security configuration for HTTPS support |
-| cors |  | Cross-Origin Resource Sharing configuration for cross-domain requests |
-
-
-## ServiceUnavailableResponse
-
-
-503 Backend Unavailable.
-
-
-| Field | Type | Description |
-|-------|------|-------------|
-| status_code | integer |  |
-| detail |  |  |
-
-
-## ShieldsResponse
-
-
-Model representing a response to shields request.
-
-
-| Field | Type | Description |
-|-------|------|-------------|
-| shields | array | List of shields available |
-
-
-## StatusResponse
-
-
-Model representing a response to a status request.
-
-Attributes:
-    functionality: The functionality of the service.
-    status: The status of the service.
-
-
-| Field | Type | Description |
-|-------|------|-------------|
-| functionality | string | The functionality of the service |
-| status | object | The status of the service |
-
-
-## TLSConfiguration
-
-
-TLS configuration.
-
-Transport Layer Security (TLS) is a cryptographic protocol designed to
-provide communications security over a computer network, such as the
-Internet. The protocol is widely used in applications such as email,
-instant messaging, and voice over IP, but its use in securing HTTPS remains
-the most publicly visible.
-
-Useful resources:
-
-  - [FastAPI HTTPS Deployment](https://fastapi.tiangolo.com/deployment/https/)
-  - [Transport Layer Security Overview](https://en.wikipedia.org/wiki/Transport_Layer_Security)
-  - [What is TLS](https://www.ssltrust.eu/learning/ssl/transport-layer-security-tls)
-
-
-| Field | Type | Description |
-|-------|------|-------------|
-| tls_certificate_path |  | SSL/TLS certificate file path for HTTPS support. |
-| tls_key_path |  | SSL/TLS private key file path for HTTPS support. |
-| tls_key_password |  | Path to file containing the password to decrypt the SSL/TLS private key. |
-
-
-## ToolCall
-
-
-Model representing a tool call made during response generation.
-
-
-| Field | Type | Description |
-|-------|------|-------------|
-| tool_name | string | Name of the tool called |
-| arguments | object | Arguments passed to the tool |
-| result |  | Result from the tool |
-
-
-## ToolsResponse
-
-
-Model representing a response to tools request.
-
-
-| Field | Type | Description |
-|-------|------|-------------|
-| tools | array | List of tools available from all configured MCP servers and built-in toolgroups |
-
-
-## UnauthorizedResponse
-
-
-401 Unauthorized - Missing or invalid credentials.
-
-
-| Field | Type | Description |
-|-------|------|-------------|
-| status_code | integer |  |
-| detail |  |  |
-
-
-## UnprocessableEntityResponse
-
-
-422 Unprocessable Entity - Request validation failed.
-
-
-| Field | Type | Description |
-|-------|------|-------------|
-| status_code | integer |  |
-| detail |  |  |
-
-
-## UserDataCollection
-
-
-User data collection configuration.
-
-
-| Field | Type | Description |
-|-------|------|-------------|
-| feedback_enabled | boolean | When set to true the user feedback is stored and later sent for analysis. |
-| feedback_storage |  | Path to directory where feedback will be saved for further processing. |
-| transcripts_enabled | boolean | When set to true the conversation history is stored and later sent for analysis. |
-| transcripts_storage |  | Path to directory where conversation history will be saved for further processing. |
-
-
-## ValidationError
-
-
-
-| Field | Type | Description |
-|-------|------|-------------|
-| loc | array |  |
-| msg | string |  |
-| type | string |  |