diff --git a/docs/openapi.md b/docs/openapi.md index f32545e3a..1dcc57cab 100644 --- a/docs/openapi.md +++ b/docs/openapi.md @@ -1020,12 +1020,12 @@ Examples | 422 | Validation Error | [HTTPValidationError](#httpvalidationerror) | ## POST `/v1/query` -> **Query Endpoint Handler** +> **Query Endpoint Handler V1** -Handle request to the /query endpoint using Agent API. +Handle request to the /query endpoint using Responses API. This is a wrapper around query_endpoint_handler_base that provides -the Agent API specific retrieve_response and get_topic_summary functions. +the Responses API specific retrieve_response and get_topic_summary functions. Returns: QueryResponse: Contains the conversation ID and the LLM-generated response. @@ -1320,20 +1320,27 @@ Examples | ## POST `/v1/streaming_query` -> **Streaming Query Endpoint Handler** +> **Streaming Query Endpoint Handler V1** -Handle request to the /streaming_query endpoint using Agent API. +Handle request to the /streaming_query endpoint using Responses API. -This is a wrapper around streaming_query_endpoint_handler_base that provides -the Agent API specific retrieve_response and response generator functions. +Returns a streaming response using Server-Sent Events (SSE) format with +content type text/event-stream. Returns: StreamingResponse: An HTTP streaming response yielding - SSE-formatted events for the query lifecycle. + SSE-formatted events for the query lifecycle with content type + text/event-stream. Raises: - HTTPException: Returns HTTP 500 if unable to connect to the - Llama Stack server. + HTTPException: + - 401: Unauthorized - Missing or invalid credentials + - 403: Forbidden - Insufficient permissions or model override not allowed + - 404: Not Found - Conversation, model, or provider not found + - 422: Unprocessable Entity - Request validation failed + - 429: Too Many Requests - Quota limit exceeded + - 500: Internal Server Error - Configuration not loaded or other server errors + - 503: Service Unavailable - Unable to connect to Llama Stack backend @@ -1347,7 +1354,7 @@ Raises: | Status Code | Description | Component | |-------------|-------------|-----------| -| 200 | Streaming response (Server-Sent Events) | ...string | +| 200 | Successful response | string | | 401 | Unauthorized | [UnauthorizedResponse](#unauthorizedresponse) Examples @@ -1960,7 +1967,7 @@ Examples | 422 | Validation Error | [HTTPValidationError](#httpvalidationerror) | ## GET `/v1/conversations` -> **Get Conversations List Endpoint Handler** +> **Conversations List Endpoint Handler V1** Handle request to retrieve all conversations for the authenticated user. @@ -2067,23 +2074,25 @@ Examples | ## GET `/v1/conversations/{conversation_id}` -> **Get Conversation Endpoint Handler** +> **Conversation Get Endpoint Handler V1** -Handle request to retrieve a conversation by ID. +Handle request to retrieve a conversation by ID using Conversations API. -Retrieve a conversation's chat history by its ID. Then fetches -the conversation session from the Llama Stack backend, -simplifies the session data to essential chat history, and -returns it in a structured response. Raises HTTP 400 for -invalid IDs, 404 if not found, 503 if the backend is -unavailable, and 500 for unexpected errors. +Retrieve a conversation's chat history by its ID using the LlamaStack +Conversations API. This endpoint fetches the conversation items from +the backend, simplifies them to essential chat history, and returns +them in a structured response. 
Raises HTTP 400 for invalid IDs, 404 +if not found, 503 if the backend is unavailable, and 500 for +unexpected errors. -Parameters: - conversation_id (str): Unique identifier of the conversation to retrieve. +Args: + request: The FastAPI request object + conversation_id: Unique identifier of the conversation to retrieve + auth: Authentication tuple from dependency Returns: ConversationResponse: Structured response containing the conversation - ID and simplified chat history. + ID and simplified chat history @@ -2240,17 +2249,22 @@ Examples | 422 | Validation Error | [HTTPValidationError](#httpvalidationerror) | ## DELETE `/v1/conversations/{conversation_id}` -> **Delete Conversation Endpoint Handler** +> **Conversation Delete Endpoint Handler V1** -Handle request to delete a conversation by ID. +Handle request to delete a conversation by ID using Conversations API. Validates the conversation ID format and attempts to delete the -corresponding session from the Llama Stack backend. Raises HTTP -errors for invalid IDs, not found conversations, connection +conversation from the Llama Stack backend using the Conversations API. +Raises HTTP errors for invalid IDs, not found conversations, connection issues, or unexpected failures. +Args: + request: The FastAPI request object + conversation_id: Unique identifier of the conversation to delete + auth: Authentication tuple from dependency + Returns: - ConversationDeleteResponse: Response indicating the result of the deletion operation. + ConversationDeleteResponse: Response indicating the result of the deletion operation @@ -2358,6 +2372,152 @@ Examples +```json +{ + "detail": { + "cause": "User 6789 is not authorized to access this endpoint.", + "response": "User does not have permission to access this endpoint" + } +} +``` + | +| 500 | Internal server error | [InternalServerErrorResponse](#internalservererrorresponse) + +Examples + + + + + +```json +{ + "detail": { + "cause": "Lightspeed Stack configuration has not been initialized.", + "response": "Configuration is not loaded" + } +} +``` + + + + +```json +{ + "detail": { + "cause": "Failed to query the database", + "response": "Database query failed" + } +} +``` + | +| 503 | Service unavailable | [ServiceUnavailableResponse](#serviceunavailableresponse) + +Examples + + + + + +```json +{ + "detail": { + "cause": "Connection error while trying to reach backend service.", + "response": "Unable to connect to Llama Stack" + } +} +``` + | +| 422 | Validation Error | [HTTPValidationError](#httpvalidationerror) | +## PUT `/v1/conversations/{conversation_id}` + +> **Conversation Update Endpoint Handler V1** + +Handle request to update a conversation metadata using Conversations API. + +Updates the conversation metadata (including topic summary) in both the +LlamaStack backend using the Conversations API and the local database. 
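+
+A minimal client-side sketch (assumptions for illustration: the service listens
+on `http://localhost:8080`, authentication uses a bearer token, and
+[ConversationUpdateRequest](#conversationupdaterequest) carries the new summary
+in a `topic_summary` field — check the schema for the exact field name):
+
+```python
+import httpx
+
+# Base URL, token, and the "topic_summary" field name are placeholders;
+# adjust them to match your deployment and the ConversationUpdateRequest schema.
+conversation_id = "123e4567-e89b-12d3-a456-426614174000"
+
+resp = httpx.put(
+    f"http://localhost:8080/v1/conversations/{conversation_id}",
+    headers={"Authorization": "Bearer <token>"},
+    json={"topic_summary": "Cluster upgrade troubleshooting"},
+)
+resp.raise_for_status()  # 400/401/403/404/5xx raise here
+print(resp.json())       # ConversationUpdateResponse payload
+```
+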
+ +Args: + request: The FastAPI request object + conversation_id: Unique identifier of the conversation to update + update_request: Request containing the topic summary to update + auth: Authentication tuple from dependency + +Returns: + ConversationUpdateResponse: Response indicating the result of the update operation + + + +### 🔗 Parameters + +| Name | Type | Required | Description | +|------|------|----------|-------------| +| conversation_id | string | True | | + + +### 📦 Request Body + +[ConversationUpdateRequest](#conversationupdaterequest) + +### ✅ Responses + +| Status Code | Description | Component | +|-------------|-------------|-----------| +| 200 | Successful response | [ConversationUpdateResponse](#conversationupdateresponse) | +| 400 | Invalid request format | [BadRequestResponse](#badrequestresponse) + +Examples + + + + + +```json +{ + "detail": { + "cause": "The conversation ID 123e4567-e89b-12d3-a456-426614174000 has invalid format.", + "response": "Invalid conversation ID format" + } +} +``` + | +| 401 | Unauthorized | [UnauthorizedResponse](#unauthorizedresponse) + +Examples + + + + + +```json +{ + "detail": { + "cause": "No Authorization header found", + "response": "Missing or invalid credentials provided by client" + } +} +``` + + + + +```json +{ + "detail": { + "cause": "No token found in Authorization header", + "response": "Missing or invalid credentials provided by client" + } +} +``` + | +| 403 | Permission denied | [ForbiddenResponse](#forbiddenresponse) + +Examples + + + + + ```json { "detail": { @@ -2758,23 +2918,6 @@ Examples "response": "User does not have permission to access this endpoint" } } -``` - | -| 404 | Resource not found | [NotFoundResponse](#notfoundresponse) - -Examples - - - - - -```json -{ - "detail": { - "cause": "Conversation with ID 123e4567-e89b-12d3-a456-426614174000 does not exist", - "response": "Conversation not found" - } -} ``` | | 500 | Internal server error | [InternalServerErrorResponse](#internalservererrorresponse) @@ -2941,31 +3084,25 @@ Examples ``` | | 422 | Validation Error | [HTTPValidationError](#httpvalidationerror) | -## POST `/v2/query` - -> **Query Endpoint Handler V2** - -Handle request to the /query endpoint using Responses API. - -This is a wrapper around query_endpoint_handler_base that provides -the Responses API specific retrieve_response and get_topic_summary functions. +## GET `/readiness` -Returns: - QueryResponse: Contains the conversation ID and the LLM-generated response. +> **Readiness Probe Get Method** +Handle the readiness probe endpoint, returning service readiness. +If any provider reports an error status, responds with HTTP 503 +and details of unhealthy providers; otherwise, indicates the +service is ready. 
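+
+A minimal probe sketch (assumptions for illustration: the service listens on
+`http://localhost:8080` and authenticates with a bearer token; adjust both for
+your deployment):
+
+```python
+import httpx
+
+# Base URL and token are placeholders for this sketch.
+# 200 -> service is ready; 503 -> the body lists the unhealthy providers.
+resp = httpx.get(
+    "http://localhost:8080/readiness",
+    headers={"Authorization": "Bearer <token>"},
+    timeout=5.0,
+)
+print(resp.status_code, resp.json())
+```
+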
-### 📦 Request Body -[QueryRequest](#queryrequest) ### ✅ Responses | Status Code | Description | Component | |-------------|-------------|-----------| -| 200 | Successful response | [QueryResponse](#queryresponse) | +| 200 | Successful response | [ReadinessResponse](#readinessresponse) | | 401 | Unauthorized | [UnauthorizedResponse](#unauthorizedresponse) Examples @@ -3006,11 +3143,16 @@ Examples ```json { "detail": { - "cause": "User 6789 does not have permission to read conversation with ID 123e4567-e89b-12d3-a456-426614174000", - "response": "User does not have permission to perform this action" + "cause": "User 6789 is not authorized to access this endpoint.", + "response": "User does not have permission to access this endpoint" } } ``` + | +| 503 | Service unavailable | [ServiceUnavailableResponse](#serviceunavailableresponse) + +Examples + @@ -3018,40 +3160,34 @@ Examples ```json { "detail": { - "cause": "User 6789 is not authorized to access this endpoint.", - "response": "User does not have permission to access this endpoint" + "cause": "Connection error while trying to reach backend service.", + "response": "Unable to connect to Llama Stack" } } ``` + | +## GET `/liveness` +> **Liveness Probe Get Method** +Return the liveness status of the service. +Returns: + LivenessResponse: Indicates that the service is alive. -```json -{ - "detail": { - "cause": "User lacks model_override permission required to override model/provider.", - "response": "This instance does not permit overriding model/provider in the query request (missing permission: MODEL_OVERRIDE). Please remove the model and provider fields from your request." - } -} -``` - | -| 404 | Resource not found | [NotFoundResponse](#notfoundresponse) -Examples +### ✅ Responses +| Status Code | Description | Component | +|-------------|-------------|-----------| +| 200 | Successful response | [LivenessResponse](#livenessresponse) | +| 401 | Unauthorized | [UnauthorizedResponse](#unauthorizedresponse) + +Examples -```json -{ - "detail": { - "cause": "Conversation with ID 123e4567-e89b-12d3-a456-426614174000 does not exist", - "response": "Conversation not found" - } -} -``` @@ -3059,8 +3195,8 @@ Examples ```json { "detail": { - "cause": "Provider with ID openai does not exist", - "response": "Provider not found" + "cause": "No Authorization header found", + "response": "Missing or invalid credentials provided by client" } } ``` @@ -3071,13 +3207,13 @@ Examples ```json { "detail": { - "cause": "Model with ID gpt-4-turbo is not configured", - "response": "Model not found" + "cause": "No token found in Authorization header", + "response": "Missing or invalid credentials provided by client" } } ``` | -| 422 | Request validation failed | [UnprocessableEntityResponse](#unprocessableentityresponse) +| 403 | Permission denied | [ForbiddenResponse](#forbiddenresponse) Examples @@ -3088,619 +3224,20 @@ Examples ```json { "detail": { - "cause": "Invalid request format. The request body could not be parsed.", - "response": "Invalid request format" + "cause": "User 6789 is not authorized to access this endpoint.", + "response": "User does not have permission to access this endpoint" } } ``` + | +## POST `/authorized` +> **Authorized Endpoint Handler** +Handle request to the /authorized endpoint. 
- -```json -{ - "detail": { - "cause": "Missing required attributes: ['query', 'model', 'provider']", - "response": "Missing required attributes" - } -} -``` - - - - -```json -{ - "detail": { - "cause": "Invalid attatchment type: must be one of ['text/plain', 'application/json', 'application/yaml', 'application/xml']", - "response": "Invalid attribute value" - } -} -``` - | -| 429 | Quota limit exceeded | [QuotaExceededResponse](#quotaexceededresponse) - -Examples - - - - - -```json -{ - "detail": { - "cause": "The token quota for model gpt-4-turbo has been exceeded.", - "response": "The model quota has been exceeded" - } -} -``` - - - - -```json -{ - "detail": { - "cause": "User 123 has no available tokens.", - "response": "The quota has been exceeded" - } -} -``` - - - - -```json -{ - "detail": { - "cause": "Cluster has no available tokens.", - "response": "The quota has been exceeded" - } -} -``` - - - - -```json -{ - "detail": { - "cause": "Unknown subject 999 has no available tokens.", - "response": "The quota has been exceeded" - } -} -``` - - - - -```json -{ - "detail": { - "cause": "User 123 has 5 tokens, but 10 tokens are needed.", - "response": "The quota has been exceeded" - } -} -``` - - - - -```json -{ - "detail": { - "cause": "Cluster has 500 tokens, but 900 tokens are needed.", - "response": "The quota has been exceeded" - } -} -``` - - - - -```json -{ - "detail": { - "cause": "Unknown subject 999 has 3 tokens, but 6 tokens are needed.", - "response": "The quota has been exceeded" - } -} -``` - | -| 500 | Internal server error | [InternalServerErrorResponse](#internalservererrorresponse) - -Examples - - - - - -```json -{ - "detail": { - "cause": "Lightspeed Stack configuration has not been initialized.", - "response": "Configuration is not loaded" - } -} -``` - | -| 503 | Service unavailable | [ServiceUnavailableResponse](#serviceunavailableresponse) - -Examples - - - - - -```json -{ - "detail": { - "cause": "Connection error while trying to reach backend service.", - "response": "Unable to connect to Llama Stack" - } -} -``` - | -## POST `/v2/streaming_query` - -> **Streaming Query Endpoint Handler V2** - -Handle request to the /streaming_query endpoint using Responses API. - -This is a wrapper around streaming_query_endpoint_handler_base that provides -the Responses API specific retrieve_response and response generator functions. - -Returns: - StreamingResponse: An HTTP streaming response yielding - SSE-formatted events for the query lifecycle. - -Raises: - HTTPException: Returns HTTP 500 if unable to connect to the - Llama Stack server. 
- - - - - -### 📦 Request Body - -[QueryRequest](#queryrequest) - -### ✅ Responses - -| Status Code | Description | Component | -|-------------|-------------|-----------| -| 200 | Streaming response with Server-Sent Events | string -string | -| 401 | Unauthorized | [UnauthorizedResponse](#unauthorizedresponse) - -Examples - - - - - -```json -{ - "detail": { - "cause": "No Authorization header found", - "response": "Missing or invalid credentials provided by client" - } -} -``` - - - - -```json -{ - "detail": { - "cause": "No token found in Authorization header", - "response": "Missing or invalid credentials provided by client" - } -} -``` - | -| 403 | Permission denied | [ForbiddenResponse](#forbiddenresponse) - -Examples - - - - - -```json -{ - "detail": { - "cause": "User 6789 does not have permission to read conversation with ID 123e4567-e89b-12d3-a456-426614174000", - "response": "User does not have permission to perform this action" - } -} -``` - - - - -```json -{ - "detail": { - "cause": "User 6789 is not authorized to access this endpoint.", - "response": "User does not have permission to access this endpoint" - } -} -``` - - - - -```json -{ - "detail": { - "cause": "User lacks model_override permission required to override model/provider.", - "response": "This instance does not permit overriding model/provider in the query request (missing permission: MODEL_OVERRIDE). Please remove the model and provider fields from your request." - } -} -``` - | -| 404 | Resource not found | [NotFoundResponse](#notfoundresponse) - -Examples - - - - - -```json -{ - "detail": { - "cause": "Conversation with ID 123e4567-e89b-12d3-a456-426614174000 does not exist", - "response": "Conversation not found" - } -} -``` - - - - -```json -{ - "detail": { - "cause": "Provider with ID openai does not exist", - "response": "Provider not found" - } -} -``` - - - - -```json -{ - "detail": { - "cause": "Model with ID gpt-4-turbo is not configured", - "response": "Model not found" - } -} -``` - | -| 422 | Request validation failed | [UnprocessableEntityResponse](#unprocessableentityresponse) - -Examples - - - - - -```json -{ - "detail": { - "cause": "Invalid request format. 
The request body could not be parsed.", - "response": "Invalid request format" - } -} -``` - - - - -```json -{ - "detail": { - "cause": "Missing required attributes: ['query', 'model', 'provider']", - "response": "Missing required attributes" - } -} -``` - - - - -```json -{ - "detail": { - "cause": "Invalid attatchment type: must be one of ['text/plain', 'application/json', 'application/yaml', 'application/xml']", - "response": "Invalid attribute value" - } -} -``` - | -| 429 | Quota limit exceeded | [QuotaExceededResponse](#quotaexceededresponse) - -Examples - - - - - -```json -{ - "detail": { - "cause": "The token quota for model gpt-4-turbo has been exceeded.", - "response": "The model quota has been exceeded" - } -} -``` - - - - -```json -{ - "detail": { - "cause": "User 123 has no available tokens.", - "response": "The quota has been exceeded" - } -} -``` - - - - -```json -{ - "detail": { - "cause": "Cluster has no available tokens.", - "response": "The quota has been exceeded" - } -} -``` - - - - -```json -{ - "detail": { - "cause": "Unknown subject 999 has no available tokens.", - "response": "The quota has been exceeded" - } -} -``` - - - - -```json -{ - "detail": { - "cause": "User 123 has 5 tokens, but 10 tokens are needed.", - "response": "The quota has been exceeded" - } -} -``` - - - - -```json -{ - "detail": { - "cause": "Cluster has 500 tokens, but 900 tokens are needed.", - "response": "The quota has been exceeded" - } -} -``` - - - - -```json -{ - "detail": { - "cause": "Unknown subject 999 has 3 tokens, but 6 tokens are needed.", - "response": "The quota has been exceeded" - } -} -``` - | -| 500 | Internal server error | [InternalServerErrorResponse](#internalservererrorresponse) - -Examples - - - - - -```json -{ - "detail": { - "cause": "Lightspeed Stack configuration has not been initialized.", - "response": "Configuration is not loaded" - } -} -``` - | -| 503 | Service unavailable | [ServiceUnavailableResponse](#serviceunavailableresponse) - -Examples - - - - - -```json -{ - "detail": { - "cause": "Connection error while trying to reach backend service.", - "response": "Unable to connect to Llama Stack" - } -} -``` - | -## GET `/readiness` - -> **Readiness Probe Get Method** - -Handle the readiness probe endpoint, returning service readiness. - -If any provider reports an error status, responds with HTTP 503 -and details of unhealthy providers; otherwise, indicates the -service is ready. 
- - - - - -### ✅ Responses - -| Status Code | Description | Component | -|-------------|-------------|-----------| -| 200 | Successful response | [ReadinessResponse](#readinessresponse) | -| 401 | Unauthorized | [UnauthorizedResponse](#unauthorizedresponse) - -Examples - - - - - -```json -{ - "detail": { - "cause": "No Authorization header found", - "response": "Missing or invalid credentials provided by client" - } -} -``` - - - - -```json -{ - "detail": { - "cause": "No token found in Authorization header", - "response": "Missing or invalid credentials provided by client" - } -} -``` - | -| 403 | Permission denied | [ForbiddenResponse](#forbiddenresponse) - -Examples - - - - - -```json -{ - "detail": { - "cause": "User 6789 is not authorized to access this endpoint.", - "response": "User does not have permission to access this endpoint" - } -} -``` - | -| 503 | Service unavailable | [ServiceUnavailableResponse](#serviceunavailableresponse) - -Examples - - - - - -```json -{ - "detail": { - "cause": "Connection error while trying to reach backend service.", - "response": "Unable to connect to Llama Stack" - } -} -``` - | -## GET `/liveness` - -> **Liveness Probe Get Method** - -Return the liveness status of the service. - -Returns: - LivenessResponse: Indicates that the service is alive. - - - - - -### ✅ Responses - -| Status Code | Description | Component | -|-------------|-------------|-----------| -| 200 | Successful response | [LivenessResponse](#livenessresponse) | -| 401 | Unauthorized | [UnauthorizedResponse](#unauthorizedresponse) - -Examples - - - - - -```json -{ - "detail": { - "cause": "No Authorization header found", - "response": "Missing or invalid credentials provided by client" - } -} -``` - - - - -```json -{ - "detail": { - "cause": "No token found in Authorization header", - "response": "Missing or invalid credentials provided by client" - } -} -``` - | -| 403 | Permission denied | [ForbiddenResponse](#forbiddenresponse) - -Examples - - - - - -```json -{ - "detail": { - "cause": "User 6789 is not authorized to access this endpoint.", - "response": "User does not have permission to access this endpoint" - } -} -``` - | -## POST `/authorized` - -> **Authorized Endpoint Handler** - -Handle request to the /authorized endpoint. - -Process POST requests to the /authorized endpoint, returning -the authenticated user's ID and username. +Process POST requests to the /authorized endpoint, returning +the authenticated user's ID and username. Returns: AuthorizedResponse: Contains the user ID and username of the authenticated user. @@ -3868,6 +3405,17 @@ Examples +## APIKeyTokenConfiguration + + +API Key Token configuration. + + +| Field | Type | Description | +|-------|------|-------------| +| api_key | string | | + + ## AccessRule @@ -3931,6 +3479,7 @@ Authentication configuration. | k8s_cluster_api | | | | k8s_ca_cert_path | | | | jwk_config | | | +| api_key_config | | | | rh_identity_config | | | @@ -4546,11 +4095,11 @@ Useful resources: Model context protocol server configuration. -MCP (Model Context Protocol) servers provide tools and -capabilities to the AI agents. These are configured by this structure. -Only MCP servers defined in the lightspeed-stack.yaml configuration are -available to the agents. Tools configured in the llama-stack run.yaml -are not accessible to lightspeed-core agents. +MCP (Model Context Protocol) servers provide tools and capabilities to the +AI agents. These are configured by this structure. 
Only MCP servers +defined in the lightspeed-stack.yaml configuration are available to the +agents. Tools configured in the llama-stack run.yaml are not accessible to +lightspeed-core agents. Useful resources: @@ -4594,9 +4143,9 @@ Model representing a response to models request. PostgreSQL database configuration. -PostgreSQL database is used by Lightspeed Core Stack service for storing information about -conversation IDs. It can also be leveraged to store conversation history and information -about quota usage. +PostgreSQL database is used by Lightspeed Core Stack service for storing +information about conversation IDs. It can also be leveraged to store +conversation history and information about quota usage. Useful resources: @@ -4718,13 +4267,13 @@ Attributes: |-------|------|-------------| | conversation_id | | The optional conversation ID (UUID) | | response | string | Response from LLM | -| rag_chunks | array | List of RAG chunks used to generate the response | -| tool_calls | | List of tool calls made during response generation | | referenced_documents | array | List of documents referenced in generating the response | | truncated | boolean | Whether conversation history was truncated | | input_tokens | integer | Number of tokens sent to LLM | | output_tokens | integer | Number of tokens received from LLM | | available_quotas | object | Quota available as measured by all configured quota limiters | +| tool_calls | | List of tool calls made during response generation | +| tool_results | | List of tool results | ## QuotaExceededResponse @@ -4803,19 +4352,6 @@ Quota scheduler configuration. | period | integer | Quota scheduler period specified in seconds | -## RAGChunk - - -Model representing a RAG chunk used in the response. - - -| Field | Type | Description | -|-------|------|-------------| -| content | string | The content of the chunk | -| source | | Source document or URL | -| score | | Relevance score | - - ## RAGInfoResponse @@ -4906,10 +4442,10 @@ SQLite database configuration. Service configuration. -Lightspeed Core Stack is a REST API service that accepts requests -on a specified hostname and port. It is also possible to enable -authentication and specify the number of Uvicorn workers. When more -workers are specified, the service can handle requests concurrently. +Lightspeed Core Stack is a REST API service that accepts requests on a +specified hostname and port. It is also possible to enable authentication +and specify the number of Uvicorn workers. When more workers are specified, +the service can handle requests concurrently. | Field | Type | Description | @@ -4988,17 +4524,33 @@ Useful resources: | tls_key_password | | Path to file containing the password to decrypt the SSL/TLS private key. | -## ToolCall +## ToolCallSummary + + +Model representing a tool call made during response generation (for tool_calls list). + + +| Field | Type | Description | +|-------|------|-------------| +| id | string | ID of the tool call | +| name | string | Name of the tool called | +| args | object | Arguments passed to the tool | +| type | string | Type indicator for tool call | + + +## ToolResultSummary -Model representing a tool call made during response generation. +Model representing a result from a tool call (for tool_results list). 
| Field | Type | Description | |-------|------|-------------| -| tool_name | string | Name of the tool called | -| arguments | object | Arguments passed to the tool | -| result | | Result from the tool | +| id | string | ID of the tool call/result, matches the corresponding tool call 'id' | +| status | string | Status of the tool execution (e.g., 'success') | +| content | | Content/result returned from the tool | +| type | string | Type indicator for tool result | +| round | integer | Round number or step of tool execution | ## ToolsResponse diff --git a/docs/output.md b/docs/output.md index 5b87f2ef3..1dcc57cab 100644 --- a/docs/output.md +++ b/docs/output.md @@ -1020,12 +1020,12 @@ Examples | 422 | Validation Error | [HTTPValidationError](#httpvalidationerror) | ## POST `/v1/query` -> **Query Endpoint Handler** +> **Query Endpoint Handler V1** -Handle request to the /query endpoint using Agent API. +Handle request to the /query endpoint using Responses API. This is a wrapper around query_endpoint_handler_base that provides -the Agent API specific retrieve_response and get_topic_summary functions. +the Responses API specific retrieve_response and get_topic_summary functions. Returns: QueryResponse: Contains the conversation ID and the LLM-generated response. @@ -1320,20 +1320,27 @@ Examples | ## POST `/v1/streaming_query` -> **Streaming Query Endpoint Handler** +> **Streaming Query Endpoint Handler V1** -Handle request to the /streaming_query endpoint using Agent API. +Handle request to the /streaming_query endpoint using Responses API. -This is a wrapper around streaming_query_endpoint_handler_base that provides -the Agent API specific retrieve_response and response generator functions. +Returns a streaming response using Server-Sent Events (SSE) format with +content type text/event-stream. Returns: StreamingResponse: An HTTP streaming response yielding - SSE-formatted events for the query lifecycle. + SSE-formatted events for the query lifecycle with content type + text/event-stream. Raises: - HTTPException: Returns HTTP 500 if unable to connect to the - Llama Stack server. + HTTPException: + - 401: Unauthorized - Missing or invalid credentials + - 403: Forbidden - Insufficient permissions or model override not allowed + - 404: Not Found - Conversation, model, or provider not found + - 422: Unprocessable Entity - Request validation failed + - 429: Too Many Requests - Quota limit exceeded + - 500: Internal Server Error - Configuration not loaded or other server errors + - 503: Service Unavailable - Unable to connect to Llama Stack backend @@ -1347,7 +1354,7 @@ Raises: | Status Code | Description | Component | |-------------|-------------|-----------| -| 200 | Streaming response (Server-Sent Events) | ...string | +| 200 | Successful response | string | | 401 | Unauthorized | [UnauthorizedResponse](#unauthorizedresponse) Examples @@ -1960,7 +1967,7 @@ Examples | 422 | Validation Error | [HTTPValidationError](#httpvalidationerror) | ## GET `/v1/conversations` -> **Get Conversations List Endpoint Handler** +> **Conversations List Endpoint Handler V1** Handle request to retrieve all conversations for the authenticated user. @@ -2067,23 +2074,25 @@ Examples | ## GET `/v1/conversations/{conversation_id}` -> **Get Conversation Endpoint Handler** +> **Conversation Get Endpoint Handler V1** -Handle request to retrieve a conversation by ID. +Handle request to retrieve a conversation by ID using Conversations API. -Retrieve a conversation's chat history by its ID. 
Then fetches -the conversation session from the Llama Stack backend, -simplifies the session data to essential chat history, and -returns it in a structured response. Raises HTTP 400 for -invalid IDs, 404 if not found, 503 if the backend is -unavailable, and 500 for unexpected errors. +Retrieve a conversation's chat history by its ID using the LlamaStack +Conversations API. This endpoint fetches the conversation items from +the backend, simplifies them to essential chat history, and returns +them in a structured response. Raises HTTP 400 for invalid IDs, 404 +if not found, 503 if the backend is unavailable, and 500 for +unexpected errors. -Parameters: - conversation_id (str): Unique identifier of the conversation to retrieve. +Args: + request: The FastAPI request object + conversation_id: Unique identifier of the conversation to retrieve + auth: Authentication tuple from dependency Returns: ConversationResponse: Structured response containing the conversation - ID and simplified chat history. + ID and simplified chat history @@ -2240,17 +2249,22 @@ Examples | 422 | Validation Error | [HTTPValidationError](#httpvalidationerror) | ## DELETE `/v1/conversations/{conversation_id}` -> **Delete Conversation Endpoint Handler** +> **Conversation Delete Endpoint Handler V1** -Handle request to delete a conversation by ID. +Handle request to delete a conversation by ID using Conversations API. Validates the conversation ID format and attempts to delete the -corresponding session from the Llama Stack backend. Raises HTTP -errors for invalid IDs, not found conversations, connection +conversation from the Llama Stack backend using the Conversations API. +Raises HTTP errors for invalid IDs, not found conversations, connection issues, or unexpected failures. +Args: + request: The FastAPI request object + conversation_id: Unique identifier of the conversation to delete + auth: Authentication tuple from dependency + Returns: - ConversationDeleteResponse: Response indicating the result of the deletion operation. + ConversationDeleteResponse: Response indicating the result of the deletion operation @@ -2358,6 +2372,152 @@ Examples +```json +{ + "detail": { + "cause": "User 6789 is not authorized to access this endpoint.", + "response": "User does not have permission to access this endpoint" + } +} +``` + | +| 500 | Internal server error | [InternalServerErrorResponse](#internalservererrorresponse) + +Examples + + + + + +```json +{ + "detail": { + "cause": "Lightspeed Stack configuration has not been initialized.", + "response": "Configuration is not loaded" + } +} +``` + + + + +```json +{ + "detail": { + "cause": "Failed to query the database", + "response": "Database query failed" + } +} +``` + | +| 503 | Service unavailable | [ServiceUnavailableResponse](#serviceunavailableresponse) + +Examples + + + + + +```json +{ + "detail": { + "cause": "Connection error while trying to reach backend service.", + "response": "Unable to connect to Llama Stack" + } +} +``` + | +| 422 | Validation Error | [HTTPValidationError](#httpvalidationerror) | +## PUT `/v1/conversations/{conversation_id}` + +> **Conversation Update Endpoint Handler V1** + +Handle request to update a conversation metadata using Conversations API. + +Updates the conversation metadata (including topic summary) in both the +LlamaStack backend using the Conversations API and the local database. 
+ +Args: + request: The FastAPI request object + conversation_id: Unique identifier of the conversation to update + update_request: Request containing the topic summary to update + auth: Authentication tuple from dependency + +Returns: + ConversationUpdateResponse: Response indicating the result of the update operation + + + +### 🔗 Parameters + +| Name | Type | Required | Description | +|------|------|----------|-------------| +| conversation_id | string | True | | + + +### 📦 Request Body + +[ConversationUpdateRequest](#conversationupdaterequest) + +### ✅ Responses + +| Status Code | Description | Component | +|-------------|-------------|-----------| +| 200 | Successful response | [ConversationUpdateResponse](#conversationupdateresponse) | +| 400 | Invalid request format | [BadRequestResponse](#badrequestresponse) + +Examples + + + + + +```json +{ + "detail": { + "cause": "The conversation ID 123e4567-e89b-12d3-a456-426614174000 has invalid format.", + "response": "Invalid conversation ID format" + } +} +``` + | +| 401 | Unauthorized | [UnauthorizedResponse](#unauthorizedresponse) + +Examples + + + + + +```json +{ + "detail": { + "cause": "No Authorization header found", + "response": "Missing or invalid credentials provided by client" + } +} +``` + + + + +```json +{ + "detail": { + "cause": "No token found in Authorization header", + "response": "Missing or invalid credentials provided by client" + } +} +``` + | +| 403 | Permission denied | [ForbiddenResponse](#forbiddenresponse) + +Examples + + + + + ```json { "detail": { @@ -2758,23 +2918,6 @@ Examples "response": "User does not have permission to access this endpoint" } } -``` - | -| 404 | Resource not found | [NotFoundResponse](#notfoundresponse) - -Examples - - - - - -```json -{ - "detail": { - "cause": "Conversation with ID 123e4567-e89b-12d3-a456-426614174000 does not exist", - "response": "Conversation not found" - } -} ``` | | 500 | Internal server error | [InternalServerErrorResponse](#internalservererrorresponse) @@ -2941,31 +3084,25 @@ Examples ``` | | 422 | Validation Error | [HTTPValidationError](#httpvalidationerror) | -## POST `/v2/query` - -> **Query Endpoint Handler V2** - -Handle request to the /query endpoint using Responses API. - -This is a wrapper around query_endpoint_handler_base that provides -the Responses API specific retrieve_response and get_topic_summary functions. +## GET `/readiness` -Returns: - QueryResponse: Contains the conversation ID and the LLM-generated response. +> **Readiness Probe Get Method** +Handle the readiness probe endpoint, returning service readiness. +If any provider reports an error status, responds with HTTP 503 +and details of unhealthy providers; otherwise, indicates the +service is ready. 
-### 📦 Request Body -[QueryRequest](#queryrequest) ### ✅ Responses | Status Code | Description | Component | |-------------|-------------|-----------| -| 200 | Successful response | [QueryResponse](#queryresponse) | +| 200 | Successful response | [ReadinessResponse](#readinessresponse) | | 401 | Unauthorized | [UnauthorizedResponse](#unauthorizedresponse) Examples @@ -3006,11 +3143,16 @@ Examples ```json { "detail": { - "cause": "User 6789 does not have permission to read conversation with ID 123e4567-e89b-12d3-a456-426614174000", - "response": "User does not have permission to perform this action" + "cause": "User 6789 is not authorized to access this endpoint.", + "response": "User does not have permission to access this endpoint" } } ``` + | +| 503 | Service unavailable | [ServiceUnavailableResponse](#serviceunavailableresponse) + +Examples + @@ -3018,40 +3160,34 @@ Examples ```json { "detail": { - "cause": "User 6789 is not authorized to access this endpoint.", - "response": "User does not have permission to access this endpoint" + "cause": "Connection error while trying to reach backend service.", + "response": "Unable to connect to Llama Stack" } } ``` + | +## GET `/liveness` +> **Liveness Probe Get Method** +Return the liveness status of the service. +Returns: + LivenessResponse: Indicates that the service is alive. -```json -{ - "detail": { - "cause": "User lacks model_override permission required to override model/provider.", - "response": "This instance does not permit overriding model/provider in the query request (missing permission: MODEL_OVERRIDE). Please remove the model and provider fields from your request." - } -} -``` - | -| 404 | Resource not found | [NotFoundResponse](#notfoundresponse) -Examples +### ✅ Responses +| Status Code | Description | Component | +|-------------|-------------|-----------| +| 200 | Successful response | [LivenessResponse](#livenessresponse) | +| 401 | Unauthorized | [UnauthorizedResponse](#unauthorizedresponse) + +Examples -```json -{ - "detail": { - "cause": "Conversation with ID 123e4567-e89b-12d3-a456-426614174000 does not exist", - "response": "Conversation not found" - } -} -``` @@ -3059,8 +3195,8 @@ Examples ```json { "detail": { - "cause": "Provider with ID openai does not exist", - "response": "Provider not found" + "cause": "No Authorization header found", + "response": "Missing or invalid credentials provided by client" } } ``` @@ -3071,13 +3207,13 @@ Examples ```json { "detail": { - "cause": "Model with ID gpt-4-turbo is not configured", - "response": "Model not found" + "cause": "No token found in Authorization header", + "response": "Missing or invalid credentials provided by client" } } ``` | -| 422 | Request validation failed | [UnprocessableEntityResponse](#unprocessableentityresponse) +| 403 | Permission denied | [ForbiddenResponse](#forbiddenresponse) Examples @@ -3088,619 +3224,20 @@ Examples ```json { "detail": { - "cause": "Invalid request format. The request body could not be parsed.", - "response": "Invalid request format" + "cause": "User 6789 is not authorized to access this endpoint.", + "response": "User does not have permission to access this endpoint" } } ``` + | +## POST `/authorized` +> **Authorized Endpoint Handler** +Handle request to the /authorized endpoint. 
- -```json -{ - "detail": { - "cause": "Missing required attributes: ['query', 'model', 'provider']", - "response": "Missing required attributes" - } -} -``` - - - - -```json -{ - "detail": { - "cause": "Invalid attatchment type: must be one of ['text/plain', 'application/json', 'application/yaml', 'application/xml']", - "response": "Invalid attribute value" - } -} -``` - | -| 429 | Quota limit exceeded | [QuotaExceededResponse](#quotaexceededresponse) - -Examples - - - - - -```json -{ - "detail": { - "cause": "The token quota for model gpt-4-turbo has been exceeded.", - "response": "The model quota has been exceeded" - } -} -``` - - - - -```json -{ - "detail": { - "cause": "User 123 has no available tokens.", - "response": "The quota has been exceeded" - } -} -``` - - - - -```json -{ - "detail": { - "cause": "Cluster has no available tokens.", - "response": "The quota has been exceeded" - } -} -``` - - - - -```json -{ - "detail": { - "cause": "Unknown subject 999 has no available tokens.", - "response": "The quota has been exceeded" - } -} -``` - - - - -```json -{ - "detail": { - "cause": "User 123 has 5 tokens, but 10 tokens are needed.", - "response": "The quota has been exceeded" - } -} -``` - - - - -```json -{ - "detail": { - "cause": "Cluster has 500 tokens, but 900 tokens are needed.", - "response": "The quota has been exceeded" - } -} -``` - - - - -```json -{ - "detail": { - "cause": "Unknown subject 999 has 3 tokens, but 6 tokens are needed.", - "response": "The quota has been exceeded" - } -} -``` - | -| 500 | Internal server error | [InternalServerErrorResponse](#internalservererrorresponse) - -Examples - - - - - -```json -{ - "detail": { - "cause": "Lightspeed Stack configuration has not been initialized.", - "response": "Configuration is not loaded" - } -} -``` - | -| 503 | Service unavailable | [ServiceUnavailableResponse](#serviceunavailableresponse) - -Examples - - - - - -```json -{ - "detail": { - "cause": "Connection error while trying to reach backend service.", - "response": "Unable to connect to Llama Stack" - } -} -``` - | -## POST `/v2/streaming_query` - -> **Streaming Query Endpoint Handler V2** - -Handle request to the /streaming_query endpoint using Responses API. - -This is a wrapper around streaming_query_endpoint_handler_base that provides -the Responses API specific retrieve_response and response generator functions. - -Returns: - StreamingResponse: An HTTP streaming response yielding - SSE-formatted events for the query lifecycle. - -Raises: - HTTPException: Returns HTTP 500 if unable to connect to the - Llama Stack server. 
- - - - - -### 📦 Request Body - -[QueryRequest](#queryrequest) - -### ✅ Responses - -| Status Code | Description | Component | -|-------------|-------------|-----------| -| 200 | Streaming response with Server-Sent Events | string -string | -| 401 | Unauthorized | [UnauthorizedResponse](#unauthorizedresponse) - -Examples - - - - - -```json -{ - "detail": { - "cause": "No Authorization header found", - "response": "Missing or invalid credentials provided by client" - } -} -``` - - - - -```json -{ - "detail": { - "cause": "No token found in Authorization header", - "response": "Missing or invalid credentials provided by client" - } -} -``` - | -| 403 | Permission denied | [ForbiddenResponse](#forbiddenresponse) - -Examples - - - - - -```json -{ - "detail": { - "cause": "User 6789 does not have permission to read conversation with ID 123e4567-e89b-12d3-a456-426614174000", - "response": "User does not have permission to perform this action" - } -} -``` - - - - -```json -{ - "detail": { - "cause": "User 6789 is not authorized to access this endpoint.", - "response": "User does not have permission to access this endpoint" - } -} -``` - - - - -```json -{ - "detail": { - "cause": "User lacks model_override permission required to override model/provider.", - "response": "This instance does not permit overriding model/provider in the query request (missing permission: MODEL_OVERRIDE). Please remove the model and provider fields from your request." - } -} -``` - | -| 404 | Resource not found | [NotFoundResponse](#notfoundresponse) - -Examples - - - - - -```json -{ - "detail": { - "cause": "Conversation with ID 123e4567-e89b-12d3-a456-426614174000 does not exist", - "response": "Conversation not found" - } -} -``` - - - - -```json -{ - "detail": { - "cause": "Provider with ID openai does not exist", - "response": "Provider not found" - } -} -``` - - - - -```json -{ - "detail": { - "cause": "Model with ID gpt-4-turbo is not configured", - "response": "Model not found" - } -} -``` - | -| 422 | Request validation failed | [UnprocessableEntityResponse](#unprocessableentityresponse) - -Examples - - - - - -```json -{ - "detail": { - "cause": "Invalid request format. 
The request body could not be parsed.", - "response": "Invalid request format" - } -} -``` - - - - -```json -{ - "detail": { - "cause": "Missing required attributes: ['query', 'model', 'provider']", - "response": "Missing required attributes" - } -} -``` - - - - -```json -{ - "detail": { - "cause": "Invalid attatchment type: must be one of ['text/plain', 'application/json', 'application/yaml', 'application/xml']", - "response": "Invalid attribute value" - } -} -``` - | -| 429 | Quota limit exceeded | [QuotaExceededResponse](#quotaexceededresponse) - -Examples - - - - - -```json -{ - "detail": { - "cause": "The token quota for model gpt-4-turbo has been exceeded.", - "response": "The model quota has been exceeded" - } -} -``` - - - - -```json -{ - "detail": { - "cause": "User 123 has no available tokens.", - "response": "The quota has been exceeded" - } -} -``` - - - - -```json -{ - "detail": { - "cause": "Cluster has no available tokens.", - "response": "The quota has been exceeded" - } -} -``` - - - - -```json -{ - "detail": { - "cause": "Unknown subject 999 has no available tokens.", - "response": "The quota has been exceeded" - } -} -``` - - - - -```json -{ - "detail": { - "cause": "User 123 has 5 tokens, but 10 tokens are needed.", - "response": "The quota has been exceeded" - } -} -``` - - - - -```json -{ - "detail": { - "cause": "Cluster has 500 tokens, but 900 tokens are needed.", - "response": "The quota has been exceeded" - } -} -``` - - - - -```json -{ - "detail": { - "cause": "Unknown subject 999 has 3 tokens, but 6 tokens are needed.", - "response": "The quota has been exceeded" - } -} -``` - | -| 500 | Internal server error | [InternalServerErrorResponse](#internalservererrorresponse) - -Examples - - - - - -```json -{ - "detail": { - "cause": "Lightspeed Stack configuration has not been initialized.", - "response": "Configuration is not loaded" - } -} -``` - | -| 503 | Service unavailable | [ServiceUnavailableResponse](#serviceunavailableresponse) - -Examples - - - - - -```json -{ - "detail": { - "cause": "Connection error while trying to reach backend service.", - "response": "Unable to connect to Llama Stack" - } -} -``` - | -## GET `/readiness` - -> **Readiness Probe Get Method** - -Handle the readiness probe endpoint, returning service readiness. - -If any provider reports an error status, responds with HTTP 503 -and details of unhealthy providers; otherwise, indicates the -service is ready. 
- - - - - -### ✅ Responses - -| Status Code | Description | Component | -|-------------|-------------|-----------| -| 200 | Successful response | [ReadinessResponse](#readinessresponse) | -| 401 | Unauthorized | [UnauthorizedResponse](#unauthorizedresponse) - -Examples - - - - - -```json -{ - "detail": { - "cause": "No Authorization header found", - "response": "Missing or invalid credentials provided by client" - } -} -``` - - - - -```json -{ - "detail": { - "cause": "No token found in Authorization header", - "response": "Missing or invalid credentials provided by client" - } -} -``` - | -| 403 | Permission denied | [ForbiddenResponse](#forbiddenresponse) - -Examples - - - - - -```json -{ - "detail": { - "cause": "User 6789 is not authorized to access this endpoint.", - "response": "User does not have permission to access this endpoint" - } -} -``` - | -| 503 | Service unavailable | [ServiceUnavailableResponse](#serviceunavailableresponse) - -Examples - - - - - -```json -{ - "detail": { - "cause": "Connection error while trying to reach backend service.", - "response": "Unable to connect to Llama Stack" - } -} -``` - | -## GET `/liveness` - -> **Liveness Probe Get Method** - -Return the liveness status of the service. - -Returns: - LivenessResponse: Indicates that the service is alive. - - - - - -### ✅ Responses - -| Status Code | Description | Component | -|-------------|-------------|-----------| -| 200 | Successful response | [LivenessResponse](#livenessresponse) | -| 401 | Unauthorized | [UnauthorizedResponse](#unauthorizedresponse) - -Examples - - - - - -```json -{ - "detail": { - "cause": "No Authorization header found", - "response": "Missing or invalid credentials provided by client" - } -} -``` - - - - -```json -{ - "detail": { - "cause": "No token found in Authorization header", - "response": "Missing or invalid credentials provided by client" - } -} -``` - | -| 403 | Permission denied | [ForbiddenResponse](#forbiddenresponse) - -Examples - - - - - -```json -{ - "detail": { - "cause": "User 6789 is not authorized to access this endpoint.", - "response": "User does not have permission to access this endpoint" - } -} -``` - | -## POST `/authorized` - -> **Authorized Endpoint Handler** - -Handle request to the /authorized endpoint. - -Process POST requests to the /authorized endpoint, returning -the authenticated user's ID and username. +Process POST requests to the /authorized endpoint, returning +the authenticated user's ID and username. Returns: AuthorizedResponse: Contains the user ID and username of the authenticated user. @@ -4558,11 +4095,11 @@ Useful resources: Model context protocol server configuration. -MCP (Model Context Protocol) servers provide tools and -capabilities to the AI agents. These are configured by this structure. -Only MCP servers defined in the lightspeed-stack.yaml configuration are -available to the agents. Tools configured in the llama-stack run.yaml -are not accessible to lightspeed-core agents. +MCP (Model Context Protocol) servers provide tools and capabilities to the +AI agents. These are configured by this structure. Only MCP servers +defined in the lightspeed-stack.yaml configuration are available to the +agents. Tools configured in the llama-stack run.yaml are not accessible to +lightspeed-core agents. Useful resources: @@ -4606,9 +4143,9 @@ Model representing a response to models request. PostgreSQL database configuration. -PostgreSQL database is used by Lightspeed Core Stack service for storing information about -conversation IDs. 
It can also be leveraged to store conversation history and information -about quota usage. +PostgreSQL database is used by Lightspeed Core Stack service for storing +information about conversation IDs. It can also be leveraged to store +conversation history and information about quota usage. Useful resources: @@ -4730,13 +4267,13 @@ Attributes: |-------|------|-------------| | conversation_id | | The optional conversation ID (UUID) | | response | string | Response from LLM | -| rag_chunks | array | List of RAG chunks used to generate the response | -| tool_calls | | List of tool calls made during response generation | | referenced_documents | array | List of documents referenced in generating the response | | truncated | boolean | Whether conversation history was truncated | | input_tokens | integer | Number of tokens sent to LLM | | output_tokens | integer | Number of tokens received from LLM | | available_quotas | object | Quota available as measured by all configured quota limiters | +| tool_calls | | List of tool calls made during response generation | +| tool_results | | List of tool results | ## QuotaExceededResponse @@ -4815,19 +4352,6 @@ Quota scheduler configuration. | period | integer | Quota scheduler period specified in seconds | -## RAGChunk - - -Model representing a RAG chunk used in the response. - - -| Field | Type | Description | -|-------|------|-------------| -| content | string | The content of the chunk | -| source | | Source document or URL | -| score | | Relevance score | - - ## RAGInfoResponse @@ -4918,10 +4442,10 @@ SQLite database configuration. Service configuration. -Lightspeed Core Stack is a REST API service that accepts requests -on a specified hostname and port. It is also possible to enable -authentication and specify the number of Uvicorn workers. When more -workers are specified, the service can handle requests concurrently. +Lightspeed Core Stack is a REST API service that accepts requests on a +specified hostname and port. It is also possible to enable authentication +and specify the number of Uvicorn workers. When more workers are specified, +the service can handle requests concurrently. | Field | Type | Description | @@ -5000,17 +4524,33 @@ Useful resources: | tls_key_password | | Path to file containing the password to decrypt the SSL/TLS private key. | -## ToolCall +## ToolCallSummary + + +Model representing a tool call made during response generation (for tool_calls list). + + +| Field | Type | Description | +|-------|------|-------------| +| id | string | ID of the tool call | +| name | string | Name of the tool called | +| args | object | Arguments passed to the tool | +| type | string | Type indicator for tool call | + + +## ToolResultSummary -Model representing a tool call made during response generation. +Model representing a result from a tool call (for tool_results list). | Field | Type | Description | |-------|------|-------------| -| tool_name | string | Name of the tool called | -| arguments | object | Arguments passed to the tool | -| result | | Result from the tool | +| id | string | ID of the tool call/result, matches the corresponding tool call 'id' | +| status | string | Status of the tool execution (e.g., 'success') | +| content | | Content/result returned from the tool | +| type | string | Type indicator for tool result | +| round | integer | Round number or step of tool execution | ## ToolsResponse