@onmete onmete commented Dec 9, 2025

Description

Fix api docs

  • Fix make target for openapi-doc
  • Remove output.md

Type of change

  • Refactor
  • New feature
  • Bug fix
  • CVE fix
  • Optimization
  • Documentation Update
  • Configuration Update
  • Bump-up service version
  • Bump-up dependent library
  • Bump-up library or tool used for development (does not change the final image)
  • CI configuration change
  • Konflux configuration change
  • Unit tests improvement
  • Integration tests improvement
  • End to end tests improvement

Tools used to create PR

Identify any AI code assistants used in this PR (for transparency and review context)

  • Assisted-by: (e.g., Claude, CodeRabbit, Ollama, etc., N/A if not used)
  • Generated by: (e.g., tool name and version; N/A if not used)

Related Tickets & Documents

  • Related Issue #
  • Closes #

Checklist before requesting a review

  • I have performed a self-review of my code.
  • PR has passed all pre-merge test jobs.
  • If it is a core feature, I have added thorough tests.

Testing

  • Please provide detailed steps to perform tests related to this code change.
  • How were the fix/results from this change verified? Please provide relevant screenshots or results.

Summary by CodeRabbit

  • New Features

    • Added v2 query and streaming query endpoints with enhanced response handling and detailed error schemas.
  • Documentation

    • Updated API documentation to standardize v1 endpoints across guides.
    • Enhanced error documentation with detailed error payloads and comprehensive HTTP status code mappings.
    • Restructured API reference for improved clarity and better coverage of error cases.


coderabbitai bot commented Dec 9, 2025

Walkthrough

This PR updates OpenAPI documentation generation configuration by switching to uv run wrappers, downgrades API version references from v2/v3 to v1 in conversations documentation, adds new v2 query endpoints to the OpenAPI schema, and refactors the markdown documentation to emphasize the Responses API.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Build Configuration: Makefile | Replaces direct command invocations with uv run wrappers for OpenAPI docs generation; changes output path from docs/output.md to docs/openapi.md; standardizes flow to generate docs via uv run followed by cleanup. |
| API Documentation: docs/conversations_api.md | Downgrades all API version references from v2/v3 to v1 across endpoint paths, section headers, curl examples, and troubleshooting sections; normalizes request/response examples to reflect v1 endpoints. |
| OpenAPI Schema & Markdown Docs: docs/openapi.json, docs/openapi.md | Adds new /v2/query and /v2/streaming_query POST endpoints with mirrored v1 schemas, including request/response definitions and standard error responses (401, 403, 404, 422, 429, 500, 503); refactors markdown documentation to emphasize Responses API, updates endpoint headings to indicate versioned variants, expands error documentation with detailed payload examples, and adds new model sections. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

  • Specific areas of attention:
    • docs/openapi.md contains extensive restructuring of endpoint descriptions, error mappings, and model sections—verify completeness and accuracy of schema changes
    • Consistency between /v2 endpoints in docs/openapi.json and their markdown documentation in docs/openapi.md
    • Version downgrade in docs/conversations_api.md (v2/v3 → v1)—confirm intentionality and impact on API surface

Possibly related PRs

Suggested reviewers

  • radofuchs
  • tisnik

Pre-merge checks and finishing touches

❌ Failed checks (1 inconclusive)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Title check | ❓ Inconclusive | The title 'Fix api docs' is vague and generic, using non-descriptive language that doesn't convey the specific nature of the changes beyond a general fix to API documentation. | Consider a more descriptive title that specifies the main changes, such as 'Update API documentation to v1 endpoints and fix make target for openapi-doc' or similar. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

🧹 Nitpick comments (1)
Makefile (1)

44-47: Consider using python instead of python3 for consistency.

Line 46 uses uv run python3 scripts/fix_openapi_doc.py, but other Python targets in this Makefile use generic python invocation (e.g., line 18 uses python -m pytest, line 32 uses mypy as a standalone command). Explicitly specifying python3 may be intentional to ensure Python 3 is used, but verify this is necessary or align with the pattern used elsewhere.

The target logic is sound: generate to a temporary file, fix it, save to docs/openapi.md, and clean up. The uv run wrappers are consistent with other targets in the file.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2dcd9e9 and 67d09dd.

📒 Files selected for processing (4)
  • Makefile (1 hunks)
  • docs/conversations_api.md (11 hunks)
  • docs/openapi.json (1 hunks)
  • docs/openapi.md (38 hunks)
🧰 Additional context used
🪛 markdownlint-cli2 (0.18.1)
docs/conversations_api.md

24-24: Unordered list indentation
Expected: 2; Actual: 3

(MD007, ul-indent)


25-25: Unordered list indentation
Expected: 2; Actual: 3

(MD007, ul-indent)


26-26: Unordered list indentation
Expected: 2; Actual: 3

(MD007, ul-indent)


27-27: Unordered list indentation
Expected: 2; Actual: 3

(MD007, ul-indent)

docs/openapi.md

2619-2619: Table pipe style
Expected: leading_and_trailing; Actual: leading_only; Missing trailing pipe

(MD055, table-pipe-style)


3221-3221: Table pipe style
Expected: leading_and_trailing; Actual: leading_only; Missing trailing pipe

(MD055, table-pipe-style)


3316-3316: Table pipe style
Expected: leading_and_trailing; Actual: leading_only; Missing trailing pipe

(MD055, table-pipe-style)


3445-3445: Table pipe style
Expected: leading_and_trailing; Actual: leading_only; Missing trailing pipe

(MD055, table-pipe-style)


3589-3589: Table pipe style
Expected: leading_and_trailing; Actual: leading_only; Missing trailing pipe

(MD055, table-pipe-style)
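
As a rough illustration of what the MD055 findings above mean, the sketch below flags table rows that open with a pipe but do not close with one. The function name rows_missing_trailing_pipe and the sample rows are hypothetical; this is not how markdownlint itself is implemented, and it ignores edge cases such as escaped pipes or tables inside code blocks.

```python
def rows_missing_trailing_pipe(lines):
    """Return 1-based numbers of lines that start with '|' but do not end with '|'."""
    findings = []
    for number, line in enumerate(lines, start=1):
        stripped = line.strip()
        # Only consider lines that look like table rows (leading pipe).
        if stripped.startswith("|") and not stripped.endswith("|"):
            findings.append(number)
    return findings


# Hypothetical sample: the third row lacks its trailing pipe.
sample_rows = [
    "| Field | Type |",
    "| --- | --- |",
    "| detail | object",
]
print(rows_missing_trailing_pipe(sample_rows))
```

Running a check like this over docs/openapi.md would be one way to confirm the five reported rows before editing them by hand.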

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
  • GitHub Check: radon
  • GitHub Check: E2E: server mode / ci
  • GitHub Check: build-pr
  • GitHub Check: E2E: library mode / ci
  • GitHub Check: E2E: library mode / azure
  • GitHub Check: E2E: server mode / azure
  • GitHub Check: bandit
  • GitHub Check: mypy
  • GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request
🔇 Additional comments (6)
docs/conversations_api.md (2)

194-269: API versioning updates are consistent and clear.

The endpoint documentation now explicitly designates v1 for all four API operations (Query, Streaming Query, List, and Detail), with matching updates to curl examples and internal cross-references. All endpoint paths correctly use /v1/ prefix, which aligns with the PR objectives and the conversations_api.md scope.

Also applies to: 311-410
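
The consistency claim above could be spot-checked with a small script. This is a minimal sketch that assumes endpoint paths appear in the markdown as /vN/... strings; the function name find_non_v1_paths, the host, and the sample text are hypothetical.

```python
import re


def find_non_v1_paths(markdown_text):
    """Return (line_number, path) pairs for API paths whose version prefix is not v1."""
    findings = []
    for number, line in enumerate(markdown_text.splitlines(), start=1):
        # Match paths such as /v2/query or /v1/conversations/{conversation_id}.
        for match in re.finditer(r"/(v\d+)/[\w/{}\-]+", line):
            if match.group(1) != "v1":
                findings.append((number, match.group(0)))
    return findings


# Hypothetical excerpt: the second line still points at a v2 path.
sample_doc = (
    "curl -X POST http://localhost:8080/v1/query\n"
    "GET /v2/conversations/{conversation_id}\n"
)
print(find_non_v1_paths(sample_doc))
```

An empty result for docs/conversations_api.md would support the statement that every endpoint path uses the /v1/ prefix.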


495-495: Verify endpoint path in troubleshooting section.

Line 495 references /v1/conversations/{conversation_id} in the troubleshooting section. This is correct and consistent with the updated API versioning throughout the document.

docs/openapi.json (2)

2944-3253: v2 query endpoint labeled as v1 (tags/summary).

Fix tag and summary to reflect v2 so Swagger groups correctly:

-                "tags": [
-                    "query_v1"
-                ],
-                "summary": "Query Endpoint Handler V1",
+                "tags": [
+                    "query_v2"
+                ],
+                "summary": "Query Endpoint Handler V2",

The operationId query_endpoint_handler_v2_v2_query_post appears to contain a redundant "v2" that could be simplified to query_endpoint_handler_v2_query_post, though verify that no consumers depend on the current operationId format before renaming.
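
To catch this kind of mismatch mechanically, a check along the following lines could compare each path's version prefix with the version suffix of its tags. It is only a sketch: it assumes tags end in _vN, as query_v1 does here, and find_version_mismatches and the sample spec are hypothetical names, not part of the repository.

```python
import re


def find_version_mismatches(spec):
    """Return (path, method, tag) triples where the tag version differs from the path version."""
    mismatches = []
    for path, operations in spec.get("paths", {}).items():
        path_match = re.match(r"^/(v\d+)/", path)
        if not path_match:
            continue  # unversioned path, nothing to compare
        path_version = path_match.group(1)
        for method, operation in operations.items():
            if not isinstance(operation, dict):
                continue  # skip non-operation entries such as shared parameters
            for tag in operation.get("tags", []):
                tag_match = re.search(r"_(v\d+)$", tag)
                if tag_match and tag_match.group(1) != path_version:
                    mismatches.append((path, method, tag))
    return mismatches


# Reduced, hypothetical spec mirroring the tags flagged in this review.
sample_spec = {
    "paths": {
        "/v1/query": {"post": {"tags": ["query_v1"]}},
        "/v2/query": {"post": {"tags": ["query_v1"]}},
        "/v2/streaming_query": {"post": {"tags": ["streaming_query_v1"]}},
    }
}
for path, method, tag in find_version_mismatches(sample_spec):
    print(f"{method.upper()} {path}: tag {tag!r} does not match the path version")
```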


3131-3136: Typo in examples: "attatchment" → "attachment".

Correct spelling in 422 examples for both v2 endpoints at lines 3131-3136 and 3409-3415.

-                                                "cause": "Invalid attatchment type: must be one of ['text/plain', 'application/json', 'application/yaml', 'application/xml']",
+                                                "cause": "Invalid attachment type: must be one of ['text/plain', 'application/json', 'application/yaml', 'application/xml']",
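
Because the misspelling is repeated in several example payloads, a recursive search can list every location before fixing them. The sketch below is illustrative only: find_string_occurrences and the reduced sample are hypothetical, and the path notation is an ad hoc dotted form rather than a standard JSON Pointer.

```python
def find_string_occurrences(node, needle, path="$"):
    """Yield dotted locations of every string value that contains needle."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from find_string_occurrences(value, needle, f"{path}.{key}")
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from find_string_occurrences(value, needle, f"{path}[{index}]")
    elif isinstance(node, str) and needle in node:
        yield path


# Reduced, hypothetical fragment shaped like the 422 example in the spec.
sample_fragment = {
    "examples": {
        "invalid value": {
            "value": {
                "detail": {
                    "cause": "Invalid attatchment type: must be one of ['text/plain']",
                    "response": "Invalid attribute value",
                }
            }
        }
    }
}
hits = list(find_string_occurrences(sample_fragment, "attatchment"))
print(hits)
```

Applied to the parsed docs/openapi.json, the same function would show whether the typo also appears in the v1 endpoints.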
docs/openapi.md (2)

4020-4029: New APIKeyTokenConfiguration component properly documented.

The addition of the APIKeyTokenConfiguration component is appropriate for supporting API key authentication. Documentation is clear and consistent with other configuration models.


5139-5166: New tool call/result summary components well-integrated.

The addition of ToolCallSummary and ToolResultSummary components properly extends the QueryResponse to include detailed tool execution tracking. The field documentation is clear and the structure is consistent with the rest of the API schema.

Comment on lines +24 to +27
* [Query Endpoint (v1)](#query-endpoint-v1)
* [Streaming Query Endpoint (v1)](#streaming-query-endpoint-v1)
* [Conversations List Endpoint (v1)](#conversations-list-endpoint-v1)
* [Conversation Detail Endpoint (v1)](#conversation-detail-endpoint-v1)

⚠️ Potential issue | 🟠 Major

Fix unordered list indentation in table of contents.

The list items have 3-space indentation before the * marker, but markdown linting expects 2 spaces. Apply this diff to fix the indentation:

 * [API Endpoints](#api-endpoints)
-   * [Query Endpoint (v1)](#query-endpoint-v1)
-   * [Streaming Query Endpoint (v1)](#streaming-query-endpoint-v1)
-   * [Conversations List Endpoint (v1)](#conversations-list-endpoint-v1)
-   * [Conversation Detail Endpoint (v1)](#conversation-detail-endpoint-v1)
+  * [Query Endpoint (v1)](#query-endpoint-v1)
+  * [Streaming Query Endpoint (v1)](#streaming-query-endpoint-v1)
+  * [Conversations List Endpoint (v1)](#conversations-list-endpoint-v1)
+  * [Conversation Detail Endpoint (v1)](#conversation-detail-endpoint-v1)

Based on coding guidelines (markdownlint MD007), unordered lists should use consistent 2-space indentation.
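
If more than a handful of lines are affected, the re-indentation could be scripted. The following is a minimal sketch that assumes the file consistently uses 3 spaces per nesting level and should move to 2; fix_nested_bullet_indent is a hypothetical helper, and it does not try to skip code blocks.

```python
import re


def fix_nested_bullet_indent(lines, wrong=3, right=2):
    """Re-indent bullets nested at multiples of `wrong` spaces to multiples of `right`."""
    fixed = []
    for line in lines:
        match = re.match(r"^( +)([*+-] .*)$", line)
        if match and len(match.group(1)) % wrong == 0:
            depth = len(match.group(1)) // wrong
            line = " " * (depth * right) + match.group(2)
        fixed.append(line)
    return fixed


# Sample taken from the table of contents discussed above.
toc = [
    "* [API Endpoints](#api-endpoints)",
    "   * [Query Endpoint (v1)](#query-endpoint-v1)",
    "   * [Streaming Query Endpoint (v1)](#streaming-query-endpoint-v1)",
]
for fixed_line in fix_nested_bullet_indent(toc):
    print(fixed_line)
```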

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
   * [Query Endpoint (v1)](#query-endpoint-v1)
   * [Streaming Query Endpoint (v1)](#streaming-query-endpoint-v1)
   * [Conversations List Endpoint (v1)](#conversations-list-endpoint-v1)
   * [Conversation Detail Endpoint (v1)](#conversation-detail-endpoint-v1)
  * [Query Endpoint (v1)](#query-endpoint-v1)
  * [Streaming Query Endpoint (v1)](#streaming-query-endpoint-v1)
  * [Conversations List Endpoint (v1)](#conversations-list-endpoint-v1)
  * [Conversation Detail Endpoint (v1)](#conversation-detail-endpoint-v1)
🤖 Prompt for AI Agents
In docs/conversations_api.md around lines 24 to 27, the table-of-contents
unordered list items use 3-space indentation before the '*' markers which
violates markdownlint MD007; change each list item to use exactly 2 spaces of
indentation before the '*' so the bullets align consistently (replace the
leading 3 spaces with 2 spaces on each listed line).

Comment on lines +3254 to +3531
"/v2/streaming_query": {
"post": {
"tags": [
"streaming_query_v1"
],
"summary": "Streaming Query Endpoint Handler V1",
"description": "Handle request to the /streaming_query endpoint using Responses API.\n\nReturns a streaming response using Server-Sent Events (SSE) format with\ncontent type text/event-stream.\n\nReturns:\n StreamingResponse: An HTTP streaming response yielding\n SSE-formatted events for the query lifecycle with content type\n text/event-stream.\n\nRaises:\n HTTPException:\n - 401: Unauthorized - Missing or invalid credentials\n - 403: Forbidden - Insufficient permissions or model override not allowed\n - 404: Not Found - Conversation, model, or provider not found\n - 422: Unprocessable Entity - Request validation failed\n - 429: Too Many Requests - Quota limit exceeded\n - 500: Internal Server Error - Configuration not loaded or other server errors\n - 503: Service Unavailable - Unable to connect to Llama Stack backend",
"operationId": "streaming_query_endpoint_handler_v2_v2_streaming_query_post",
"requestBody": {
"content": {
"application/json": {
"schema": {
"$ref": "#/components/schemas/QueryRequest"
}
}
},
"required": true
},
"responses": {
"200": {
"description": "Successful response",
"content": {
"text/event-stream": {
"schema": {
"type": "string",
"format": "text/event-stream"
},
"example": "data: {\"event\": \"start\", \"data\": {\"conversation_id\": \"123e4567-e89b-12d3-a456-426614174000\"}}\n\ndata: {\"event\": \"token\", \"data\": {\"id\": 0, \"token\": \"No Violation\"}}\n\ndata: {\"event\": \"token\", \"data\": {\"id\": 1, \"token\": \"\"}}\n\ndata: {\"event\": \"token\", \"data\": {\"id\": 2, \"token\": \"Hello\"}}\n\ndata: {\"event\": \"token\", \"data\": {\"id\": 3, \"token\": \"!\"}}\n\ndata: {\"event\": \"token\", \"data\": {\"id\": 4, \"token\": \" How\"}}\n\ndata: {\"event\": \"token\", \"data\": {\"id\": 5, \"token\": \" can\"}}\n\ndata: {\"event\": \"token\", \"data\": {\"id\": 6, \"token\": \" I\"}}\n\ndata: {\"event\": \"token\", \"data\": {\"id\": 7, \"token\": \" assist\"}}\n\ndata: {\"event\": \"token\", \"data\": {\"id\": 8, \"token\": \" you\"}}\n\ndata: {\"event\": \"token\", \"data\": {\"id\": 9, \"token\": \" today\"}}\n\ndata: {\"event\": \"token\", \"data\": {\"id\": 10, \"token\": \"?\"}}\n\ndata: {\"event\": \"turn_complete\", \"data\": {\"token\": \"Hello! How can I assist you today?\"}}\n\ndata: {\"event\": \"end\", \"data\": {\"referenced_documents\": [], \"truncated\": null, \"input_tokens\": 11, \"output_tokens\": 19}, \"available_quotas\": {}}\n\n"
}
}
},
"401": {
"description": "Unauthorized",
"content": {
"application/json": {
"schema": {
"$ref": "#/components/schemas/UnauthorizedResponse"
},
"examples": {
"missing header": {
"value": {
"detail": {
"cause": "No Authorization header found",
"response": "Missing or invalid credentials provided by client"
}
}
},
"missing token": {
"value": {
"detail": {
"cause": "No token found in Authorization header",
"response": "Missing or invalid credentials provided by client"
}
}
}
}
}
}
},
"403": {
"description": "Permission denied",
"content": {
"application/json": {
"schema": {
"$ref": "#/components/schemas/ForbiddenResponse"
},
"examples": {
"conversation read": {
"value": {
"detail": {
"cause": "User 6789 does not have permission to read conversation with ID 123e4567-e89b-12d3-a456-426614174000",
"response": "User does not have permission to perform this action"
}
}
},
"endpoint": {
"value": {
"detail": {
"cause": "User 6789 is not authorized to access this endpoint.",
"response": "User does not have permission to access this endpoint"
}
}
},
"model override": {
"value": {
"detail": {
"cause": "User lacks model_override permission required to override model/provider.",
"response": "This instance does not permit overriding model/provider in the query request (missing permission: MODEL_OVERRIDE). Please remove the model and provider fields from your request."
}
}
}
}
}
}
},
"404": {
"description": "Resource not found",
"content": {
"application/json": {
"schema": {
"$ref": "#/components/schemas/NotFoundResponse"
},
"examples": {
"conversation": {
"value": {
"detail": {
"cause": "Conversation with ID 123e4567-e89b-12d3-a456-426614174000 does not exist",
"response": "Conversation not found"
}
}
},
"provider": {
"value": {
"detail": {
"cause": "Provider with ID openai does not exist",
"response": "Provider not found"
}
}
},
"model": {
"value": {
"detail": {
"cause": "Model with ID gpt-4-turbo is not configured",
"response": "Model not found"
}
}
}
}
}
}
},
"422": {
"description": "Request validation failed",
"content": {
"application/json": {
"schema": {
"$ref": "#/components/schemas/UnprocessableEntityResponse"
},
"examples": {
"invalid format": {
"value": {
"detail": {
"cause": "Invalid request format. The request body could not be parsed.",
"response": "Invalid request format"
}
}
},
"missing attributes": {
"value": {
"detail": {
"cause": "Missing required attributes: ['query', 'model', 'provider']",
"response": "Missing required attributes"
}
}
},
"invalid value": {
"value": {
"detail": {
"cause": "Invalid attatchment type: must be one of ['text/plain', 'application/json', 'application/yaml', 'application/xml']",
"response": "Invalid attribute value"
}
}
}
}
}
}
},
"429": {
"description": "Quota limit exceeded",
"content": {
"application/json": {
"schema": {
"$ref": "#/components/schemas/QuotaExceededResponse"
},
"examples": {
"model": {
"value": {
"detail": {
"cause": "The token quota for model gpt-4-turbo has been exceeded.",
"response": "The model quota has been exceeded"
}
}
},
"user none": {
"value": {
"detail": {
"cause": "User 123 has no available tokens.",
"response": "The quota has been exceeded"
}
}
},
"cluster none": {
"value": {
"detail": {
"cause": "Cluster has no available tokens.",
"response": "The quota has been exceeded"
}
}
},
"subject none": {
"value": {
"detail": {
"cause": "Unknown subject 999 has no available tokens.",
"response": "The quota has been exceeded"
}
}
},
"user insufficient": {
"value": {
"detail": {
"cause": "User 123 has 5 tokens, but 10 tokens are needed.",
"response": "The quota has been exceeded"
}
}
},
"cluster insufficient": {
"value": {
"detail": {
"cause": "Cluster has 500 tokens, but 900 tokens are needed.",
"response": "The quota has been exceeded"
}
}
},
"subject insufficient": {
"value": {
"detail": {
"cause": "Unknown subject 999 has 3 tokens, but 6 tokens are needed.",
"response": "The quota has been exceeded"
}
}
}
}
}
}
},
"500": {
"description": "Internal server error",
"content": {
"application/json": {
"schema": {
"$ref": "#/components/schemas/InternalServerErrorResponse"
},
"examples": {
"configuration": {
"value": {
"detail": {
"cause": "Lightspeed Stack configuration has not been initialized.",
"response": "Configuration is not loaded"
}
}
}
}
}
}
},
"503": {
"description": "Service unavailable",
"content": {
"application/json": {
"schema": {
"$ref": "#/components/schemas/ServiceUnavailableResponse"
},
"examples": {
"llama stack": {
"value": {
"detail": {
"cause": "Connection error while trying to reach backend service.",
"response": "Unable to connect to Llama Stack"
}
}
}
}
}
}
}
}
}
},
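
For readers unfamiliar with the text/event-stream example in the 200 response above, the sketch below shows one way a client might decode such a body and reassemble the answer from token events. It assumes each event is a single data: line carrying JSON, as in the example; parse_sse_events and the shortened sample are hypothetical and not part of the service's client code.

```python
import json


def parse_sse_events(stream_text):
    """Decode the JSON payload of every data: block in an SSE response body."""
    events = []
    for block in stream_text.split("\n\n"):
        data_lines = [
            line[len("data:"):].strip()
            for line in block.splitlines()
            if line.startswith("data:")
        ]
        if data_lines:
            events.append(json.loads("\n".join(data_lines)))
    return events


# Shortened version of the example stream from the schema.
sample_stream = (
    'data: {"event": "start", "data": {"conversation_id": "123e4567-e89b-12d3-a456-426614174000"}}\n\n'
    'data: {"event": "token", "data": {"id": 0, "token": "Hello"}}\n\n'
    'data: {"event": "token", "data": {"id": 1, "token": "!"}}\n\n'
    'data: {"event": "end", "data": {"referenced_documents": [], "truncated": null, '
    '"input_tokens": 11, "output_tokens": 19}, "available_quotas": {}}\n\n'
)
events = parse_sse_events(sample_stream)
answer = "".join(e["data"]["token"] for e in events if e["event"] == "token")
print(answer)
```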

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
echo "Duplicate operationIds:"
jq -r '..|objects|.operationId?|strings' docs/openapi.json | sort | uniq -d
echo ""
echo "Tag counts (v1 vs v2):"
jq -r '..|objects|.tags?|arrays|.[]' docs/openapi.json | sort | uniq -c | sort -nr
echo ""
echo "All streaming_query tags:"
jq -r '..|objects|.tags?|arrays|.[]|select(contains("streaming_query"))' docs/openapi.json | sort | uniq -c

Repository: lightspeed-core/lightspeed-stack

Length of output: 208



Update endpoint tags and summary to reflect v2 designation.

The /v2/streaming_query endpoint is tagged as "streaming_query_v1" with summary "Streaming Query Endpoint Handler V1". Update both to reflect v2:

-                "tags": [
-                    "streaming_query_v1"
-                ],
-                "summary": "Streaming Query Endpoint Handler V1",
+                "tags": [
+                    "streaming_query_v2"
+                ],
+                "summary": "Streaming Query Endpoint Handler V2",

Optional: Normalize the operationId to remove the redundant "v2_" prefix if no duplicate operationIds exist elsewhere in the spec.
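
Before renaming, the duplicate check from the jq script above can also be expressed in Python, which makes it easy to test a proposed rename against the existing IDs. This is a sketch under the assumption that the spec is already loaded as a dict; collect_operation_ids, duplicate_operation_ids, and the sample spec are hypothetical.

```python
from collections import Counter


def collect_operation_ids(node):
    """Recursively yield every string-valued operationId found in the document."""
    if isinstance(node, dict):
        operation_id = node.get("operationId")
        if isinstance(operation_id, str):
            yield operation_id
        for value in node.values():
            yield from collect_operation_ids(value)
    elif isinstance(node, list):
        for item in node:
            yield from collect_operation_ids(item)


def duplicate_operation_ids(spec):
    """Return the sorted list of operationIds that occur more than once."""
    counts = Counter(collect_operation_ids(spec))
    return sorted(op_id for op_id, count in counts.items() if count > 1)


# Reduced, hypothetical spec with the proposed rename already applied.
renamed_spec = {
    "paths": {
        "/v2/query": {"post": {"operationId": "query_endpoint_handler_v2_query_post"}},
        "/v2/streaming_query": {
            "post": {"operationId": "streaming_query_endpoint_handler_v2_streaming_query_post"}
        },
    }
}
print(duplicate_operation_ids(renamed_spec))
```

An empty list after the rename indicates that the shortened IDs do not collide with any existing operation.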

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
"/v2/streaming_query": {
"post": {
"tags": [
"streaming_query_v2"
],
"summary": "Streaming Query Endpoint Handler V2",
"description": "Handle request to the /streaming_query endpoint using Responses API.\n\nReturns a streaming response using Server-Sent Events (SSE) format with\ncontent type text/event-stream.\n\nReturns:\n StreamingResponse: An HTTP streaming response yielding\n SSE-formatted events for the query lifecycle with content type\n text/event-stream.\n\nRaises:\n HTTPException:\n - 401: Unauthorized - Missing or invalid credentials\n - 403: Forbidden - Insufficient permissions or model override not allowed\n - 404: Not Found - Conversation, model, or provider not found\n - 422: Unprocessable Entity - Request validation failed\n - 429: Too Many Requests - Quota limit exceeded\n - 500: Internal Server Error - Configuration not loaded or other server errors\n - 503: Service Unavailable - Unable to connect to Llama Stack backend",
"operationId": "streaming_query_endpoint_handler_v2_v2_streaming_query_post",
"requestBody": {
"content": {
"application/json": {
"schema": {
"$ref": "#/components/schemas/QueryRequest"
}
}
},
"required": true
},
"responses": {
"200": {
"description": "Successful response",
"content": {
"text/event-stream": {
"schema": {
"type": "string",
"format": "text/event-stream"
},
"example": "data: {\"event\": \"start\", \"data\": {\"conversation_id\": \"123e4567-e89b-12d3-a456-426614174000\"}}\n\ndata: {\"event\": \"token\", \"data\": {\"id\": 0, \"token\": \"No Violation\"}}\n\ndata: {\"event\": \"token\", \"data\": {\"id\": 1, \"token\": \"\"}}\n\ndata: {\"event\": \"token\", \"data\": {\"id\": 2, \"token\": \"Hello\"}}\n\ndata: {\"event\": \"token\", \"data\": {\"id\": 3, \"token\": \"!\"}}\n\ndata: {\"event\": \"token\", \"data\": {\"id\": 4, \"token\": \" How\"}}\n\ndata: {\"event\": \"token\", \"data\": {\"id\": 5, \"token\": \" can\"}}\n\ndata: {\"event\": \"token\", \"data\": {\"id\": 6, \"token\": \" I\"}}\n\ndata: {\"event\": \"token\", \"data\": {\"id\": 7, \"token\": \" assist\"}}\n\ndata: {\"event\": \"token\", \"data\": {\"id\": 8, \"token\": \" you\"}}\n\ndata: {\"event\": \"token\", \"data\": {\"id\": 9, \"token\": \" today\"}}\n\ndata: {\"event\": \"token\", \"data\": {\"id\": 10, \"token\": \"?\"}}\n\ndata: {\"event\": \"turn_complete\", \"data\": {\"token\": \"Hello! How can I assist you today?\"}}\n\ndata: {\"event\": \"end\", \"data\": {\"referenced_documents\": [], \"truncated\": null, \"input_tokens\": 11, \"output_tokens\": 19}, \"available_quotas\": {}}\n\n"
}
}
},
"401": {
"description": "Unauthorized",
"content": {
"application/json": {
"schema": {
"$ref": "#/components/schemas/UnauthorizedResponse"
},
"examples": {
"missing header": {
"value": {
"detail": {
"cause": "No Authorization header found",
"response": "Missing or invalid credentials provided by client"
}
}
},
"missing token": {
"value": {
"detail": {
"cause": "No token found in Authorization header",
"response": "Missing or invalid credentials provided by client"
}
}
}
}
}
}
},
"403": {
"description": "Permission denied",
"content": {
"application/json": {
"schema": {
"$ref": "#/components/schemas/ForbiddenResponse"
},
"examples": {
"conversation read": {
"value": {
"detail": {
"cause": "User 6789 does not have permission to read conversation with ID 123e4567-e89b-12d3-a456-426614174000",
"response": "User does not have permission to perform this action"
}
}
},
"endpoint": {
"value": {
"detail": {
"cause": "User 6789 is not authorized to access this endpoint.",
"response": "User does not have permission to access this endpoint"
}
}
},
"model override": {
"value": {
"detail": {
"cause": "User lacks model_override permission required to override model/provider.",
"response": "This instance does not permit overriding model/provider in the query request (missing permission: MODEL_OVERRIDE). Please remove the model and provider fields from your request."
}
}
}
}
}
}
},
"404": {
"description": "Resource not found",
"content": {
"application/json": {
"schema": {
"$ref": "#/components/schemas/NotFoundResponse"
},
"examples": {
"conversation": {
"value": {
"detail": {
"cause": "Conversation with ID 123e4567-e89b-12d3-a456-426614174000 does not exist",
"response": "Conversation not found"
}
}
},
"provider": {
"value": {
"detail": {
"cause": "Provider with ID openai does not exist",
"response": "Provider not found"
}
}
},
"model": {
"value": {
"detail": {
"cause": "Model with ID gpt-4-turbo is not configured",
"response": "Model not found"
}
}
}
}
}
}
},
"422": {
"description": "Request validation failed",
"content": {
"application/json": {
"schema": {
"$ref": "#/components/schemas/UnprocessableEntityResponse"
},
"examples": {
"invalid format": {
"value": {
"detail": {
"cause": "Invalid request format. The request body could not be parsed.",
"response": "Invalid request format"
}
}
},
"missing attributes": {
"value": {
"detail": {
"cause": "Missing required attributes: ['query', 'model', 'provider']",
"response": "Missing required attributes"
}
}
},
"invalid value": {
"value": {
"detail": {
"cause": "Invalid attachment type: must be one of ['text/plain', 'application/json', 'application/yaml', 'application/xml']",
"response": "Invalid attribute value"
}
}
}
}
}
}
},
"429": {
"description": "Quota limit exceeded",
"content": {
"application/json": {
"schema": {
"$ref": "#/components/schemas/QuotaExceededResponse"
},
"examples": {
"model": {
"value": {
"detail": {
"cause": "The token quota for model gpt-4-turbo has been exceeded.",
"response": "The model quota has been exceeded"
}
}
},
"user none": {
"value": {
"detail": {
"cause": "User 123 has no available tokens.",
"response": "The quota has been exceeded"
}
}
},
"cluster none": {
"value": {
"detail": {
"cause": "Cluster has no available tokens.",
"response": "The quota has been exceeded"
}
}
},
"subject none": {
"value": {
"detail": {
"cause": "Unknown subject 999 has no available tokens.",
"response": "The quota has been exceeded"
}
}
},
"user insufficient": {
"value": {
"detail": {
"cause": "User 123 has 5 tokens, but 10 tokens are needed.",
"response": "The quota has been exceeded"
}
}
},
"cluster insufficient": {
"value": {
"detail": {
"cause": "Cluster has 500 tokens, but 900 tokens are needed.",
"response": "The quota has been exceeded"
}
}
},
"subject insufficient": {
"value": {
"detail": {
"cause": "Unknown subject 999 has 3 tokens, but 6 tokens are needed.",
"response": "The quota has been exceeded"
}
}
}
}
}
}
},
"500": {
"description": "Internal server error",
"content": {
"application/json": {
"schema": {
"$ref": "#/components/schemas/InternalServerErrorResponse"
},
"examples": {
"configuration": {
"value": {
"detail": {
"cause": "Lightspeed Stack configuration has not been initialized.",
"response": "Configuration is not loaded"
}
}
}
}
}
}
},
"503": {
"description": "Service unavailable",
"content": {
"application/json": {
"schema": {
"$ref": "#/components/schemas/ServiceUnavailableResponse"
},
"examples": {
"llama stack": {
"value": {
"detail": {
"cause": "Connection error while trying to reach backend service.",
"response": "Unable to connect to Llama Stack"
}
}
}
}
}
}
}
}
}
},
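For readers consuming the `text/event-stream` response quoted above, the following is a minimal sketch of how such a body could be decoded on the client side. It assumes only what the example payload shows: events separated by a blank line, each carrying one `data:` line with a JSON object that has an `event` field. The helper names `parse_sse_events` and `collect_tokens` are hypothetical and not part of the service, and the sample body is a shortened version of the example in the spec.

```python
import json


def parse_sse_events(body: str) -> list[dict]:
    """Split an SSE body on blank lines and decode each `data:` payload as JSON."""
    events = []
    for block in body.split("\n\n"):
        # Keep only the `data:` fields; an event may span several of them.
        data_lines = [
            line[len("data:"):].strip()
            for line in block.splitlines()
            if line.startswith("data:")
        ]
        if data_lines:
            events.append(json.loads("\n".join(data_lines)))
    return events


def collect_tokens(events: list[dict]) -> str:
    """Concatenate the text of every `token` event, in arrival order."""
    return "".join(e["data"]["token"] for e in events if e.get("event") == "token")


# Shortened sample modelled on the documented example (an assumption, not live output).
sample = (
    'data: {"event": "start", "data": {"conversation_id": "123e4567-e89b-12d3-a456-426614174000"}}\n\n'
    'data: {"event": "token", "data": {"id": 0, "token": "Hello"}}\n\n'
    'data: {"event": "token", "data": {"id": 1, "token": "!"}}\n\n'
    'data: {"event": "end", "data": {"referenced_documents": [], "truncated": null, '
    '"input_tokens": 11, "output_tokens": 19}, "available_quotas": {}}\n\n'
)

events = parse_sse_events(sample)
text = collect_tokens(events)
```

Note that in the full documented example the first token is `No Violation`, so a real client would likely need to filter some tokens rather than concatenate all of them; the sketch does not attempt that.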
🤖 Prompt for AI Agents
In docs/openapi.json around lines 3254-3531, the /v2/streaming_query operation
is incorrectly tagged and summarized as v1; update the "tags" entry from
"streaming_query_v1" to "streaming_query_v2" and change the "summary" value from
"Streaming Query Endpoint Handler V1" to "Streaming Query Endpoint Handler V2".
Also optionally simplify the operationId by removing the redundant leading "v2_"
prefix (e.g. "streaming_query_endpoint_handler_v2_streaming_query_post") if no
conflicting operationId exists elsewhere in the spec.
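The relabeling described in this prompt can be sketched as a small transformation over the loaded spec. This is an illustration only: it assumes the standard OpenAPI layout (`paths` → path → method → operation) and that the mislabeled values end in `_v1` and ` V1` as quoted above. The function name `relabel_v2_operations` is hypothetical. Since `docs/openapi.json` appears to be generated, the durable fix presumably belongs in the route definitions that produce it rather than in the generated file.

```python
def relabel_v2_operations(spec: dict) -> dict:
    """Rewrite v1 tags and summaries on operations that live under /v2/ paths."""
    for path, path_item in spec.get("paths", {}).items():
        if not path.startswith("/v2/"):
            continue
        for operation in path_item.values():
            # Path items may also hold non-operation entries such as "parameters".
            if not isinstance(operation, dict):
                continue
            operation["tags"] = [
                tag[: -len("_v1")] + "_v2" if tag.endswith("_v1") else tag
                for tag in operation.get("tags", [])
            ]
            summary = operation.get("summary", "")
            if summary.endswith(" V1"):
                operation["summary"] = summary[: -len(" V1")] + " V2"
    return spec


# Minimal hand-written spec fragment used to exercise the function.
spec = {
    "paths": {
        "/v2/streaming_query": {
            "post": {
                "tags": ["streaming_query_v1"],
                "summary": "Streaming Query Endpoint Handler V1",
            }
        },
        "/v1/streaming_query": {
            "post": {
                "tags": ["streaming_query_v1"],
                "summary": "Streaming Query Endpoint Handler V1",
            }
        },
    }
}
patched = relabel_v2_operations(spec)
```

The sketch leaves `operationId` untouched, because the prompt marks that rename as optional and conditional on there being no conflicting identifier elsewhere in the spec.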

Comment on lines +2594 to 2607
## POST `/v2/query`

> **Delete Conversation Endpoint Handler**
> **Query Endpoint Handler V1**
Handle request to delete a conversation by ID.
Handle request to the /query endpoint using Responses API.

This is a wrapper around query_endpoint_handler_base that provides
the Responses API specific retrieve_response and get_topic_summary functions.

Returns:
QueryResponse: Contains the conversation ID and the LLM-generated response.


### 🔗 Parameters

⚠️ Potential issue | 🟠 Major

V2 endpoint incorrectly labeled as "V1" in description.

The /v2/query endpoint at line 2596 uses the heading "Query Endpoint Handler V1" which conflicts with the v2 path. This should be labeled "V2" to accurately reflect the endpoint version. The same issue appears at line 2896 for /v2/streaming_query.

Apply these diffs to correct the endpoint version labels:

 ## POST `/v2/query`

-> **Query Endpoint Handler V1**
+> **Query Endpoint Handler V2**

 ## POST `/v2/streaming_query`

-> **Streaming Query Endpoint Handler V1**
+> **Streaming Query Endpoint Handler V2**
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
## POST `/v2/query`
> **Delete Conversation Endpoint Handler**
> **Query Endpoint Handler V1**
Handle request to delete a conversation by ID.
Handle request to the /query endpoint using Responses API.
This is a wrapper around query_endpoint_handler_base that provides
the Responses API specific retrieve_response and get_topic_summary functions.
Returns:
QueryResponse: Contains the conversation ID and the LLM-generated response.
### 🔗 Parameters
## POST `/v2/query`
> **Query Endpoint Handler V2**
Handle request to the /query endpoint using Responses API.
This is a wrapper around query_endpoint_handler_base that provides
the Responses API specific retrieve_response and get_topic_summary functions.
Returns:
QueryResponse: Contains the conversation ID and the LLM-generated response.
🤖 Prompt for AI Agents
In docs/openapi.md around lines 2594–2607 and again near line 2896, the headings
for the /v2 endpoints are incorrectly labeled "Query Endpoint Handler V1";
update those headings and any adjacent descriptive text to say "V2" (e.g.,
"Query Endpoint Handler V2" and "Streaming Query Endpoint Handler V2") so the
documentation version matches the /v2 paths; ensure any occurrences of "V1" in
those endpoint sections are likewise replaced with "V2".

|-------------|-------------|-----------|
| 200 | Successful response | [ConversationDeleteResponse](#conversationdeleteresponse)
| 200 | Successful response | [QueryResponse](#queryresponse) |
| 401 | Unauthorized | [UnauthorizedResponse](#unauthorizedresponse)

⚠️ Potential issue | 🟡 Minor

Fix missing trailing pipes in markdown tables.

The static analysis tool markdownlint has identified 5 tables with inconsistent markdown formatting. These tables are missing trailing pipes, which violates the MD055 rule. Lines affected: 2619, 3221, 3316, 3445, 3589.

Add trailing pipes to the response status code tables at the specified lines. For example, at line 2619, change:

-| 200 | Successful response | [QueryResponse](#queryresponse)
+| 200 | Successful response | [QueryResponse](#queryresponse) |

All five occurrences follow the same pattern in the response tables for the new v2 endpoints.

Also applies to: 3221-3221, 3316-3316, 3445-3445, 3589-3589

🧰 Tools
🪛 markdownlint-cli2 (0.18.1)

2619-2619: Table pipe style
Expected: leading_and_trailing; Actual: leading_only; Missing trailing pipe

(MD055, table-pipe-style)

🤖 Prompt for AI Agents
docs/openapi.md around lines 2619, 3221, 3316, 3445, 3589: several markdown
table rows for response status codes are missing the trailing pipe character
(violates MD055) — update each affected line by appending a trailing pipe (" |")
to the end of the table row so the row becomes e.g. "| 401 | Unauthorized |
[UnauthorizedResponse](#unauthorizedresponse) |" and do the same for the four
other listed lines in their respective response tables for the v2 endpoints.
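As a rough illustration of the MD055 fix described above, the following sketch appends a trailing pipe to any line that starts with `|` but does not end with one. The function name `add_trailing_pipes` is hypothetical, and the approach is deliberately naive: it does not skip fenced code blocks, so it would need a guard before being run over a whole document that contains pipes inside code.

```python
def add_trailing_pipes(markdown: str) -> str:
    """Append ' |' to table rows that have a leading pipe but no trailing one."""
    fixed = []
    for line in markdown.split("\n"):
        stripped = line.rstrip()
        if stripped.lstrip().startswith("|") and not stripped.endswith("|"):
            line = stripped + " |"
        fixed.append(line)
    return "\n".join(fixed)


# Rows modelled on the response tables quoted in this review.
before = "\n".join(
    [
        "| 200 | Successful response | [QueryResponse](#queryresponse) |",
        "| 401 | Unauthorized | [UnauthorizedResponse](#unauthorizedresponse)",
        "Plain paragraph text stays as it is.",
    ]
)
after = add_trailing_pipes(before)
```

Because `docs/openapi.md` appears to be generated from the spec, a fix in the generator or its template would likely be more durable than post-processing the output.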

@onmete onmete closed this Dec 9, 2025