lightspeed-core · tisnik · Dec 4, 2025 · Nov 27, 2025 · Dec 1, 2025 · Dec 1, 2025
diff --git a/docs/auth.md b/docs/auth.md
@@ -142,6 +142,22 @@ authentication:
 - Extracts user ID and username from configurable JWT claims
 - Returns default credentials (guest-like) if no `Authorization` header present (guest access)
 
+### API Key Token (`api-key-token`)
+
+Authentication that checks that the configured API key is presented as a Bearer token.
+
+**Configuration:**
+```yaml
+  module: "api-key-token"
+  api_key_config:
+    api_key: "some-api-key"
+```
+
+**Behavior:**
+- Extracts bearer token from the `Authorization` header
+- Same user ID and username handling as `noop`
+- Validates the token against the API key from the configuration and passes it through for downstream use
+
 ## Authorization System
 
 Authorization is controlled through role-based access control using two resolver types.
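
For reference, the `api-key-token` behavior documented in the hunk above can be sketched as follows. This is a minimal sketch under stated assumptions, not the module's actual implementation: the function names `extract_bearer_token` and `is_valid_api_key` are hypothetical, and the constant-time comparison via `hmac.compare_digest` is an illustrative choice that the diff does not confirm.

```python
import hmac
from typing import Optional


def extract_bearer_token(authorization: Optional[str]) -> Optional[str]:
    """Return the token from a 'Bearer <token>' header value, or None."""
    if not authorization:
        return None
    scheme, _, token = authorization.partition(" ")
    token = token.strip()
    # Reject other schemes (e.g. Basic) and empty tokens.
    if scheme.lower() != "bearer" or not token:
        return None
    return token


def is_valid_api_key(authorization: Optional[str], configured_key: str) -> bool:
    """Check the bearer token against the configured API key (hypothetical helper)."""
    token = extract_bearer_token(authorization)
    if token is None:
        return False
    # Assumption: a constant-time comparison is used to avoid timing leaks.
    return hmac.compare_digest(token.encode(), configured_key.encode())
```

With the example configuration value, `is_valid_api_key("Bearer some-api-key", "some-api-key")` evaluates to true, while a missing header or a different scheme is rejected.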

diff --git a/docs/openapi.json b/docs/openapi.json
diff --git a/docs/output.md b/docs/output.md
@@ -3868,6 +3868,17 @@ Examples
 
 
 
+## APIKeyTokenConfiguration
+
+
+API Key Token configuration.
+
+
+| Field | Type | Description |
+|-------|------|-------------|
+| api_key | string |  |
+
+
 ## AccessRule
 
 
@@ -3929,6 +3940,7 @@ Authentication configuration.
 | k8s_cluster_api |  |  |
 | k8s_ca_cert_path |  |  |
 | jwk_config |  |  |
+| api_key_config |  |  |
 | rh_identity_config |  |  |
 
 
@@ -3994,13 +4006,23 @@ BYOK RAG configuration.
 
 CORS configuration.
 
+CORS ('Cross-Origin Resource Sharing') applies when a frontend running in a
+browser has JavaScript code that communicates with a backend, and the backend
+is in a different 'origin' than the frontend.
+
+Useful resources:
+
+  - [CORS in FastAPI](https://fastapi.tiangolo.com/tutorial/cors/)
+  - [Wikipedia article](https://en.wikipedia.org/wiki/Cross-origin_resource_sharing)
+  - [What is CORS?](https://dev.to/akshay_chauhan/what-is-cors-explained-8f1)
+
 
 | Field | Type | Description |
 |-------|------|-------------|
-| allow_origins | array |  |
-| allow_credentials | boolean |  |
-| allow_methods | array |  |
-| allow_headers | array |  |
+| allow_origins | array | A list of origins allowed for cross-origin requests. An origin is the combination of protocol (http, https), domain (myapp.com, localhost, localhost.tiangolo.com), and port (80, 443, 8080). Use ['*'] to allow all origins. |
+| allow_credentials | boolean | Indicates that cookies should be supported for cross-origin requests |
+| allow_methods | array | A list of HTTP methods that should be allowed for cross-origin requests. You can use ['*'] to allow all standard methods. |
+| allow_headers | array | A list of HTTP request headers that should be supported for cross-origin requests. You can use ['*'] to allow all headers. The Accept, Accept-Language, Content-Language and Content-Type headers are always allowed for simple CORS requests. |
 
 
 ## Configuration
@@ -4011,19 +4033,19 @@ Global service configuration.
 
 | Field | Type | Description |
 |-------|------|-------------|
-| name | string |  |
-| service |  |  |
-| llama_stack |  |  |
-| user_data_collection |  |  |
-| database |  |  |
-| mcp_servers | array |  |
-| authentication |  |  |
-| authorization |  |  |
-| customization |  |  |
-| inference |  |  |
+| name | string | Name of the service. That value will be used in REST API endpoints. |
+| service |  | This section contains Lightspeed Core Stack service configuration. |
+| llama_stack |  | This section contains Llama Stack configuration. Lightspeed Core Stack service can call Llama Stack in library mode or in server mode. |
+| user_data_collection |  | This section contains configuration for the subsystem that collects user data (transcription history and feedback). |
+| database |  | Configuration for database to store conversation IDs and other runtime data |
+| mcp_servers | array | MCP (Model Context Protocol) servers provide tools and capabilities to the AI agents. These are configured in this section. Only MCP servers defined in the lightspeed-stack.yaml configuration are available to the agents. Tools configured in the llama-stack run.yaml are not accessible to lightspeed-core agents. |
+| authentication |  | Authentication configuration |
+| authorization |  | Lightspeed Core Stack implements a modular authentication and authorization system with multiple authentication methods. Authorization is configurable through role-based access control. Authentication is handled through selectable modules configured via the module field in the authentication configuration. |
+| customization |  | It is possible to customize Lightspeed Core Stack via this section. System prompt can be customized and also different parts of the service can be replaced by custom Python modules. |
+| inference |  | One LLM provider and one of its models can be selected as defaults. When no provider+model pair is specified in REST API calls (query endpoints), the default provider and model are used. |
 | conversation_cache |  |  |
-| byok_rag | array |  |
-| quota_handlers |  |  |
+| byok_rag | array | BYOK RAG configuration. This configuration can be used to reconfigure Llama Stack through its run.yaml configuration file |
+| quota_handlers |  | Quota handlers configuration |
 
 
 ## ConfigurationResponse
@@ -4102,7 +4124,7 @@ Attributes:
 ## ConversationHistoryConfiguration
 
 
-Conversation cache configuration.
+Conversation history configuration.
 
 
 | Field | Type | Description |
@@ -4231,8 +4253,8 @@ Database configuration.
 
 | Field | Type | Description |
 |-------|------|-------------|
-| sqlite |  |  |
-| postgres |  |  |
+| sqlite |  | SQLite database configuration |
+| postgres |  | PostgreSQL database configuration |
 
 
 ## DetailModel
@@ -4373,7 +4395,7 @@ In-memory cache configuration.
 
 | Field | Type | Description |
 |-------|------|-------------|
-| max_entries | integer |  |
+| max_entries | integer | Maximum number of entries stored in the in-memory cache |
 
 
 ## InferenceConfiguration
@@ -4459,11 +4481,11 @@ Rule for extracting roles from JWT claims.
 
 | Field | Type | Description |
 |-------|------|-------------|
-| jsonpath | string |  |
-| operator |  |  |
-| negate | boolean |  |
-| value |  |  |
-| roles | array |  |
+| jsonpath | string | JSONPath expression to evaluate against the JWT payload |
+| operator |  | JSON path comparison operator |
+| negate | boolean | If set to true, the meaning of the rule is negated |
+| value |  | Value to compare against |
+| roles | array | Roles to be assigned if the rule matches |
 
 
 ## LivenessResponse
@@ -4485,26 +4507,49 @@ Attributes:
 
 Llama stack configuration.
 
+Llama Stack is a comprehensive system that provides a uniform set of tools
+for building, scaling, and deploying generative AI applications, enabling
+developers to create, integrate, and orchestrate multiple AI services and
+capabilities into an adaptable setup.
+
+Useful resources:
+
+  - [Llama Stack](https://www.llama.com/products/llama-stack/)
+  - [Python Llama Stack client](https://github.com/llamastack/llama-stack-client-python)
+  - [Build AI Applications with Llama Stack](https://llamastack.github.io/)
+
 
 | Field | Type | Description |
 |-------|------|-------------|
-| url |  |  |
-| api_key |  |  |
-| use_as_library_client |  |  |
-| library_client_config_path |  |  |
+| url |  | URL to Llama Stack service; used when library mode is disabled |
+| api_key |  | API key to access Llama Stack service |
+| use_as_library_client |  | When set to true, Llama Stack is used in library mode instead of server mode (the default) |
+| library_client_config_path |  | Path to configuration file used when Llama Stack is run in library mode |
 
 
 ## ModelContextProtocolServer
 
 
-model context protocol server configuration.
+Model context protocol server configuration.
+
+MCP (Model Context Protocol) servers provide tools and
+capabilities to the AI agents. These are configured by this structure.
+Only MCP servers defined in the lightspeed-stack.yaml configuration are
+available to the agents. Tools configured in the llama-stack run.yaml
+are not accessible to lightspeed-core agents.
+
+Useful resources:
+
+- [Model Context Protocol](https://modelcontextprotocol.io/docs/getting-started/intro)
+- [MCP FAQs](https://modelcontextprotocol.io/faqs)
+- [Wikipedia article](https://en.wikipedia.org/wiki/Model_Context_Protocol)
 
 
 | Field | Type | Description |
 |-------|------|-------------|
-| name | string |  |
-| provider_id | string |  |
-| url | string |  |
+| name | string | MCP server name that must be unique |
+| provider_id | string | MCP provider identification |
+| url | string | URL of the MCP server |
 
 
 ## ModelsResponse
@@ -4535,18 +4580,28 @@ Model representing a response to models request.
 
 PostgreSQL database configuration.
 
+The PostgreSQL database is used by the Lightspeed Core Stack service to store information
+about conversation IDs. It can also be used to store conversation history and information
+about quota usage.
+
+Useful resources:
+
+- [Psycopg: connection classes](https://www.psycopg.org/psycopg3/docs/api/connections.html)
+- [PostgreSQL connection strings](https://www.connectionstrings.com/postgresql/)
+- [How to Use PostgreSQL in Python](https://www.freecodecamp.org/news/postgresql-in-python/)
+
 
 | Field | Type | Description |
 |-------|------|-------------|
-| host | string |  |
-| port | integer |  |
-| db | string |  |
-| user | string |  |
-| password | string |  |
-| namespace |  |  |
-| ssl_mode | string |  |
-| gss_encmode | string |  |
-| ca_cert_path |  |  |
+| host | string | Database server host or socket directory |
+| port | integer | Database server port |
+| db | string | Database name to connect to |
+| user | string | Database user name used to authenticate |
+| password | string | Password used to authenticate |
+| namespace |  | Database namespace |
+| ssl_mode | string | SSL mode |
+| gss_encmode | string | This option determines whether or with what priority a secure GSS TCP/IP connection will be negotiated with the server. |
+| ca_cert_path |  | Path to CA certificate |
 
 
 ## ProviderHealthStatus
@@ -4675,29 +4730,52 @@ Attributes:
 
 Quota limiter configuration.
 
+It is possible to limit quota usage per user or per service or services
+(that typically run in one cluster). Each limit is configured as a separate
+_quota limiter_. It can be of type `user_limiter` or `cluster_limiter`
+(a name that makes sense in an OpenShift deployment).
+
 
 | Field | Type | Description |
 |-------|------|-------------|
-| sqlite |  |  |
-| postgres |  |  |
-| limiters | array |  |
-| scheduler |  |  |
-| enable_token_history | boolean |  |
+| sqlite |  | SQLite database configuration |
+| postgres |  | PostgreSQL database configuration |
+| limiters | array | Quota limiters configuration |
+| scheduler |  | Quota scheduler configuration |
+| enable_token_history | boolean | Enables storing information about token usage history |
 
 
 ## QuotaLimiterConfiguration
 
 
 Configuration for one quota limiter.
 
+There are three configuration options for each limiter:
+
+1. ``period`` is specified in a human-readable form, see
+   https://www.postgresql.org/docs/current/datatype-datetime.html#DATATYPE-INTERVAL-INPUT
+   for all possible options. When the end of the period is reached, the
+   quota is reset or increased.
+2. ``initial_quota`` is the value set at the beginning of the period.
+3. ``quota_increase`` is the value (if specified) used to increase the
+   quota when the period is reached.
+
+There are two basic use cases:
+
+1. When the quota needs to be reset to a specific value periodically (for
+   example on a weekly or monthly basis), set ``initial_quota`` to the
+   required value.
+2. When the quota needs to be increased by a specific value periodically
+   (for example on a daily basis), set ``quota_increase``.
+
 
 | Field | Type | Description |
 |-------|------|-------------|
-| type | string |  |
-| name | string |  |
-| initial_quota | integer |  |
-| quota_increase | integer |  |
-| period | string |  |
+| type | string | Quota limiter type, either user_limiter or cluster_limiter |
+| name | string | Human-readable quota limiter name |
+| initial_quota | integer | Quota set at beginning of the period |
+| quota_increase | integer | Delta value used to increase quota when period is reached |
+| period | string | Period specified in human-readable form |
 
 
 ## QuotaSchedulerConfiguration
@@ -4708,7 +4786,7 @@ Quota scheduler configuration.
 
 | Field | Type | Description |
 |-------|------|-------------|
-| period | integer |  |
+| period | integer | Quota scheduler period specified in seconds |
 
 
 ## RAGChunk
@@ -4814,17 +4892,22 @@ SQLite database configuration.
 
 Service configuration.
 
+Lightspeed Core Stack is a REST API service that accepts requests
+on a specified hostname and port. It is also possible to enable
+authentication and specify the number of Uvicorn workers. When more
+workers are specified, the service can handle requests concurrently.
+
 
 | Field | Type | Description |
 |-------|------|-------------|
-| host | string |  |
-| port | integer |  |
-| auth_enabled | boolean |  |
-| workers | integer |  |
-| color_log | boolean |  |
-| access_log | boolean |  |
-| tls_config |  |  |
-| cors |  |  |
+| host | string | Service hostname |
+| port | integer | Service port |
+| auth_enabled | boolean | Enables the authentication subsystem |
+| workers | integer | Number of Uvicorn worker processes to start |
+| color_log | boolean | Enables colorized logging |
+| access_log | boolean | Enables logging of all access information |
+| tls_config |  | Transport Layer Security configuration for HTTPS support |
+| cors |  | Cross-Origin Resource Sharing configuration for cross-domain requests |
 
 
 ## ServiceUnavailableResponse
@@ -4871,9 +4954,17 @@ Attributes:
 
 TLS configuration.
 
-See also:
-- https://fastapi.tiangolo.com/deployment/https/
-- https://en.wikipedia.org/wiki/Transport_Layer_Security
+Transport Layer Security (TLS) is a cryptographic protocol designed to
+provide communications security over a computer network, such as the
+Internet. The protocol is widely used in applications such as email,
+instant messaging, and voice over IP, but its use in securing HTTPS remains
+the most publicly visible.
+
+Useful resources:
+
+  - [FastAPI HTTPS Deployment](https://fastapi.tiangolo.com/deployment/https/)
+  - [Transport Layer Security Overview](https://en.wikipedia.org/wiki/Transport_Layer_Security)
+  - [What is TLS](https://www.ssltrust.eu/learning/ssl/transport-layer-security-tls)
 
 
 | Field | Type | Description |
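
The two use cases listed under `QuotaLimiterConfiguration` in the docs/output.md hunks above (periodic reset versus periodic increase) can be sketched as follows. This is a hedged illustration only: `LimiterSketch` and `quota_after_period` are hypothetical names, and the rule that a set `quota_increase` takes precedence over a reset is an assumption, since the documentation does not say what happens when both fields are set.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LimiterSketch:
    """Hypothetical stand-in for one quota limiter's settings."""

    initial_quota: int = 0
    quota_increase: Optional[int] = None


def quota_after_period(current: int, limiter: LimiterSketch) -> int:
    """Return the quota once the end of the configured period is reached."""
    # Assumption: a set (non-zero) quota_increase means "top up";
    # otherwise the quota is reset to initial_quota.
    if limiter.quota_increase:
        return current + limiter.quota_increase
    return limiter.initial_quota


# Use case 1: reset to a fixed value each period.
weekly_reset = quota_after_period(12, LimiterSketch(initial_quota=1000))
# Use case 2: increase by a fixed delta each period.
daily_top_up = quota_after_period(12, LimiterSketch(quota_increase=100))
```

In this sketch a user with 12 units left ends up with 1000 under the reset configuration and with 112 under the increase configuration.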

diff --git a/examples/lightspeed-stack-api-key-auth.yaml b/examples/lightspeed-stack-api-key-auth.yaml
@@ -0,0 +1,16 @@
+name: API Key Token Authentication Example
+service:
+  host: localhost
+  port: 8080
+  auth_enabled: true
+  workers: 1
+  color_log: true
+  access_log: true
+llama_stack:
+  use_as_library_client: false
+  url: http://localhost:8321
+authentication:
+  module: "api-key-token"
+  api_key_config:
+    api_key: "some-api-key"
+
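
A client talking to a service started with the example configuration above would be expected to send the configured key as a Bearer token. The sketch below only builds the request, using the standard library; the endpoint path `/v1/info` is a hypothetical placeholder and is not taken from this diff.

```python
import urllib.request

# Values taken from examples/lightspeed-stack-api-key-auth.yaml.
HOST = "localhost"
PORT = 8080
API_KEY = "some-api-key"

# Hypothetical endpoint path, used only for illustration.
ENDPOINT = "/v1/info"


def build_request(path: str, api_key: str) -> urllib.request.Request:
    """Build a GET request that carries the API key as a Bearer token."""
    url = f"http://{HOST}:{PORT}{path}"
    return urllib.request.Request(
        url, headers={"Authorization": f"Bearer {api_key}"}
    )


request = build_request(ENDPOINT, API_KEY)
# Sending it would be: urllib.request.urlopen(request)
```

The request is deliberately not sent here, so the snippet runs without a live service.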