Merged
16 changes: 16 additions & 0 deletions docs/auth.md
@@ -142,6 +142,22 @@ authentication:
- Extracts user ID and username from configurable JWT claims
- Returns default, guest-like credentials if no `Authorization` header is present (guest access)

### API Key Token (`api-key-token`)

Authentication that checks that the configured API key is presented as a Bearer token in the `Authorization` header.

**Configuration:**
```yaml
module: "api-key-token"
api_key_config:
  api_key: "some-api-key"
```

**Behavior:**
- Extracts bearer token from the `Authorization` header
- Same user ID and username handling as `noop`
- Validates the token against the API key set in the configuration, then passes it through for downstream use
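
The behavior listed above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's implementation: the helper names `extract_bearer_token` and `is_valid_api_key` are hypothetical, and the constant-time comparison with `hmac.compare_digest` is one reasonable way to compare secrets rather than something the documentation confirms.

```python
import hmac
from typing import Optional


def extract_bearer_token(authorization_header: Optional[str]) -> Optional[str]:
    """Return the token from an 'Authorization: Bearer <token>' header value."""
    if not authorization_header:
        return None
    scheme, _, token = authorization_header.partition(" ")
    if scheme.lower() != "bearer" or not token.strip():
        return None
    return token.strip()


def is_valid_api_key(authorization_header: Optional[str], configured_key: str) -> bool:
    """Check the bearer token against the configured API key (hypothetical helper)."""
    token = extract_bearer_token(authorization_header)
    if token is None:
        return False
    # compare_digest avoids leaking how many leading characters matched
    return hmac.compare_digest(token.encode(), configured_key.encode())
```

With the configuration shown earlier, `is_valid_api_key("Bearer some-api-key", "some-api-key")` would accept the request, while a missing header or a different token would be rejected.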

## Authorization System

Authorization is controlled through role-based access control using two resolver types.
229 changes: 165 additions & 64 deletions docs/openapi.json

Large diffs are not rendered by default.

219 changes: 155 additions & 64 deletions docs/output.md
@@ -3868,6 +3868,17 @@ Examples



## APIKeyTokenConfiguration


API Key Token configuration.


| Field | Type | Description |
|-------|------|-------------|
| api_key | string | |


## AccessRule


@@ -3929,6 +3940,7 @@ Authentication configuration.
| k8s_cluster_api | | |
| k8s_ca_cert_path | | |
| jwk_config | | |
| api_key_config | | |
| rh_identity_config | | |


@@ -3994,13 +4006,23 @@ BYOK RAG configuration.

CORS configuration.

CORS, or 'Cross-Origin Resource Sharing', refers to situations in which a
frontend running in a browser has JavaScript code that communicates with a
backend, and the backend is in a different 'origin' than the frontend.

Useful resources:

- [CORS in FastAPI](https://fastapi.tiangolo.com/tutorial/cors/)
- [Wikipedia article](https://en.wikipedia.org/wiki/Cross-origin_resource_sharing)
- [What is CORS?](https://dev.to/akshay_chauhan/what-is-cors-explained-8f1)


| Field | Type | Description |
|-------|------|-------------|
| allow_origins | array | |
| allow_credentials | boolean | |
| allow_methods | array | |
| allow_headers | array | |
| allow_origins | array | A list of origins allowed for cross-origin requests. An origin is the combination of protocol (http, https), domain (myapp.com, localhost, localhost.tiangolo.com), and port (80, 443, 8080). Use ['*'] to allow all origins. |
| allow_credentials | boolean | Indicates that cookies should be supported for cross-origin requests |
| allow_methods | array | A list of HTTP methods that should be allowed for cross-origin requests. You can use ['*'] to allow all standard methods. |
| allow_headers | array | A list of HTTP request headers that should be supported for cross-origin requests. You can use ['*'] to allow all headers. The Accept, Accept-Language, Content-Language and Content-Type headers are always allowed for simple CORS requests. |
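
To make the definition of an origin as a (protocol, domain, port) triple concrete, here is a small sketch. It is only an illustration of the concept: the helper names `normalize_origin` and `origin_allowed` are hypothetical, and filling in default ports before comparing is an assumption made for this example, not a description of how FastAPI or any other middleware matches origins.

```python
from urllib.parse import urlsplit

# Default ports for the two schemes mentioned in the description
DEFAULT_PORTS = {"http": 80, "https": 443}


def normalize_origin(origin: str) -> tuple:
    """Split an origin into (scheme, host, port), filling in the default port."""
    parts = urlsplit(origin)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, parts.hostname, port)


def origin_allowed(origin: str, allow_origins: list) -> bool:
    """Return True if the origin matches an allowed entry or the list contains '*'."""
    if "*" in allow_origins:
        return True
    wanted = normalize_origin(origin)
    return any(normalize_origin(allowed) == wanted for allowed in allow_origins)
```

Under these assumptions, `http://localhost:8080` and `http://localhost` count as different origins because their ports differ, even though the protocol and domain are the same.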


## Configuration
@@ -4011,19 +4033,19 @@ Global service configuration.

| Field | Type | Description |
|-------|------|-------------|
| name | string | |
| service | | |
| llama_stack | | |
| user_data_collection | | |
| database | | |
| mcp_servers | array | |
| authentication | | |
| authorization | | |
| customization | | |
| inference | | |
| name | string | Name of the service. That value will be used in REST API endpoints. |
| service | | This section contains Lightspeed Core Stack service configuration. |
| llama_stack | | This section contains Llama Stack configuration. Lightspeed Core Stack service can call Llama Stack in library mode or in server mode. |
| user_data_collection | | This section contains configuration for the subsystem that collects user data (transcription history and feedback). |
| database | | Configuration for database to store conversation IDs and other runtime data |
| mcp_servers | array | MCP (Model Context Protocol) servers provide tools and capabilities to the AI agents. These are configured in this section. Only MCP servers defined in the lightspeed-stack.yaml configuration are available to the agents. Tools configured in the llama-stack run.yaml are not accessible to lightspeed-core agents. |
| authentication | | Authentication configuration |
| authorization | | Lightspeed Core Stack implements a modular authentication and authorization system with multiple authentication methods. Authorization is configurable through role-based access control. Authentication is handled through selectable modules configured via the module field in the authentication configuration. |
| customization | | It is possible to customize Lightspeed Core Stack via this section. System prompt can be customized and also different parts of the service can be replaced by custom Python modules. |
| inference | | One LLM provider and one of its models can be selected as defaults. When no provider+model pair is specified in REST API calls (query endpoints), the default provider and model are used. |
| conversation_cache | | |
| byok_rag | array | |
| quota_handlers | | |
| byok_rag | array | BYOK RAG configuration. This configuration can be used to reconfigure Llama Stack through its run.yaml configuration file |
| quota_handlers | | Quota handlers configuration |


## ConfigurationResponse
@@ -4102,7 +4124,7 @@ Attributes:
## ConversationHistoryConfiguration


Conversation cache configuration.
Conversation history configuration.


| Field | Type | Description |
@@ -4231,8 +4253,8 @@ Database configuration.

| Field | Type | Description |
|-------|------|-------------|
| sqlite | | |
| postgres | | |
| sqlite | | SQLite database configuration |
| postgres | | PostgreSQL database configuration |


## DetailModel
@@ -4373,7 +4395,7 @@ In-memory cache configuration.

| Field | Type | Description |
|-------|------|-------------|
| max_entries | integer | |
| max_entries | integer | Maximum number of entries stored in the in-memory cache |


## InferenceConfiguration
@@ -4459,11 +4481,11 @@ Rule for extracting roles from JWT claims.

| Field | Type | Description |
|-------|------|-------------|
| jsonpath | string | |
| operator | | |
| negate | boolean | |
| value | | |
| roles | array | |
| jsonpath | string | JSONPath expression to evaluate against the JWT payload |
| operator | | JSON path comparison operator |
| negate | boolean | If set to true, the meaning of the rule is negated |
| value | | Value to compare against |
| roles | array | Roles to be assigned if the rule matches |
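
How the fields of such a rule might interact can be sketched as below. Everything here is illustrative: the simple `$.a.b` path lookup merely stands in for a real JSONPath engine, and the operator names `equals` and `contains` are hypothetical, since the table does not list the supported operators.

```python
from typing import Any


def lookup(payload: dict, dotted_path: str) -> Any:
    """Resolve a simple '$.a.b' style path; a stand-in for a real JSONPath engine."""
    path = dotted_path[2:] if dotted_path.startswith("$.") else dotted_path
    node: Any = payload
    for key in path.split("."):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def roles_for_rule(payload: dict, rule: dict) -> list:
    """Return the rule's roles if the (possibly negated) comparison matches."""
    found = lookup(payload, rule["jsonpath"])
    if rule["operator"] == "equals":  # hypothetical operator name
        matched = found == rule["value"]
    elif rule["operator"] == "contains":  # hypothetical operator name
        matched = isinstance(found, list) and rule["value"] in found
    else:
        raise ValueError(f"unsupported operator: {rule['operator']}")
    if rule.get("negate", False):
        matched = not matched
    return list(rule["roles"]) if matched else []
```

For a payload such as `{"realm": {"groups": ["admins"]}}`, a rule with `jsonpath` set to `$.realm.groups`, operator `contains`, and value `admins` would yield the rule's roles; setting `negate` to true would yield an empty list instead.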


## LivenessResponse
@@ -4485,26 +4507,49 @@ Attributes:

Llama stack configuration.

Llama Stack is a comprehensive system that provides a uniform set of tools
for building, scaling, and deploying generative AI applications, enabling
developers to create, integrate, and orchestrate multiple AI services and
capabilities into an adaptable setup.

Useful resources:

- [Llama Stack](https://www.llama.com/products/llama-stack/)
- [Python Llama Stack client](https://github.com/llamastack/llama-stack-client-python)
- [Build AI Applications with Llama Stack](https://llamastack.github.io/)


| Field | Type | Description |
|-------|------|-------------|
| url | | |
| api_key | | |
| use_as_library_client | | |
| library_client_config_path | | |
| url | | URL to Llama Stack service; used when library mode is disabled |
| api_key | | API key to access Llama Stack service |
| use_as_library_client | | When set to true, Llama Stack is used in library mode instead of server mode (the default) |
| library_client_config_path | | Path to configuration file used when Llama Stack is run in library mode |


## ModelContextProtocolServer


model context protocol server configuration.
Model context protocol server configuration.

MCP (Model Context Protocol) servers provide tools and
capabilities to the AI agents. These are configured by this structure.
Only MCP servers defined in the lightspeed-stack.yaml configuration are
available to the agents. Tools configured in the llama-stack run.yaml
are not accessible to lightspeed-core agents.

Useful resources:

- [Model Context Protocol](https://modelcontextprotocol.io/docs/getting-started/intro)
- [MCP FAQs](https://modelcontextprotocol.io/faqs)
- [Wikipedia article](https://en.wikipedia.org/wiki/Model_Context_Protocol)


| Field | Type | Description |
|-------|------|-------------|
| name | string | |
| provider_id | string | |
| url | string | |
| name | string | MCP server name that must be unique |
| provider_id | string | MCP provider identification |
| url | string | URL of the MCP server |


## ModelsResponse
@@ -4535,18 +4580,28 @@ Model representing a response to models request.

PostgreSQL database configuration.

PostgreSQL database is used by Lightspeed Core Stack service for storing information about
conversation IDs. It can also be leveraged to store conversation history and information
about quota usage.

Useful resources:

- [Psycopg: connection classes](https://www.psycopg.org/psycopg3/docs/api/connections.html)
- [PostgreSQL connection strings](https://www.connectionstrings.com/postgresql/)
- [How to Use PostgreSQL in Python](https://www.freecodecamp.org/news/postgresql-in-python/)


| Field | Type | Description |
|-------|------|-------------|
| host | string | |
| port | integer | |
| db | string | |
| user | string | |
| password | string | |
| namespace | | |
| ssl_mode | string | |
| gss_encmode | string | |
| ca_cert_path | | |
| host | string | Database server host or socket directory |
| port | integer | Database server port |
| db | string | Database name to connect to |
| user | string | Database user name used to authenticate |
| password | string | Password used to authenticate |
| namespace | | Database namespace |
| ssl_mode | string | SSL mode |
| gss_encmode | string | This option determines whether or with what priority a secure GSS TCP/IP connection will be negotiated with the server. |
| ca_cert_path | | Path to CA certificate |
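
As a rough illustration of how these fields relate to a PostgreSQL connection, the sketch below assembles a libpq keyword/value connection string. The helper name `build_conninfo` and the field-to-keyword mapping are assumptions made for this example; the service may pass these values to its database driver differently. The `namespace` field is not mapped here.

```python
def build_conninfo(cfg: dict) -> str:
    """Build a libpq keyword/value connection string from the documented fields."""
    # Assumed mapping from configuration field names to libpq keywords
    keyword_for = {
        "host": "host",
        "port": "port",
        "db": "dbname",
        "user": "user",
        "password": "password",
        "ssl_mode": "sslmode",
        "gss_encmode": "gssencmode",
        "ca_cert_path": "sslrootcert",
    }
    parts = []
    for field, keyword in keyword_for.items():
        value = cfg.get(field)
        if value is None:
            continue  # skip unset optional fields
        # libpq quoting: single-quote the value, escaping backslashes and quotes
        escaped = str(value).replace("\\", "\\\\").replace("'", "\\'")
        parts.append(f"{keyword}='{escaped}'")
    return " ".join(parts)
```

Quoting every value keeps the sketch short and also handles values that contain spaces.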


## ProviderHealthStatus
@@ -4675,29 +4730,52 @@ Attributes:

Quota limiter configuration.

It is possible to limit quota usage per user, or per service or group of
services (that typically run in one cluster). Each limit is configured as a
separate _quota limiter_. It can be of type `user_limiter` or `cluster_limiter`
(a name that makes sense in an OpenShift deployment).


| Field | Type | Description |
|-------|------|-------------|
| sqlite | | |
| postgres | | |
| limiters | array | |
| scheduler | | |
| enable_token_history | boolean | |
| sqlite | | SQLite database configuration |
| postgres | | PostgreSQL database configuration |
| limiters | array | Quota limiters configuration |
| scheduler | | Quota scheduler configuration |
| enable_token_history | boolean | Enables storing information about token usage history |


## QuotaLimiterConfiguration


Configuration for one quota limiter.

There are three configuration options for each limiter:

1. ``period`` is specified in a human-readable form, see
https://www.postgresql.org/docs/current/datatype-datetime.html#DATATYPE-INTERVAL-INPUT
for all possible options. When the end of the period is reached, the
quota is reset or increased.
2. ``initial_quota`` is the value set at the beginning of the period.
3. ``quota_increase`` is the value (if specified) used to increase the
quota when the period is reached.

There are two basic use cases:

1. When the quota needs to be reset to a specific value periodically (for
example on a weekly or monthly basis), set ``initial_quota`` to the
required value.
2. When the quota needs to be increased by a specific value periodically
(for example on a daily basis), set ``quota_increase``.
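
The two use cases above can be sketched as a single function. This is only an illustration of the described semantics: the function name `quota_at_period_end` is hypothetical, and the rule that a configured `quota_increase` takes effect instead of a reset is an assumption, since the text does not say what happens when both options are set.

```python
from typing import Optional


def quota_at_period_end(current: int, initial_quota: int, quota_increase: Optional[int]) -> int:
    """Return the quota after a period boundary (illustrative semantics only)."""
    if quota_increase:
        # Use case 2: top up the remaining quota by a fixed delta
        return current + quota_increase
    # Use case 1: reset the quota to the configured starting value
    return initial_quota
```

For example, a user with 120 units left and `initial_quota` set to 1000 would be back at 1000 after the period, while with `quota_increase` set to 50 the same user would have 170.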


| Field | Type | Description |
|-------|------|-------------|
| type | string | |
| name | string | |
| initial_quota | integer | |
| quota_increase | integer | |
| period | string | |
| type | string | Quota limiter type, either user_limiter or cluster_limiter |
| name | string | Human readable quota limiter name |
| initial_quota | integer | Quota set at beginning of the period |
| quota_increase | integer | Delta value used to increase quota when period is reached |
| period | string | Period specified in human readable form |


## QuotaSchedulerConfiguration
@@ -4708,7 +4786,7 @@ Quota scheduler configuration.

| Field | Type | Description |
|-------|------|-------------|
| period | integer | |
| period | integer | Quota scheduler period specified in seconds |


## RAGChunk
@@ -4814,17 +4892,22 @@ SQLite database configuration.

Service configuration.

Lightspeed Core Stack is a REST API service that accepts requests
on a specified hostname and port. It is also possible to enable
authentication and specify the number of Uvicorn workers. When more
workers are specified, the service can handle requests concurrently.


| Field | Type | Description |
|-------|------|-------------|
| host | string | |
| port | integer | |
| auth_enabled | boolean | |
| workers | integer | |
| color_log | boolean | |
| access_log | boolean | |
| tls_config | | |
| cors | | |
| host | string | Service hostname |
| port | integer | Service port |
| auth_enabled | boolean | Enables the authentication subsystem |
| workers | integer | Number of Uvicorn worker processes to start |
| color_log | boolean | Enables colorized logging |
| access_log | boolean | Enables logging of all access information |
| tls_config | | Transport Layer Security configuration for HTTPS support |
| cors | | Cross-Origin Resource Sharing configuration for cross-domain requests |


## ServiceUnavailableResponse
@@ -4871,9 +4954,17 @@ Attributes:

TLS configuration.

See also:
- https://fastapi.tiangolo.com/deployment/https/
- https://en.wikipedia.org/wiki/Transport_Layer_Security
Transport Layer Security (TLS) is a cryptographic protocol designed to
provide communications security over a computer network, such as the
Internet. The protocol is widely used in applications such as email,
instant messaging, and voice over IP, but its use in securing HTTPS remains
the most publicly visible.

Useful resources:

- [FastAPI HTTPS Deployment](https://fastapi.tiangolo.com/deployment/https/)
- [Transport Layer Security Overview](https://en.wikipedia.org/wiki/Transport_Layer_Security)
- [What is TLS](https://www.ssltrust.eu/learning/ssl/transport-layer-security-tls)


| Field | Type | Description |
16 changes: 16 additions & 0 deletions examples/lightspeed-stack-api-key-auth.yaml
@@ -0,0 +1,16 @@
name: API Key Token Authentication Example
service:
  host: localhost
  port: 8080
  auth_enabled: true
  workers: 1
  color_log: true
  access_log: true
llama_stack:
  use_as_library_client: false
  url: http://localhost:8321
authentication:
  module: "api-key-token"
  api_key_config:
    api_key: "some-api-key"
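
A client talking to a service configured like the example above would send the API key as a Bearer token. The sketch below only builds the request and does not send it. The helper name `build_request` and the path `/example-endpoint` are hypothetical placeholders, not endpoints documented here.

```python
import urllib.request


def build_request(base_url: str, path: str, api_key: str) -> urllib.request.Request:
    """Create (but do not send) a request carrying the API key as a Bearer token."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + path,
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


# Host and port come from the example configuration; the path is a placeholder
request = build_request("http://localhost:8080", "/example-endpoint", "some-api-key")
print(request.full_url, request.get_header("Authorization"))
```

Passing the resulting object to `urllib.request.urlopen` would send it, provided the service is running and the path exists.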
