diff --git a/ARCHITECTURE.md b/ARCHITECTURE.md index 438bfe8..d501599 100644 --- a/ARCHITECTURE.md +++ b/ARCHITECTURE.md @@ -9,6 +9,17 @@ The API Gateway is a Node.js/Express-based microservice that serves as a proxy a **Runtime**: Node.js 18+ **Module System**: ES Modules (type: "module") +## Cursor IDE Support + +The API Gateway is fully compatible with Cursor IDE and other OpenAI-compatible clients. See [CURSOR_SETUP.md](./CURSOR_SETUP.md) for detailed setup instructions. + +**Key Features for Cursor:** +- OpenAI-compatible `/v1/chat/completions` endpoint +- `/v1/models` endpoint for model discovery +- Support for streaming and non-streaming responses +- Access to 40+ models from multiple providers +- Automatic failover for high reliability + --- ## System Architecture @@ -460,6 +471,66 @@ Legacy utility for file-based database operations (mostly superseded by reposito ## API Endpoints +### Models + +#### `GET /v1/models` +List all available models in OpenAI-compatible format. + +**Authentication**: None required (public endpoint) + +**Response**: +```json +{ + "object": "list", + "data": [ + { + "id": "gpt-4o", + "object": "model", + "created": 1234567890, + "owned_by": "openai" + }, + { + "id": "claude-sonnet-4", + "object": "model", + "created": 1234567890, + "owned_by": "anthropic" + } + ] +} +``` + +**Features**: +- Returns unique models (filters out provider variants like `_go`, `_guo`) +- Sorted alphabetically by model ID +- Compatible with Cursor IDE and other OpenAI clients + +#### `GET /v1/models/:model` +Get information about a specific model. 
+ +**Authentication**: None required (public endpoint) + +**Response**: +```json +{ + "id": "gpt-4o", + "object": "model", + "created": 1234567890, + "owned_by": "openai" +} +``` + +**Error Response** (404): +```json +{ + "error": { + "message": "The model 'invalid-model' does not exist", + "type": "invalid_request_error", + "param": null, + "code": "model_not_found" + } +} +``` + ### Completions #### `POST /v1/chat/completions` diff --git a/CURSOR_SETUP.md b/CURSOR_SETUP.md new file mode 100644 index 0000000..ed8d83e --- /dev/null +++ b/CURSOR_SETUP.md @@ -0,0 +1,232 @@ +# Cursor IDE Setup Guide + +This guide explains how to configure Cursor IDE to use the Deep Assistant API Gateway as your OpenAI provider. + +## Overview + +Cursor IDE supports overriding the OpenAI base URL, which allows you to use the Deep Assistant API Gateway instead of the official OpenAI API. This gives you access to multiple LLM providers with automatic failover, cost optimization through the energy token system, and support for various models including GPT-4o, Claude, DeepSeek, and more. + +## Prerequisites + +1. Cursor IDE installed on your system +2. Access to a running Deep Assistant API Gateway instance +3. An admin token (API key) from the gateway administrator + +## Configuration Steps + +### 1. Open Cursor Settings + +In Cursor IDE: +- Click on the settings icon (gear icon) or press `Cmd+,` (Mac) / `Ctrl+,` (Windows/Linux) +- Navigate to the **Features** section +- Find **Models** or **OpenAI API Key** settings + +### 2. Configure API Key + +1. Enable the toggle for **"OpenAI API Key"** +2. Enter your API key (admin token) in the password field + - This should be your Deep Assistant admin token (not an OpenAI key) + +### 3. Override Base URL + +1. Enable the toggle for **"Override OpenAI Base URL (when using key)"** +2. 
Enter your API Gateway base URL in the format: + ``` + https://api.deep-foundation.tech/v1 + ``` + or for local development: + ``` + http://localhost:8088/v1 + ``` + +**Important:** The URL must end with `/v1` to match OpenAI's API structure. + +### 4. Save and Verify + +1. Click the **Verify** button next to the API key field to test the connection +2. If successful, Cursor will confirm that it can connect to your API Gateway +3. Click **Save** to apply the settings + +## Screenshot + +Here's what your configuration should look like: + +![Cursor Settings](https://github.com/user-attachments/assets/d31f2279-8b28-4c1f-aa0d-ea7caf8c24a6) + +## Available Models + +Once configured, you'll have access to all models supported by the API Gateway: + +### OpenAI Models +- `gpt-4o` - GPT-4 Omni (recommended for most tasks) +- `gpt-4o-mini` - Smaller, faster GPT-4 variant +- `gpt-3.5-turbo` - Fast and cost-effective +- `o1-preview` - Advanced reasoning model +- `o1-mini` - Smaller reasoning model +- `o3-mini` - Latest reasoning model + +### Anthropic Claude Models +- `claude-sonnet-4` - Latest Claude Sonnet +- `claude-3-7-sonnet` - Claude 3.7 Sonnet +- `claude-3-5-sonnet` - Claude 3.5 Sonnet +- `claude-3-5-haiku` - Fast Claude variant +- `claude-3-opus` - Most capable Claude model + +### DeepSeek Models +- `deepseek-chat` - DeepSeek conversational model +- `deepseek-reasoner` - DeepSeek reasoning model + +### Open Source Models +- `meta-llama/Meta-Llama-3.1-405B` - Large Llama model +- `meta-llama/Meta-Llama-3.1-70B` - Medium Llama model +- `meta-llama/Meta-Llama-3.1-8B` - Small Llama model +- `microsoft/WizardLM-2-8x22B` - WizardLM large +- `microsoft/WizardLM-2-7B` - WizardLM small + +### Other Models +- `gpt-4.1` - Custom GPT-4.1 (via GoAPI) +- `gpt-4.1-mini` - Custom GPT-4.1 mini +- `gpt-4.1-nano` - Custom GPT-4.1 nano +- `gpt-auto` - Automatic model selection +- `uncensored` - Uncensored small model + +## Testing Your Setup + +To verify your setup is working: + +1. 
Open a new chat in Cursor +2. Select one of the available models from the dropdown +3. Send a test message +4. You should receive a response from the API Gateway + +## Troubleshooting + +### Connection Errors + +**Issue:** "Problem reaching OpenAI" error + +**Solutions:** +1. Verify your base URL is correct and ends with `/v1` +2. Check that your API Gateway is running and accessible +3. Ensure your API key (admin token) is valid +4. Check firewall/network settings if using a remote gateway + +### Model Not Found + +**Issue:** Selected model returns "model not found" error + +**Solutions:** +1. Call `GET https://your-gateway-url/v1/models` to see available models +2. Verify the model name matches exactly (case-sensitive) +3. Check that the model is configured in your gateway's `llmsConfig.js` + +### Authentication Errors + +**Issue:** "Invalid token" or "Unauthorized" errors + +**Solutions:** +1. Verify you're using the admin token, not a user token +2. Check that the token matches the `ADMIN_FIRST` environment variable in your gateway +3. Ensure there are no extra spaces or characters in the API key field + +### Rate Limiting + +**Issue:** "Insufficient balance" or 429 errors + +**Solutions:** +1. Check your energy token balance: `GET /token?masterToken=YOUR_TOKEN&userId=YOUR_USER_ID` +2. Top up your balance if needed: `PUT /token` with appropriate parameters +3. Contact your gateway administrator for balance increase + +## Advanced Configuration + +### Using Different Providers + +The API Gateway automatically fails over between providers. The request flow is: + +1. Try primary provider (e.g., `gpt-4o_go` for GPT-4o) +2. If fails, try secondary provider (e.g., `gpt-4o` official) +3. Continue through provider chain until success or all fail + +You don't need to configure this - it happens automatically. 
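The provider chain described above can be sketched in a few lines. This is an illustrative simplification, not the gateway's actual internals; the provider names and the `tryProvider` callback are hypothetical placeholders:

```javascript
// Sketch of the failover pattern: walk a provider chain, return the first
// successful response, and aggregate every error if all providers fail.
// Provider names and tryProvider are illustrative, not the real gateway code.
async function completeWithFailover(chain, request, tryProvider) {
  const errors = [];
  for (const provider of chain) {
    try {
      return await tryProvider(provider, request);
    } catch (err) {
      errors.push(`${provider}: ${err.message}`);
    }
  }
  throw new Error(`All providers failed: ${errors.join("; ")}`);
}

// Example chain for GPT-4o, mirroring the order described above:
// the GoAPI variant first, then the official provider.
const gpt4oChain = ["gpt-4o_go", "gpt-4o"];
```

From Cursor's point of view this is invisible: a request either succeeds on some provider in the chain or comes back as a single error.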
+ +### Energy Token System + +The gateway uses an internal "energy" currency: +- Different models have different energy conversion rates +- Higher efficiency models (like Claude Haiku) cost less energy +- Reasoning models (like o1-preview) cost more energy +- Check `ARCHITECTURE.md` for detailed pricing + +### Streaming Support + +The API Gateway supports both streaming and non-streaming responses: +- Most models support streaming (real-time response) +- Some models (o1, Claude) are automatically converted to non-streaming internally +- Cursor will automatically handle the appropriate mode + +## API Endpoint Reference + +The API Gateway implements the following OpenAI-compatible endpoints: + +- `GET /v1/models` - List all available models +- `GET /v1/models/{model}` - Get specific model information +- `POST /v1/chat/completions` - Create chat completions (main endpoint) +- `POST /v1/audio/transcriptions` - Audio to text (Whisper) +- `POST /v1/audio/speech` - Text to speech (TTS) + +## Support + +For issues or questions: + +1. Check the [API Gateway documentation](./ARCHITECTURE.md) +2. Review the [README](./README.md) for general setup +3. Open an issue on [GitHub](https://github.com/deep-assistant/api-gateway/issues) +4. Contact the Deep Assistant team + +## Security Notes + +1. **Keep your API key secure** - Never share it or commit it to version control +2. **Use HTTPS in production** - Always use encrypted connections for remote gateways +3. **Monitor usage** - Regularly check your token balance and usage patterns +4. **Rotate keys** - Periodically regenerate your API tokens for security + +## Migration from OpenAI + +If you're migrating from OpenAI's API: + +1. Your existing Cursor workflows will work the same way +2. Model names may differ slightly (check available models list) +3. You'll get access to more models and providers +4. Energy token system provides better cost management +5. 
Automatic failover improves reliability + +## Example: Complete Setup + +Here's a complete example configuration: + +``` +Settings > Features > Models + +✅ OpenAI API Key + API Key: •••••••••••••••••••••••••••••••• + (This is your ADMIN_FIRST token from the gateway) + +✅ Override OpenAI Base URL (when using key) + Base URL: https://api.deep-foundation.tech/v1 + +[Verify] [Save] +``` + +After saving, you can use any model from the dropdown in Cursor's chat interface. + +## What's Next? + +- Explore different models to find what works best for your use case +- Monitor your energy token usage +- Check out the [API documentation](./ARCHITECTURE.md) for advanced features +- Consider setting up your own API Gateway instance for more control + +--- + +*This guide is part of the Deep Assistant project. For more information, visit the [master-plan repository](https://github.com/deep-assistant/master-plan).* diff --git a/src/controllers/modelsController.js b/src/controllers/modelsController.js new file mode 100644 index 0000000..941edfd --- /dev/null +++ b/src/controllers/modelsController.js @@ -0,0 +1,138 @@ +import express from "express"; + +import { rest } from "../rest/rest.js"; +import { HttpResponse } from "../rest/HttpResponse.js"; +import { llmsConfig } from "../utils/llmsConfig.js"; + +const modelsController = express.Router(); + +/** + * GET /v1/models + * Returns a list of all available models in OpenAI-compatible format + * This endpoint is used by Cursor and other OpenAI-compatible clients + * to discover which models are available. + */ +modelsController.get( + "/v1/models", + rest(async ({ req }) => { + console.log("[ GET /v1/models ]"); + + // Get unique model names (filter out provider-specific variants like _go, _guo, etc.) 
+ const uniqueModels = new Set(); + const modelData = []; + + for (const [key, config] of Object.entries(llmsConfig)) { + // Extract base model name (without provider suffix) + const baseModelName = key.replace(/_go$|_guo$|_openrouter$/, ""); + + // Only add unique base models to avoid duplicates + if (!uniqueModels.has(baseModelName)) { + uniqueModels.add(baseModelName); + + // Determine owner based on model name + let owner = "openai"; + if (baseModelName.startsWith("claude")) { + owner = "anthropic"; + } else if (baseModelName.startsWith("meta-llama")) { + owner = "meta"; + } else if (baseModelName.startsWith("microsoft")) { + owner = "microsoft"; + } else if (baseModelName.startsWith("deepseek")) { + owner = "deepseek"; + } else if (baseModelName.startsWith("gpt")) { + owner = "openai"; + } else if (baseModelName.startsWith("o1") || baseModelName.startsWith("o3")) { + owner = "openai"; + } else if (baseModelName === "uncensored") { + owner = "community"; + } + + modelData.push({ + id: baseModelName, + object: "model", + created: Math.floor(Date.now() / 1000), // Unix timestamp + owned_by: owner, + }); + } + } + + // Sort models alphabetically by id for consistency + modelData.sort((a, b) => a.id.localeCompare(b.id)); + + const response = { + object: "list", + data: modelData, + }; + + console.log(`[ returning ${modelData.length} unique models ]`); + return new HttpResponse(200, response); + }) +); + +/** + * GET /v1/models/:model + * Returns information about a specific model + * This is also part of the OpenAI-compatible API + */ +modelsController.get( + "/v1/models/:model", + rest(async ({ req }) => { + const modelId = req.params.model; + console.log(`[ GET /v1/models/${modelId} ]`); + + // Check if the model exists (with or without provider suffix) + let modelConfig = llmsConfig[modelId]; + + // If not found directly, try to find a variant with provider suffix + if (!modelConfig) { + const possibleKeys = Object.keys(llmsConfig).filter(key => + key === 
modelId || key.startsWith(modelId + "_") + ); + + if (possibleKeys.length > 0) { + modelConfig = llmsConfig[possibleKeys[0]]; + } + } + + if (!modelConfig) { + return new HttpResponse(404, { + error: { + message: `The model '${modelId}' does not exist`, + type: "invalid_request_error", + param: null, + code: "model_not_found", + }, + }); + } + + // Determine owner based on model name + let owner = "openai"; + if (modelId.startsWith("claude")) { + owner = "anthropic"; + } else if (modelId.startsWith("meta-llama")) { + owner = "meta"; + } else if (modelId.startsWith("microsoft")) { + owner = "microsoft"; + } else if (modelId.startsWith("deepseek")) { + owner = "deepseek"; + } else if (modelId.startsWith("gpt")) { + owner = "openai"; + } else if (modelId.startsWith("o1") || modelId.startsWith("o3")) { + owner = "openai"; + } else if (modelId === "uncensored") { + owner = "community"; + } + + const response = { + id: modelId, + object: "model", + created: Math.floor(Date.now() / 1000), + owned_by: owner, + }; + + console.log(`[ returning model info for ${modelId} ]`); + return new HttpResponse(200, response); + }) +); + +export default modelsController; diff --git a/src/server.js b/src/server.js index 43ad29c..3cad624 100644 --- a/src/server.js +++ b/src/server.js @@ -11,6 +11,7 @@ import referralController from "./controllers/referralController.js"; import dialogsController from "./controllers/dialogsController.js"; import transcriptionsController from "./controllers/transcriptionsController.js"; import speechController from "./controllers/speechController.js"; +import modelsController from "./controllers/modelsController.js"; const app = express(); const PORT = process.env.PORT; @@ -26,6 +27,7 @@ app.use("/", referralController); app.use("/", transcriptionsController); app.use("/", speechController); app.use("/", dialogsController); +app.use("/", modelsController); app.listen(PORT, () => { logger.info(`Server running on port ${PORT}`);
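Both route handlers in `modelsController.js` repeat the same `owned_by` branching. A follow-up could extract it into a shared helper; a sketch (the helper name is a suggestion, not part of this diff):

```javascript
// Sketch of a shared owner-detection helper for modelsController.js.
// The branch order mirrors the inline logic in both handlers; anything
// unrecognized (gpt*, o1*, o3*, etc.) falls back to "openai".
function ownerForModel(id) {
  if (id.startsWith("claude")) return "anthropic";
  if (id.startsWith("meta-llama")) return "meta";
  if (id.startsWith("microsoft")) return "microsoft";
  if (id.startsWith("deepseek")) return "deepseek";
  if (id === "uncensored") return "community";
  return "openai"; // gpt*, o1*, o3*, and the default case
}
```

Both handlers could then build their responses with `owned_by: ownerForModel(baseModelName)`, keeping the two endpoints from drifting apart.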