A sophisticated Retrieval-Augmented Generation (RAG) chatbot built with Node.js and Express, featuring real-time document processing, semantic search, and multiple LLM provider support.
Aaryan Choudhary
- Multi-Format Document Support: PDF, DOCX, TXT, CSV, HTML, JSON
- Advanced Text Processing: Intelligent chunking with overlap and semantic boundaries (see the sketch after this list)
- Real-time Communication: WebSocket-based streaming responses
- Multiple LLM Providers: OpenAI GPT, Google Gemini, Anthropic Claude, Cohere, HuggingFace
- Semantic Search: Vector-based document retrieval with cosine similarity
- Professional UI: Clean, responsive interface with drag-and-drop file upload
- Configurable Settings: Adjustable similarity thresholds and result limits
- Source Attribution: Transparent citation of information sources
- Rate Limiting: Built-in protection against abuse
- Comprehensive Logging: Detailed system monitoring and debugging
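To illustrate the chunking approach listed above, here is a simplified character-based splitter with overlap. The sizes follow the technical specifications later in this README (500-1000 characters with 100-character overlap); the real `documentProcessor.js` additionally snaps chunks to semantic boundaries, which this sketch omits:

```javascript
// Simplified sketch of fixed-size chunking with overlap; illustrative only,
// not the actual documentProcessor.js implementation.
function chunkText(text, chunkSize = 1000, overlap = 100) {
  const chunks = [];
  for (let start = 0; start < text.length; start += chunkSize - overlap) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break; // last chunk covers the tail
  }
  return chunks;
}
```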
- Backend: Node.js, Express.js
- Real-time: WebSocket (ws)
- Document Processing: pdf-parse, mammoth, csv-parser
- Vector Operations: Custom implementation with cosine similarity (see the sketch after this list)
- Security: Helmet, CORS, rate limiting
- Frontend: Vanilla JavaScript, TailwindCSS
- Architecture: Modular service-based design
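The cosine-similarity core of the vector search fits in a few lines; this is an illustrative version, not necessarily the exact code in `vectorStore.js`:

```javascript
// Cosine similarity between two equal-length embedding vectors:
// dot(a, b) / (|a| * |b|), in the range [-1, 1].
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB) || 1); // guard against zero vectors
}
```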
```
rag-qa-chatbot/
├── src/
│   ├── services/
│   │   ├── index.js               # Service initialization
│   │   ├── documentProcessor.js   # Document parsing and chunking
│   │   ├── embeddingService.js    # Text embedding generation
│   │   ├── vectorStore.js         # Vector storage and similarity search
│   │   ├── llmService.js          # LLM provider integration
│   │   └── retrievalService.js    # RAG pipeline orchestration
│   ├── routes/
│   │   ├── index.js               # Route registration
│   │   ├── documents.js           # Document management endpoints
│   │   ├── chat.js                # Chat and query endpoints
│   │   └── admin.js               # Administrative endpoints
│   ├── middleware/
│   │   ├── rateLimiter.js         # Rate limiting middleware
│   │   ├── errorHandler.js        # Global error handling
│   │   └── auth.js                # Authentication middleware
│   ├── websocket/
│   │   └── chatSocket.js          # WebSocket message handling
│   └── utils/
│       ├── logger.js              # Logging utilities
│       └── textProcessor.js       # Text processing utilities
├── public/
│   ├── index.html                 # Main application interface
│   ├── css/
│   │   └── styles.css             # Application styling
│   └── js/
│       └── app.js                 # Frontend application logic
├── uploads/                       # Document storage directory
├── .env.example                   # Environment configuration template
├── .gitignore                     # Git ignore rules
├── package.json                   # Project dependencies and scripts
├── server.js                      # Main application entry point
└── README.md                      # Project documentation
```
- Node.js (v18 or higher)
- npm or yarn package manager
1. Clone the repository

   ```bash
   git clone https://github.com/yourusername/rag-qa-chatbot.git
   cd rag-qa-chatbot
   ```

2. Install dependencies

   ```bash
   npm install
   ```

3. Configure environment variables

   ```bash
   cp .env.example .env
   ```

   Edit the `.env` file with your API keys:

   ```bash
   # Required for enhanced features
   OPENAI_API_KEY=your_openai_api_key_here
   GOOGLE_API_KEY=your_google_api_key_here
   ANTHROPIC_API_KEY=your_anthropic_api_key_here
   COHERE_API_KEY=your_cohere_api_key_here
   HUGGINGFACE_API_KEY=your_huggingface_api_key_here

   # Server configuration
   PORT=3000
   NODE_ENV=development

   # Security
   RATE_LIMIT_WINDOW_MS=900000
   RATE_LIMIT_MAX_REQUESTS=100
   ```

4. Start the application

   ```bash
   npm start
   ```

5. Access the application: open your browser and navigate to http://localhost:3000
For a minimal installation:

```bash
npm run install-minimal
npm start
```

For development with automatic reloading:

```bash
npm install -g nodemon
npm run dev
```

- Via Web Interface: Drag and drop files onto the upload zone or click to browse
- Supported Formats: PDF, DOCX, TXT, CSV, HTML, JSON files
- Automatic Processing: Documents are automatically chunked and indexed
- Natural Language Queries: Ask questions in plain English
- Context-Aware Responses: Answers include source citations
- Real-time Streaming: Watch responses generate in real-time
- Follow-up Questions: Maintain conversation context
Access the settings panel to adjust:
- Retrieval Count: Number of relevant chunks to retrieve (1-20)
- Similarity Threshold: Minimum relevance score (0.0-1.0)
- Streaming Mode: Enable/disable real-time response streaming
- LLM Provider: Switch between different AI models
Document endpoints:

- `POST /api/documents/upload` - Upload and process documents
- `GET /api/documents/` - List all processed documents
- `DELETE /api/documents/:id` - Remove a document and its chunks

Chat endpoints (see the client example after this list):

- `POST /api/chat/query` - Send a query and receive a response
- `POST /api/chat/stream` - Streaming query endpoint
- `POST /api/chat/related` - Get suggested follow-up questions

Admin endpoints:

- `POST /api/admin/llm/provider` - Change the LLM provider
- `GET /api/admin/stats` - System statistics
- `POST /api/admin/clear` - Clear all documents and chat history
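As a quick illustration, the query endpoint can be called from any JavaScript client. This sketch reuses the request body shown in the testing section below; the exact response shape depends on the server implementation:

```javascript
// Minimal client for the query endpoint (Node 18+ ships a global fetch).
async function askQuestion(message) {
  const response = await fetch('http://localhost:3000/api/chat/query', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ message }),
  });
  if (!response.ok) throw new Error(`Request failed: ${response.status}`);
  return response.json();
}

askQuestion('What is this document about?').then(console.log).catch(console.error);
```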
Example chat request:

```json
{
  "type": "chat",
  "data": {
    "message": "What is the main topic?",
    "sessionId": "session-id",
    "options": {
      "topK": 5,
      "threshold": 0.7
    }
  }
}
```

Example chat response:

```json
{
  "type": "chat_response",
  "data": {
    "message": "Based on the documents...",
    "sources": [
      {
        "filename": "document.pdf",
        "preview": "relevant text excerpt",
        "similarity": 0.85
      }
    ]
  }
}
```
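For illustration, a browser-side client could exchange these messages as follows; the WebSocket URL is an assumption and should be adjusted to the actual server setup:

```javascript
// Sketch of a browser client using the message shapes shown above.
const socket = new WebSocket('ws://localhost:3000'); // URL is an assumption

socket.addEventListener('open', () => {
  socket.send(JSON.stringify({
    type: 'chat',
    data: {
      message: 'What is the main topic?',
      sessionId: 'session-id',
      options: { topK: 5, threshold: 0.7 },
    },
  }));
});

socket.addEventListener('message', (event) => {
  const { type, data } = JSON.parse(event.data);
  if (type === 'chat_response') {
    console.log(data.message, data.sources);
  }
});
```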
| Variable | Description | Default | Required |
|---|---|---|---|
| `PORT` | Server port | 3000 | No |
| `NODE_ENV` | Environment mode | development | No |
| `OPENAI_API_KEY` | OpenAI API key | - | Optional |
| `GOOGLE_API_KEY` | Google Gemini API key | - | Optional |
| `ANTHROPIC_API_KEY` | Anthropic Claude API key | - | Optional |
| `COHERE_API_KEY` | Cohere API key | - | Optional |
| `HUGGINGFACE_API_KEY` | HuggingFace API key | - | Optional |
- Maximum File Size: 50MB per document
- Chunk Size: 500-1000 characters with 100 character overlap
- Vector Dimensions: 384 (sentence-transformers compatible)
- Maximum Documents: No hard limit (memory dependent)
- Supported Languages: Multi-language support via LLM providers
The application follows a modular, service-oriented architecture; a sketch of the request flow appears after this list:
- Service Layer: Core business logic and data processing
- Route Layer: HTTP endpoint handling and validation
- WebSocket Layer: Real-time communication management
- Middleware Layer: Cross-cutting concerns (auth, logging, rate limiting)
- Frontend Layer: User interface and client-side logic
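To make the flow concrete, here is a hedged sketch of how the retrieval service might tie the layers together. The method names (`embed`, `search`, `generate`) and the shape of the returned chunks are assumptions for the sketch, not the actual `retrievalService.js` API:

```javascript
// Illustrative RAG pipeline wiring; service method names are assumed.
const embeddingService = require('./src/services/embeddingService');
const vectorStore = require('./src/services/vectorStore');
const llmService = require('./src/services/llmService');

async function answerQuery(query, { topK = 5, threshold = 0.7 } = {}) {
  // 1. Embed the user query into the same vector space as the document chunks.
  const queryVector = await embeddingService.embed(query);
  // 2. Retrieve the most similar chunks above the similarity threshold.
  const chunks = await vectorStore.search(queryVector, { topK, threshold });
  // 3. Build a context block and ask the LLM to answer from it.
  const context = chunks.map((c) => c.text).join('\n---\n');
  const answer = await llmService.generate(
    `Answer using only this context:\n${context}\n\nQuestion: ${query}`
  );
  return { answer, sources: chunks };
}
```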
- New Document Format: Extend the `documentProcessor.js` service (see the sketch after this list)
- New LLM Provider: Add the integration to `llmService.js`
- Custom Embedding: Modify `embeddingService.js`
- UI Enhancements: Update the files in the `public/` directory
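For example, supporting a new format could look roughly like this. How parsers are registered inside `documentProcessor.js` is an assumption of this sketch:

```javascript
// Hypothetical parser for a new format (Markdown). The registration
// mechanism shown in the comment is assumed, not the actual internals.
const fs = require('fs/promises');

async function parseMarkdown(filePath) {
  const raw = await fs.readFile(filePath, 'utf8');
  // Strip common Markdown syntax so only plain text is chunked.
  return raw.replace(/[#*_`>]/g, '');
}

// e.g. documentProcessor might map file extensions to parser functions:
// parsers['.md'] = parseMarkdown;
```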
```bash
# Run basic functionality test
npm test

# Test document upload
curl -X POST -F "document=@test.pdf" http://localhost:3000/api/documents/upload

# Test query endpoint
curl -X POST -H "Content-Type: application/json" \
  -d '{"message":"What is this document about?"}' \
  http://localhost:3000/api/chat/query
```

Run in production mode:

```bash
NODE_ENV=production npm start
```

A sample Dockerfile for container deployment:

```dockerfile
FROM node:18-alpine
WORKDIR /app
COPY package*.json ./
RUN npm ci --only=production
COPY . .
EXPOSE 3000
CMD ["npm", "start"]
```

The application can be deployed to any Node.js hosting platform:
- Heroku
- Vercel
- Railway
- Digital Ocean
- AWS Elastic Beanstalk
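For the Docker route shown above, the image can be built and run with `docker build -t rag-qa-chatbot .` and `docker run -p 3000:3000 --env-file .env rag-qa-chatbot` (the image name is arbitrary).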
- Port Already in Use (`Error: listen EADDRINUSE :::3000`): Change `PORT` in `.env` or kill the process using port 3000
- Large File Upload Fails (`Error: File too large`): Adjust the file size limits in `server.js`
- API Key Errors (`Error: Unauthorized - Invalid API key`): Verify the API keys in your `.env` file
- Memory Issues with Large Documents (`Error: JavaScript heap out of memory`): Increase the Node.js memory limit (e.g. `node --max-old-space-size=4096 server.js`) or process documents in smaller chunks
- Enable compression middleware for faster response times (see the sketch after this list)
- Implement document caching for frequently accessed files
- Use connection pooling for database operations
- Configure appropriate rate limiting based on usage patterns
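The first and last items above map directly onto standard Express middleware. A minimal sketch, assuming the `compression` and `express-rate-limit` packages and the defaults from `.env.example` (the project's own `rateLimiter.js` may differ):

```javascript
const express = require('express');
const compression = require('compression');
const rateLimit = require('express-rate-limit');

const app = express();
app.use(compression()); // gzip responses for faster transfers
app.use(rateLimit({
  windowMs: Number(process.env.RATE_LIMIT_WINDOW_MS) || 900000, // 15-minute window
  max: Number(process.env.RATE_LIMIT_MAX_REQUESTS) || 100,      // requests per IP per window
}));

app.listen(process.env.PORT || 3000);
```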
- Fork the repository
- Create a feature branch (`git checkout -b feature/new-feature`)
- Commit your changes (`git commit -am 'Add new feature'`)
- Push to the branch (`git push origin feature/new-feature`)
- Create a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- Built with modern Node.js and Express.js
- Vector similarity search implementation
- Multiple LLM provider integrations
- Professional UI design with TailwindCSS
For questions, issues, or feature requests, please open an issue on GitHub or contact the developer.
Aaryan Choudhary - Professional RAG Q&A Chatbot System