🎬 Video Intelligence App

An AI-powered video search system: upload a video, let the backend extract frames and generate AI descriptions and embeddings for them, then search the video content with natural language queries.

✨ Features

Video Upload: Drag-and-drop interface for video files (MP4, AVI, MOV, WMV, FLV, WebM, MKV)
Real-time Processing: Live progress updates during video analysis
AI-Powered Analysis:
- Frame extraction at regular intervals
- Detailed scene descriptions using GPT-4 Vision
- Semantic embeddings using Voyage AI multimodal models
Multiple Search Types:
- Hybrid Search: Combines semantic and text-based search for best results (default)
- Semantic Search: AI-powered similarity search using embeddings
- Text Search: Traditional keyword search through frame descriptions
Interactive Results: Thumbnail grid showing matching frames with normalized similarity scores
Collapsible Interface: Toggle sections (upload area, video playback, previously uploaded videos) for better space management
Video Player Integration: Click frames to jump to specific timestamps in the video
Video Management: Upload new videos or select from previously uploaded ones
MongoDB Storage: Scalable storage with vector search capabilities
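
The README does not say how Hybrid Search merges the semantic and text result lists. As a minimal sketch, assuming reciprocal rank fusion (a common technique, not necessarily what this app uses) and hypothetical frame IDs, the combination could look like this:

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> Dict[str, float]:
    """Fuse several ranked lists of frame IDs into one score per frame.

    Each list contributes 1 / (k + rank) for every frame it contains,
    so frames ranked highly by both searches end up with the largest totals.
    """
    fused: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, frame_id in enumerate(ranking, start=1):
            fused[frame_id] += 1.0 / (k + rank)
    return dict(fused)


# Hypothetical result lists, best match first.
semantic_hits = ["frame_12", "frame_03", "frame_27"]
text_hits = ["frame_03", "frame_40", "frame_12"]

scores = reciprocal_rank_fusion([semantic_hits, text_hits])
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

Frames that appear in both lists (frame_03 and frame_12 here) rise above frames found by only one search type, which is the behavior the hybrid mode is described as aiming for.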

🏗️ Architecture

├── backend/                 # Python FastAPI backend
│   ├── services/           # Core business logic
│   │   ├── ai_service.py          # OpenAI & Voyage AI integration
│   │   ├── mongodb_service.py     # Database operations
│   │   └── video_processor.py     # Video frame extraction
│   ├── models/             # Pydantic schemas
│   └── main.py            # FastAPI application
├── frontend/               # React frontend
│   ├── src/
│   │   ├── components/    # React components
│   │   ├── services/      # API client
│   │   └── styles/        # CSS styles
└── uploads/               # Temporary video storage

🚀 Quick Start

Prerequisites

Python 3.8+
Node.js 16+
MongoDB Atlas account
OpenAI API key
Voyage AI API key

1. MongoDB Atlas Setup

Create a MongoDB Atlas account
Create a new cluster
Create a database user with read/write permissions
Get your connection string
Create the following indexes in your database:

Vector Search Index

In the Atlas UI, create a Vector Search index with this definition:

{
  "fields": [
    {
      "numDimensions": 1024,
      "path": "embedding",
      "similarity": "cosine",
      "type": "vector"
    }
  ]
}

Name it vector_search_index
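
To show how this index might be queried, here is a minimal sketch that builds a $vectorSearch aggregation pipeline as plain Python data. The function name, the numCandidates heuristic, and the projected fields are assumptions for illustration; the index name, path, and 1024 dimensions come from the definition above.

```python
from typing import Any, Dict, List

EMBEDDING_DIMENSIONS = 1024  # matches numDimensions in the index definition


def build_vector_search_pipeline(
    query_vector: List[float], limit: int = 10
) -> List[Dict[str, Any]]:
    """Build an aggregation pipeline for the vector_search_index index."""
    if len(query_vector) != EMBEDDING_DIMENSIONS:
        raise ValueError(
            f"expected {EMBEDDING_DIMENSIONS} dimensions, got {len(query_vector)}"
        )
    return [
        {
            "$vectorSearch": {
                "index": "vector_search_index",
                "path": "embedding",
                "queryVector": query_vector,
                # Assumption: consider 10x more candidates than results returned.
                "numCandidates": limit * 10,
                "limit": limit,
            }
        },
        {
            "$project": {
                "_id": 0,
                "description": 1,
                "score": {"$meta": "vectorSearchScore"},
            }
        },
    ]


pipeline = build_vector_search_pipeline([0.0] * EMBEDDING_DIMENSIONS, limit=5)
```

The resulting list can be passed to a collection's aggregate() call in a MongoDB driver; the collection name is not stated in this README.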

Text Search Index

Create an Atlas Search (full-text) index with this definition:

{
  "mappings": {
    "dynamic": false,
    "fields": {
      "description": {
        "type": "string",
        "analyzer": "lucene.standard"
      },
      "metadata.scene_type": {
        "type": "string"
      },
      "metadata.objects": {
        "type": "string"
      }
    }
  }
}

Name it text_search_index
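
A keyword query against this index could be expressed with the $search stage. The sketch below only builds the pipeline; the function name and projected fields are assumptions, while the index name and searchable paths are taken from the mapping above.

```python
from typing import Any, Dict, List

# Paths mapped as "string" in the text_search_index definition.
SEARCHABLE_PATHS = ["description", "metadata.scene_type", "metadata.objects"]


def build_text_search_pipeline(query: str, limit: int = 10) -> List[Dict[str, Any]]:
    """Build an aggregation pipeline for the text_search_index index."""
    if not query.strip():
        raise ValueError("query must not be empty")
    return [
        {
            "$search": {
                "index": "text_search_index",
                "text": {"query": query, "path": SEARCHABLE_PATHS},
            }
        },
        {"$limit": limit},
        {
            "$project": {
                "_id": 0,
                "description": 1,
                "score": {"$meta": "searchScore"},
            }
        },
    ]


text_pipeline = build_text_search_pipeline("outdoor scenes", limit=5)
```

Because "dynamic" is false in the mapping, only the three listed fields are indexed, so queries against other fields would return nothing.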

2. Backend Setup

Navigate to the backend directory:

cd apps/video-intelligence/backend

Create a virtual environment:

python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

Install dependencies:

pip install -r requirements.txt

Create environment file:

cp .env.example .env

Edit .env with your credentials:

# MongoDB Atlas Configuration
MONGODB_URI=mongodb+srv://<username>:<password>@<cluster-url>/<database>?retryWrites=true&w=majority
DATABASE_NAME=video_intelligence

# AI Service API Keys
VOYAGE_AI_API_KEY=your_voyage_ai_api_key_here
OPENAI_API_KEY=your_openai_api_key_here

# Application Settings
UPLOAD_DIR=uploads
FRAMES_DIR=frames
MAX_FILE_SIZE_MB=500
FRAME_EXTRACTION_INTERVAL=2

# CORS Settings
FRONTEND_URL=http://localhost:3000
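
For reference, one way a backend could read these values is sketched below. The Settings class and load_settings function are hypothetical names, not the app's actual code, and the sketch assumes the variables are already present in the process environment.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    mongodb_uri: str
    database_name: str
    max_file_size_mb: int
    frame_extraction_interval: float
    frontend_url: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read settings from a mapping (defaults to os.environ) and fail fast on gaps."""
    env = os.environ if env is None else env
    required = ("MONGODB_URI", "VOYAGE_AI_API_KEY", "OPENAI_API_KEY")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing required settings: " + ", ".join(missing))
    return Settings(
        mongodb_uri=env["MONGODB_URI"],
        database_name=env.get("DATABASE_NAME", "video_intelligence"),
        max_file_size_mb=int(env.get("MAX_FILE_SIZE_MB", "500")),
        frame_extraction_interval=float(env.get("FRAME_EXTRACTION_INTERVAL", "2")),
        frontend_url=env.get("FRONTEND_URL", "http://localhost:3000"),
    )


# Placeholder values for illustration only.
example_env = {
    "MONGODB_URI": "mongodb+srv://user:pass@cluster.example/db",
    "VOYAGE_AI_API_KEY": "placeholder",
    "OPENAI_API_KEY": "placeholder",
    "FRAME_EXTRACTION_INTERVAL": "5",
}
settings = load_settings(example_env)
print(settings.frame_extraction_interval, settings.max_file_size_mb)
```

Failing at startup when a key is missing tends to be easier to diagnose than a failed API call halfway through processing a video.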

Start the backend server:

python main.py

The API will be available at http://localhost:8000

3. Frontend Setup

Navigate to the frontend directory:

cd apps/video-intelligence/frontend

Install dependencies:

npm install

Start the development server:

npm start

The app will open at http://localhost:3000

🎯 Usage

Upload Video:
- Drag and drop a video file or click to select
- Supported formats: MP4, AVI, MOV, WMV, FLV, WebM, MKV
- Maximum file size: 500MB
Monitor Processing:
- Watch real-time progress as the system extracts frames
- AI generates descriptions and embeddings for each frame
- Processing time depends on video length and complexity
Search Content:
- Choose your search type: Hybrid (recommended), Semantic, or Text
- Enter natural language queries in the search box
- Examples: "find frames with a person", "outdoor scenes", "blue objects"
- Results appear as a thumbnail grid with normalized similarity scores (0-100%)
Navigate Video:
- Click on any frame thumbnail to jump to that timestamp
- Video player automatically seeks to the selected frame
- Collapse the video playback section to save space
- View AI-generated descriptions for each frame
Manage Videos:
- Select from previously uploaded videos using the collapsible selector
- Delete videos you no longer need
- Upload multiple videos and switch between them seamlessly
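
The upload limits listed above (supported formats and the 500MB maximum) can be checked before a file is sent. This is a minimal sketch with a hypothetical validate_upload helper; the app's real validation may differ.

```python
from pathlib import Path

ALLOWED_EXTENSIONS = {".mp4", ".avi", ".mov", ".wmv", ".flv", ".webm", ".mkv"}
MAX_FILE_SIZE_BYTES = 500 * 1024 * 1024  # 500MB, matching MAX_FILE_SIZE_MB=500


def validate_upload(filename: str, size_bytes: int) -> None:
    """Raise ValueError if the file type or size falls outside the documented limits."""
    extension = Path(filename).suffix.lower()
    if extension not in ALLOWED_EXTENSIONS:
        raise ValueError(f"unsupported format: {extension or '(none)'}")
    if size_bytes > MAX_FILE_SIZE_BYTES:
        raise ValueError("file exceeds the 500MB limit")


validate_upload("holiday.MP4", 120 * 1024 * 1024)  # passes silently
```

Checking the extension alone does not prove the file is a valid video, so a server would normally still verify the content when it opens the file.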

🔧 API Endpoints

POST /upload - Upload video file
WS /ws/{video_id} - WebSocket endpoint for real-time processing updates
POST /search - Search frames with natural language (supports hybrid, semantic, and text search)
GET /video/{video_id}/metadata - Get video metadata
DELETE /video/{video_id} - Delete video and associated data
GET /frames/{video_id}/{frame_name} - Serve frame images
GET /videos - Get list of all uploaded videos
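
As an illustration of calling POST /search from Python, the sketch below builds the request with the standard library. The search_type values come from this README; the "query" field name and the build_search_request helper are assumptions, so check the backend's Pydantic schemas for the real request body.

```python
import json
import urllib.request

VALID_SEARCH_TYPES = {"hybrid", "semantic", "text"}


def build_search_request(
    query: str,
    search_type: str = "hybrid",
    base_url: str = "http://localhost:8000",
) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for the /search endpoint."""
    if search_type not in VALID_SEARCH_TYPES:
        raise ValueError(f"search_type must be one of {sorted(VALID_SEARCH_TYPES)}")
    # Assumption: the body uses "query" and "search_type" as field names.
    body = json.dumps({"query": query, "search_type": search_type}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/search",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_search_request("outdoor scenes")
# To send it once the backend is running:
# with urllib.request.urlopen(request) as response:
#     results = json.load(response)
```

Validating search_type on the client side avoids a round trip that would otherwise end in a 422 response.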

🧪 Development

Backend Development

# Install development dependencies
pip install -r requirements.txt

# Run with auto-reload
uvicorn main:app --reload --host 0.0.0.0 --port 8000

Frontend Development

# Start development server with hot reload
npm start

# Build for production
npm run build

📊 Performance Optimization

Frame Extraction: Configurable interval (default: 2 seconds)
Batch Processing: AI operations are processed in batches of 3-5 frames
Progressive Processing: Frames are processed and saved incrementally during upload
Vector Quantization: Multiple index types (scalar, binary, full-fidelity)
Thumbnail Generation: Optimized images for faster UI loading
WebSocket Progress: Real-time updates without polling
Score Normalization: Similarity scores are normalized to the 0-1 range and displayed as 0-100% for consistency
Smooth UI Transitions: Collapsible sections with cubic-bezier animations
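
To make the frame interval and batching settings concrete, here is a small sketch that computes which timestamps would be sampled and how they would be grouped. The helper names are hypothetical, and the real video_processor.py may select frames differently.

```python
import math
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def frame_timestamps(duration_s: float, interval_s: float = 2.0) -> List[float]:
    """Return sample times 0, interval, 2*interval, ... that fall before the video ends."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    count = math.ceil(duration_s / interval_s)
    return [index * interval_s for index in range(count)]


def batched(items: Sequence[T], batch_size: int) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of at most batch_size items."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


timestamps = frame_timestamps(duration_s=9.0, interval_s=2.0)
batches = list(batched(timestamps, batch_size=3))
print(timestamps)  # [0.0, 2.0, 4.0, 6.0, 8.0]
print(batches)     # [[0.0, 2.0, 4.0], [6.0, 8.0]]
```

Doubling the interval roughly halves the number of frames, and with it the number of description and embedding calls.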

🚀 Deployment

Backend (Docker)

FROM python:3.9-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
EXPOSE 8000
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]

Frontend (Netlify/Vercel)

npm run build
# Deploy the build/ directory

🔒 Security Considerations

File upload validation and size limits
API rate limiting
Environment variable protection
CORS configuration for production
MongoDB connection string security

🤝 Contributing

Fork the repository
Create a feature branch
Make your changes
Add tests if applicable
Submit a pull request

📝 License

This project is part of the MongoDB GenAI Showcase repository.

🆘 Troubleshooting

Common Issues

MongoDB Connection Failed
- Verify connection string format
- Check network access (add your IP address to the Atlas IP access list)
- Ensure database user has correct permissions
Video Processing Slow
- Increase the frame extraction interval (FRAME_EXTRACTION_INTERVAL) so fewer frames are extracted and analyzed
- Use smaller video files for testing
- Check API rate limits for OpenAI/Voyage
Search Returns No Results
- Verify both vector and text search indexes are created in Atlas
- Check if video processing completed successfully
- Try different search types (hybrid, semantic, text)
- Try more general search terms
WebSocket Connection Issues
- Ensure backend is running
- Check CORS settings
- Verify WebSocket URL format
Similarity Scores Over 100%
- This has been fixed with score normalization
- Scores now display in the 0-100% range
- If values above 100% still appear, check the hybrid search index configuration
422 Validation Errors During Search
- This has been resolved with improved request validation
- Ensure search_type is one of: "hybrid", "semantic", "text"
- Check request body format matches API expectations
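
For the score issue above, the exact normalization the app applies is not documented here. As one possible approach, assuming min-max scaling over a single result set, scores can be mapped into 0-1 and clamped before display:

```python
from typing import List


def normalize_scores(scores: List[float]) -> List[float]:
    """Min-max scale raw scores into the 0-1 range (all-equal scores map to 1.0)."""
    if not scores:
        return []
    low, high = min(scores), max(scores)
    if high == low:
        return [1.0 for _ in scores]
    return [(score - low) / (high - low) for score in scores]


def as_percent(score: float) -> str:
    """Clamp a score to 0-1 and format it as a whole-number percentage."""
    clamped = min(max(score, 0.0), 1.0)
    return f"{round(clamped * 100)}%"


raw = [2.0, 4.0, 3.0]  # hypothetical raw hybrid scores
print([as_percent(score) for score in normalize_scores(raw)])
```

The clamp in as_percent is a safety net: even if an unnormalized score reaches the display layer, it cannot render above 100%.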

Debug Mode

Enable detailed logging:

import logging
logging.basicConfig(level=logging.DEBUG)

📞 Support

For issues and questions:

Check existing GitHub issues
Create a new issue with a detailed description
Include logs and error messages
