An intelligent movie recommendation system that analyzes movie trailers using OpenAI's CLIP model to find visually and thematically similar films. Features a modern React UI with fuzzy search and explainable AI recommendations.
- CLIP-based Trailer Analysis: Uses OpenAI's CLIP model to vectorize movie trailers
- Similarity Matching: Cosine similarity for finding related movies
- REST API: Flask-based API for frontend integration
- Efficient Processing: Frame sampling for fast trailer analysis
- Fuzzy Search: Advanced search using Fuse.js
- Trailer Display: Integrated video player with playback controls
- Explainable AI: Visual tags showing why each movie was recommended:
  - Genre similarity
  - Visual/cinematography matching
  - Release year proximity
  - Thematic similarity
- Similarity Scores: Percentage-based match scores
- Modern UI: Responsive design with dark theme
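The match scores and explanation tags listed above could be derived roughly as follows. This is a minimal sketch, not the project's actual logic: the function names, the 0.8 visual threshold, and the 5-year window are assumptions chosen for illustration, and only the `genres` and `year` metadata fields come from this README. The thematic tag is left out because the README does not say how it is computed.

```python
from typing import Dict, List


def match_percentage(cosine: float) -> int:
    """Clamp a cosine similarity to [0, 1] and express it as a whole percent."""
    return round(max(0.0, min(1.0, cosine)) * 100)


def explanation_tags(
    source: Dict,
    candidate: Dict,
    cosine: float,
    visual_threshold: float = 0.8,  # assumed cutoff, not taken from the project
    max_year_gap: int = 5,          # assumed window, not taken from the project
) -> List[str]:
    """Build human-readable tags explaining why `candidate` was recommended."""
    tags: List[str] = []

    # Genre similarity: any genre shared by both movies.
    shared = sorted(set(source["genres"]) & set(candidate["genres"]))
    if shared:
        tags.append("Genre: " + ", ".join(shared))

    # Visual/cinematography matching: high similarity between trailer vectors.
    if cosine >= visual_threshold:
        tags.append("Visual style")

    # Release year proximity.
    if abs(source["year"] - candidate["year"]) <= max_year_gap:
        tags.append("Release year")

    return tags
```

For example, a 2023 Action/Thriller compared with a 2020 Thriller/Drama at cosine 0.85 would receive all three tags under these assumed thresholds.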
IMDB2.0/
├── frontend/ # React application
│ ├── src/
│ │ ├── components/ # React components
│ │ └── styles/ # CSS stylesheets
│ ├── package.json
│ └── README.md
├── backend_api.py # Flask API server
├── initial.ipynb # CLIP model exploration
├── requirements.txt # Python dependencies
└── README.md # This file
- Python 3.8+
- Node.js 16+
- npm or yarn
- Create a virtual environment:
  python -m venv venv
  source venv/bin/activate  # On Windows: venv\Scripts\activate
- Install Python dependencies:
  pip install -r requirements.txt
- Create directories for trailers:
  mkdir -p trailers thumbnails
- Add your movie trailers to the trailers/ directory
- Navigate to the frontend directory:
  cd frontend
- Install dependencies:
  npm install
python backend_api.py
The API will be available at http://localhost:5000
In a new terminal:
cd frontend
npm run dev
The UI will be available at http://localhost:3000
Returns all available movies
Returns recommendations for a specific movie with similarity scores and explanation tags
Serves trailer video files
Serves thumbnail images
Health check endpoint
- Trailer Vectorization: Each movie trailer is decoded into frames, which are encoded with the CLIP model
- Feature Extraction: Frames are sampled (one per second) and converted to embedding vectors
- Vector Pooling: Frame embeddings are averaged to create a single vector per movie
- Similarity Calculation: Cosine similarity is computed between movie vectors
- Tag Generation: Recommendations are explained through multiple factors:
  - Genre overlap
  - Visual similarity from CLIP
  - Temporal proximity
  - Thematic matching
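The sampling, pooling, and similarity steps above can be sketched in plain Python. This is an illustration under assumptions rather than the code in backend_api.py: the function names are hypothetical, the embeddings are plain lists of floats standing in for CLIP outputs, and a real implementation would more likely use PyTorch or scikit-learn for the vector math.

```python
import math
from typing import Dict, List, Sequence, Tuple


def sample_frame_indices(total_frames: int, fps: float, every_seconds: float = 1.0) -> List[int]:
    """Indices of the frames to keep when taking one frame every `every_seconds`."""
    step = max(1, round(fps * every_seconds))
    return list(range(0, total_frames, step))


def mean_pool(frame_embeddings: Sequence[Sequence[float]]) -> List[float]:
    """Average per-frame embeddings into a single vector for the movie."""
    count = len(frame_embeddings)
    if count == 0:
        raise ValueError("need at least one frame embedding")
    dim = len(frame_embeddings[0])
    return [sum(vec[i] for vec in frame_embeddings) / count for i in range(dim)]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_similar(
    query_id: str, movie_vectors: Dict[str, Sequence[float]], top_k: int = 5
) -> List[Tuple[str, float]]:
    """Return the `top_k` other movies most similar to `query_id`, best first."""
    query = movie_vectors[query_id]
    scored = [
        (movie_id, cosine_similarity(query, vec))
        for movie_id, vec in movie_vectors.items()
        if movie_id != query_id
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]
```

Mean pooling keeps one fixed-size vector per movie regardless of trailer length, which is what makes a direct cosine comparison between movies possible.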
- Add the trailer video file (.mp4) to the trailers/ directory.
- Update movie_trailers.csv with the Title and YouTube Link.
- Run the Update Script (or the pipeline notebook). The system will:
  - Detect the new file.
  - Send it to Gemini for narrative extraction.
  - Process visuals and audio locally.
  - Add it to the ChromaDB trailer_db folder without re-processing existing movies.
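The "detect the new file" step without re-processing existing movies amounts to a set difference between the files on disk and the IDs already indexed. A minimal sketch, assuming each trailer is keyed by its file name without the extension; the README does not state how entries are actually keyed in ChromaDB, and `find_new_trailers` is a hypothetical helper.

```python
from pathlib import Path
from typing import Iterable, List


def find_new_trailers(trailer_dir: str, indexed_ids: Iterable[str]) -> List[Path]:
    """Return .mp4 files in `trailer_dir` whose stem is not already indexed."""
    known = set(indexed_ids)
    return sorted(
        path for path in Path(trailer_dir).glob("*.mp4") if path.stem not in known
    )
```

The `indexed_ids` argument would come from whatever store holds the existing vectors; only the paths returned here would need to go through the Gemini and local processing steps.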
{
"id": 4,
"title": "Movie Title",
"year": 2023,
"trailer": "movie_title.mp4",
"genres": ["Action", "Thriller"],
"description": "Movie description here"
}
- Restart the backend server to vectorize the new trailer
- Flask: Web framework
- ChromaDB: Vector Database
- Google Gemini 2.5 Flash: Video analysis via the google-generativeai package
- PyTorch: Deep learning framework
- Transformers (Hugging Face): CLIP model
- OpenCV: Video processing
- scikit-learn: Similarity calculations
- Flask-CORS: Cross-origin resource sharing
- librosa: DSP Analysis
- moviepy: Decoding
- React 18: UI framework
- Vite: Build tool
- Fuse.js: Fuzzy search
- Lucide React: Icons
- CSS3: Styling
- Trailers are vectorized once at startup and cached in memory
- Frame sampling reduces processing time (1 frame per second)
- Frontend uses efficient fuzzy search with configurable thresholds
- Video playback is optimized with native HTML5 video
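The "vectorized once at startup and cached in memory" behavior can be pictured as a small cache wrapped around the expensive vectorization call. This is a sketch under assumptions: `VectorCache` and its `vectorize` callable are hypothetical names, not taken from backend_api.py.

```python
from typing import Callable, Dict, Iterable, List


class VectorCache:
    """Compute each trailer's vector at most once and keep it in memory."""

    def __init__(self, vectorize: Callable[[str], List[float]]) -> None:
        self._vectorize = vectorize  # expensive function, e.g. the CLIP pipeline
        self._vectors: Dict[str, List[float]] = {}

    def warm(self, trailer_paths: Iterable[str]) -> None:
        """Vectorize every trailer up front, e.g. at server startup."""
        for path in trailer_paths:
            self.get(path)

    def get(self, trailer_path: str) -> List[float]:
        """Return the cached vector, computing it on first access only."""
        if trailer_path not in self._vectors:
            self._vectors[trailer_path] = self._vectorize(trailer_path)
        return self._vectors[trailer_path]
```

Because the cache lives in process memory, it is rebuilt on every restart, which is consistent with restarting the backend to pick up a new trailer.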
- Database integration (PostgreSQL/MongoDB)
- User accounts and personalized recommendations
- Movie ratings and reviews
- Advanced filtering (year, genre, rating)
- Recommendation history
- Social features (share recommendations)
- Audio analysis for soundtrack similarity
- NLP-based plot analysis for better theme matching
- Thumbnail generation from trailers
MIT
Contributions are welcome! Please feel free to submit a Pull Request.