PDF2Tutorial

PDF2Tutorial is an automated video generation platform that turns PDF slides into educational tech tutorials. It combines AI script refinement, Text-to-Speech (TTS) narration, and programmatic video rendering.

Features

PDF to Presentation: Upload PDF slides and automatically extract them into a sequence of video scenes.
AI-Powered Scripting: Integrated with Google Gemini AI to transform fragmented slide notes into coherent, professional scripts.
High-Quality TTS: Supports local and cloud-based Text-to-Speech using Kokoro-js with customizable voices and quantization.
Rich Media Support: Insert MP4 videos and GIFs seamlessly between slides.
Programmatic Video Rendering: Built on Remotion, allowing for precise, frame-perfect video assembly and export.
Interactive Slide Editor: Drag-and-drop slide reordering, real-time audio generation, and script editing.
Background Music: Add and mix background tracks with custom volume controls.
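
The way slides, inserted media clips, and narration line up on a frame-based timeline can be sketched as follows. This is a minimal illustration rather than the project's actual code: the `Scene` type, the `buildTimeline` function, the file names, and the 30 fps default are all assumptions made for the example.

```typescript
// Hypothetical scene model: a slide with narration, or an inserted media clip.
type Scene =
  | { kind: "slide"; page: number; audioSeconds: number }
  | { kind: "media"; src: string; durationSeconds: number };

interface TimelineEntry {
  scene: Scene;
  from: number; // first frame of the scene
  durationInFrames: number; // scene length in frames
}

// Convert per-scene durations in seconds into consecutive frame ranges.
function buildTimeline(scenes: Scene[], fps = 30): TimelineEntry[] {
  let cursor = 0;
  return scenes.map((scene) => {
    const seconds =
      scene.kind === "slide" ? scene.audioSeconds : scene.durationSeconds;
    // Round up so narration is never cut off, and keep at least one frame.
    const durationInFrames = Math.max(1, Math.ceil(seconds * fps));
    const entry: TimelineEntry = { scene, from: cursor, durationInFrames };
    cursor += durationInFrames;
    return entry;
  });
}

const timeline = buildTimeline([
  { kind: "slide", page: 1, audioSeconds: 4.5 },
  { kind: "media", src: "demo.mp4", durationSeconds: 2 },
  { kind: "slide", page: 2, audioSeconds: 3 },
]);
const totalFrames = timeline.reduce((sum, e) => sum + e.durationInFrames, 0);
console.log(timeline.map((e) => [e.from, e.durationInFrames]), totalFrames);
```

Each entry's `from` and `durationInFrames` are the kind of values a frame-based renderer needs to place one scene after another, and the sum gives the total length of the video in frames.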

Roadmap & TODO

YouTube Metadata Generator: Automatically generate optimized titles and descriptions using Gemini.
Thumbnail Generator: Create custom YouTube thumbnails based on slide content.
Voiceover Recording: Support for recording custom voiceovers directly within the app using a microphone.
Header Layout Optimization: Refactor and organize the application header for better aesthetics and usability.

Tech Stack

Frontend: React 19, Vite, Tailwind CSS (v4)
Video Engine: Remotion (v4)
AI: Google Gemini API (gemini-2.0-flash-lite)
TTS: Kokoro (FastAPI / Web Worker)
Backend: Express.js (serving as a rendering orchestration layer)
Utilities: Lucide React (icons), dnd-kit (drag & drop), pdfjs-dist (PDF processing)

Getting Started

Prerequisites

Node.js (v20+)
npm or yarn
FFmpeg (required by Remotion for rendering)
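
A setup script could check the Node.js requirement before installing anything. The helper below is a hypothetical sketch, not part of the repository; `meetsNodeRequirement` is a name invented for this example.

```typescript
// Returns true when a Node.js version string such as "20.11.1" or "v22.3.0"
// satisfies a minimum major version (20 by default, per the prerequisites).
function meetsNodeRequirement(version: string, minMajor = 20): boolean {
  const match = /^v?(\d+)\./.exec(version.trim());
  if (!match) return false; // unrecognized version format
  return Number(match[1]) >= minMajor;
}

console.log(meetsNodeRequirement("v20.11.1"));
console.log(meetsNodeRequirement("18.19.0"));
```

In a real script the argument would typically be `process.version`.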

Installation

Clone the repository:

```
git clone https://github.com/techcow2/pdf2tutorial.git
cd pdf2tutorial
```

Install dependencies:
```
npm install
```
Start the development server (runs both Vite and the rendering server):
```
npm run dev
```

The application will be available at http://localhost:5173.

Configuration

Open the Settings Modal (Gear Icon) in the application to configure:

API Keys: Add your Google AI Studio API Key for script refinement.
TTS Settings: Choose between internal Web Worker TTS or a local Dockerized Kokoro FastAPI instance.
Audio Defaults: Set default voice models and quantization levels (q4/q8).
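
The options above could be modeled roughly as shown below. This is a sketch under assumptions: the `AppSettings` shape, its field names, the default values, and the `validateSettings` helper are hypothetical and may not match how the app actually stores its settings.

```typescript
// Hypothetical settings shape; field names are illustrative only.
type TtsBackend = "web-worker" | "kokoro-fastapi";
type Quantization = "q4" | "q8";

interface AppSettings {
  geminiApiKey: string;
  ttsBackend: TtsBackend;
  kokoroBaseUrl?: string; // only relevant for the FastAPI backend
  defaultVoice: string;
  quantization: Quantization;
}

const DEFAULT_SETTINGS: AppSettings = {
  geminiApiKey: "",
  ttsBackend: "web-worker",
  defaultVoice: "af_heart", // example voice id, assumed for illustration
  quantization: "q8",
};

// Collect human-readable problems instead of throwing, so a settings
// dialog could show all of them at once.
function validateSettings(s: AppSettings): string[] {
  const found: string[] = [];
  if (s.geminiApiKey.trim() === "") {
    found.push("Gemini API key is missing; script refinement is unavailable.");
  }
  if (s.ttsBackend === "kokoro-fastapi" && !s.kokoroBaseUrl) {
    found.push("Kokoro FastAPI backend selected but no base URL is set.");
  }
  return found;
}

const issues = validateSettings({
  ...DEFAULT_SETTINGS,
  ttsBackend: "kokoro-fastapi",
});
console.log(issues);
```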

Project Structure

src/video/: Remotion compositions and video components.
src/components/: React UI components (Slide Editor, Modals, Uploaders).
src/services/: Core logic for AI, TTS, PDF processing, and local storage.
server.ts: Express server handling the @remotion/renderer logic.

Rendering

Videos are rendered server-side using Remotion. When you click "Download Video", the sequence is bundled and rendered via an Express endpoint, then served back as an MP4 file.
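
The bundle, render, and serve flow can be sketched without committing to any specific Remotion call. In the example below the bundling and rendering steps are passed in as functions; `RenderDeps`, `RenderRequest`, `renderToMp4`, the entry point path, and the output path are all hypothetical names chosen for illustration, not the contents of `server.ts`.

```typescript
// Hypothetical dependencies: stand-ins for the bundling and rendering steps.
interface RenderDeps {
  bundle: (entryPoint: string) => Promise<string>; // resolves to a serve URL
  render: (serveUrl: string, props: unknown, outPath: string) => Promise<void>;
}

interface RenderRequest {
  slides: { image: string; audio: string; durationInFrames: number }[];
}

// Bundle once, render the sequence to an MP4 path, and return that path so
// an HTTP handler could send the file back to the browser.
async function renderToMp4(
  req: RenderRequest,
  deps: RenderDeps,
  outPath = "out/video.mp4",
): Promise<string> {
  if (req.slides.length === 0) {
    throw new Error("Nothing to render: the slide sequence is empty.");
  }
  const serveUrl = await deps.bundle("src/video/index.ts"); // assumed entry point
  await deps.render(serveUrl, req, outPath);
  return outPath;
}

// Usage with fake dependencies that only record what was called.
const calls: string[] = [];
const fakeDeps: RenderDeps = {
  bundle: async (entry) => {
    calls.push(`bundle:${entry}`);
    return "http://localhost:3000";
  },
  render: async (url, _props, out) => {
    calls.push(`render:${url}->${out}`);
  },
};

const rendered = renderToMp4(
  { slides: [{ image: "slide-1.png", audio: "slide-1.wav", durationInFrames: 90 }] },
  fakeDeps,
);
rendered.then((path) => console.log(path, calls));
```

Injecting the two steps keeps the orchestration logic testable on its own; in the real server they would be backed by Remotion's bundler and renderer.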

License

MIT
