A production-ready Text-to-Speech (TTS) SaaS platform featuring multi-voice generation, subscription management, and historical playback.
Vocalize is a full-stack SaaS application that transforms text into lifelike speech. It leverages the Kokoro TTS model for high-quality audio generation and provides a complete user management system with tiered subscriptions.
- High-Fidelity Audio: Powered by the Kokoro model served via a dedicated FastAPI microservice.
- Voice Selection: Choose from multiple voice profiles and accents.
- SaaS Architecture: Complete subscription flow with Stripe (Checkout, Webhooks, Portal).
- Usage Tracking: A credit-based quota system that limits usage according to subscription tier.
- Playback Control: Adjustable playback speed and immediate audio preview.
- History & Storage: Save generated audio files and revisit generation history.
- Secure Auth: OAuth integration with Google and GitHub.
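As a rough illustration of the usage-tracking feature above, the sketch below checks a generation request against a per-tier credit allowance. The tier names, the allowance numbers, the one-credit-per-character cost, and the `checkQuota` helper are all assumptions made for this example; the README does not specify how Vocalize actually meters usage.

```typescript
// Hypothetical tiers and monthly allowances; the real values are not stated in this README.
type Tier = "free" | "pro";

const MONTHLY_CREDITS: Record<Tier, number> = {
  free: 5000,
  pro: 100000,
};

interface QuotaResult {
  allowed: boolean;  // whether the request fits in the remaining allowance
  cost: number;      // credits this request would consume
  remaining: number; // credits left afterwards (unchanged if the request is rejected)
}

// Assumption: one credit per character (UTF-16 code unit) of input text.
function checkQuota(tier: Tier, usedThisPeriod: number, text: string): QuotaResult {
  const cost = text.length;
  const available = Math.max(0, MONTHLY_CREDITS[tier] - usedThisPeriod);
  // Empty input is rejected; otherwise the request must fit in what is left.
  const allowed = cost > 0 && cost <= available;
  return { allowed, cost, remaining: allowed ? available - cost : available };
}
```

In a setup like this, the check would likely run server-side before the text is forwarded for synthesis, with the used-credit counter updated only after audio is generated successfully.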
The project is divided into two main components:
- Core App (Nuxt.js): Handles the frontend UI, authentication (OAuth), database interactions (Prisma), and billing logic (Stripe).
- Inference Engine (FastAPI): A lightweight Python service that runs the Kokoro model and returns the generated audio to the Core App over HTTP.
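The split between the two components can be sketched from the Core App side as a small HTTP client for the inference service. Everything specific here is assumed for illustration: the `/synthesize` path, the JSON field names (`text`, `voice`, `speed`), and the helper names are hypothetical, not the project's documented API.

```typescript
// Hypothetical request shape; the real endpoint and field names may differ.
interface SynthesisOptions {
  text: string;
  voice: string;  // a voice profile identifier
  speed?: number; // playback speed multiplier, defaults to 1.0
}

interface PreparedRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Builds the HTTP request without sending it, which keeps this part easy to test.
function buildSynthesisRequest(baseUrl: string, opts: SynthesisOptions): PreparedRequest {
  // Strip trailing slashes so the path joins cleanly.
  const url = baseUrl.replace(/\/+$/, "") + "/synthesize";
  return {
    url,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ text: opts.text, voice: opts.voice, speed: opts.speed ?? 1.0 }),
    },
  };
}

// Sends the request and returns the raw audio bytes from the inference service.
async function synthesize(baseUrl: string, opts: SynthesisOptions): Promise<ArrayBuffer> {
  const { url, init } = buildSynthesisRequest(baseUrl, opts);
  const res = await fetch(url, init);
  if (!res.ok) {
    throw new Error(`Inference service returned HTTP ${res.status}`);
  }
  return res.arrayBuffer();
}
```

Keeping the model behind its own service means the Core App only needs the service's base URL, so the Python process can be scaled or restarted independently of the Nuxt application.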
- Frontend & Backend: Nuxt.js (Vue 3)
- Database: SQLite (via Prisma ORM)
- AI Model Serving: FastAPI (Python)
- TTS Model: Kokoro
- Payments: Stripe
- Styling: Tailwind CSS






