TTS_ka 🚀 Ultra-Fast Text-to-Speech

TTS_ka is a command-line text-to-speech tool built for speed: it splits long text into chunks and generates them in parallel. Auto-optimization is on by default, so no extra flags are needed. It converts text to high-quality speech in Georgian (🇬🇪), Russian (🇷🇺), and English (🇬🇧).

✨ Simplified UX: Auto-optimization is now enabled by default. Just specify --lang and go!

✨ Features

🚀 Ultra-Fast Generation: 6-15 seconds for 1000 words (vs 25+ seconds traditional)
🔊 Streaming Playback: Audio starts playing while still generating (NEW!)
🧠 Smart Chunking: Automatic text splitting for optimal performance
⚡ Parallel Processing: Multi-threaded generation with up to 8 workers
📋 Clipboard Integration: Direct clipboard-to-speech workflow
🎯 Auto-Optimization: Turbo mode automatically optimizes all settings
🎵 High-Quality Voices: Premium neural voices for all languages
📁 File Support: Process text files directly
🔄 Real-time Playback: Automatic audio playback with system player

🎯 Quick Start

1. Installation

# Install from PyPI (recommended)
pip install TTS_ka

# Or install from source
git clone https://github.com/DavidTbilisi/TTS.git
cd TTS
pip install -e .

2. Basic Usage (Auto-Optimized by Default)

# Ultra-fast generation with auto-optimization (default behavior)
python -m TTS_ka "Hello, how are you today?" --lang en

# Georgian text with automatic optimization
python -m TTS_ka "გამარჯობა, როგორ ხართ?" --lang ka

# Russian text with smart chunking
python -m TTS_ka "Привет, как дела?" --lang ru

3. Clipboard Workflow (FASTEST)

# Copy any text, then run (fastest workflow):
python -m TTS_ka clipboard --lang en

# For different languages:
python -m TTS_ka clipboard --lang ka  # Georgian
python -m TTS_ka clipboard --lang ru  # Russian

4. File Processing

# Process text files directly (auto-optimized)
python -m TTS_ka document.txt --lang en

# Long files with custom settings
python -m TTS_ka large_file.txt --chunk-seconds 30 --parallel 6 --lang ru

📖 Complete Usage Guide

Command Syntax

python -m TTS_ka [TEXT_SOURCE] [OPTIONS]

Text Sources

Direct text: "Your text here"
Clipboard: clipboard (copy text first)
File path: file.txt, document.md, etc.
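
The three source types above can be told apart with a few checks. The sketch below is only an illustration of that idea: `classify_source` is a hypothetical helper, and the order of the checks is an assumption, not TTS_ka's actual detection logic.

```python
from pathlib import Path


def classify_source(arg: str) -> str:
    """Return 'clipboard', 'file', or 'text' for a TEXT_SOURCE argument.

    Hypothetical helper; the real CLI may resolve sources differently.
    """
    if arg == "clipboard":
        return "clipboard"
    try:
        if Path(arg).is_file():
            return "file"
    except (OSError, ValueError):
        # Long or unusual strings can be rejected by the OS as path names;
        # in that case the argument is treated as literal text.
        pass
    return "text"
```

Anything that is neither the word `clipboard` nor an existing file falls through to direct text, which matches how the examples in this README pass quoted strings.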

Essential Options

Option	Description	Examples
`--lang`	Language: `ka` (Georgian), `ru` (Russian), `en` (English)	`--lang ka`
`--stream`	🆕 Enable streaming playback (audio starts while generating)	`--stream`
`--chunk-seconds`	Chunk size in seconds (0=auto, 20-60 optimal)	`--chunk-seconds 30`
`--parallel`	Workers (0=auto, 2-8 recommended)	`--parallel 6`
`--no-play`	Skip automatic audio playback	`--no-play`
`--no-turbo`	Disable auto-optimization (legacy mode)	`--no-turbo`
`--help-full`	Show comprehensive help with examples	`--help-full`
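
When calling the tool from a script, the options in the table can be assembled into an argument list. This is a minimal sketch that uses only the flags listed above; `build_command` and its validation rules are assumptions made for illustration.

```python
import sys

SUPPORTED_LANGS = {"ka", "ru", "en"}


def build_command(source, lang, stream=False, chunk_seconds=0, parallel=0, play=True):
    """Build an argument list for `python -m TTS_ka` (hypothetical helper)."""
    if lang not in SUPPORTED_LANGS:
        raise ValueError(f"unsupported language: {lang!r}")
    if chunk_seconds < 0 or parallel < 0:
        raise ValueError("chunk_seconds and parallel must be 0 (auto) or positive")

    # sys.executable keeps the call on the same interpreter as this script
    cmd = [sys.executable, "-m", "TTS_ka", source, "--lang", lang]
    if stream:
        cmd.append("--stream")
    if chunk_seconds:  # 0 means "auto", so the flag is omitted
        cmd += ["--chunk-seconds", str(chunk_seconds)]
    if parallel:  # 0 means "auto", so the flag is omitted
        cmd += ["--parallel", str(parallel)]
    if not play:
        cmd.append("--no-play")
    return cmd


print(build_command("article.txt", "ka", stream=True, parallel=6)[1:])
```

The returned list can be passed straight to `subprocess.run`, which avoids shell quoting problems with Georgian or Russian text.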

🏃‍♂️ Performance Examples

Speed Comparison (1000 words)

Traditional TTS: 25-40 seconds
TTS_ka Direct: 15-25 seconds
TTS_ka Turbo: 8-15 seconds
TTS_ka Chunked: 6-12 seconds ⚡
TTS_ka Streaming: 🔊 2-3 seconds to first audio (NEW!)

🆕 Streaming Playback - Audio Starts Immediately!

The new streaming feature starts playing audio within 2-3 seconds while the rest continues generating in the background. Compared with waiting 10-30+ seconds for the full file, that is roughly an 85-90% reduction in perceived wait time!

Quick Usage:

# Basic streaming - audio starts almost instantly!
python -m TTS_ka "Your long text..." --lang en --stream

# From file with streaming
python -m TTS_ka article.txt --lang ka --stream

# Clipboard with streaming (fastest workflow)
python -m TTS_ka clipboard --stream

How It Works:

1. Text is split into chunks (if needed)
2. Chunks generate in parallel (2-8 workers)
3. First chunk plays immediately (~2-3 seconds)
4. Remaining chunks continue generating in background
5. Final merged audio file is saved
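
The steps above can be sketched with Python's standard thread pool. This is not TTS_ka's implementation: `fake_synthesize` stands in for the real TTS request, chunks are sized by word count rather than seconds, and `on_chunk_ready` is a hypothetical hook where a player would start.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Optional


def split_into_chunks(text: str, max_words: int = 75) -> List[str]:
    """Split text into chunks of at most max_words words (assumed chunk rule)."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def fake_synthesize(chunk: str) -> bytes:
    """Stand-in for a real TTS call; returns placeholder bytes, not audio."""
    return chunk.encode("utf-8")


def generate(text: str, workers: int = 4,
             on_chunk_ready: Optional[Callable[[bytes], None]] = None) -> bytes:
    chunks = split_into_chunks(text)
    parts = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() runs chunks concurrently but yields results in input order,
        # so the first chunk can be played while later ones are still running.
        for audio in pool.map(fake_synthesize, chunks):
            if on_chunk_ready is not None:
                on_chunk_ready(audio)
            parts.append(audio)
    # "Merging" here is plain concatenation of the placeholder bytes.
    return b" ".join(parts)
```

Keeping results in input order is the key design choice: playback can begin as soon as chunk one is done, without waiting for the slowest worker.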

Performance:

Without streaming: Wait 10-30+ seconds for all audio
With streaming: Hear audio in 2-3 seconds ⚡
Platform support: Windows, Linux, macOS
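
As a rough check on the numbers above, the perceived wait reduction is just one minus the ratio of time-to-first-audio to full generation time. The pairs below are illustrative values taken from the ranges in this section, not new measurements.

```python
def perceived_wait_reduction(first_audio_s: float, full_generation_s: float) -> float:
    """Percentage by which the wait before hearing audio shrinks."""
    return 100.0 * (1.0 - first_audio_s / full_generation_s)


# (seconds to first audio, seconds to generate everything)
for first, full in [(2.0, 20.0), (3.0, 20.0), (3.0, 30.0)]:
    pct = perceived_wait_reduction(first, full)
    print(f"{first:.0f}s vs {full:.0f}s: {pct:.0f}% shorter wait")
```

With a 20-second baseline, 2-3 seconds to first audio works out to the 85-90% range quoted earlier; shorter texts benefit less.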

Advanced Streaming:

# Custom chunking for optimal streaming
python -m TTS_ka longtext.txt --stream --chunk-seconds 25 --parallel 6

# Streaming without final playback
python -m TTS_ka text.txt --stream --no-play

Real-World Examples

# 1. Quick phrases (instant generation)
python -m TTS_ka "Thank you very much!" --lang en
# ⚡ Completed in 2.3s (optimized)

# 2. Medium text (paragraph)
python -m TTS_ka "Lorem ipsum dolor sit amet..." --lang en  
# ⚡ Completed in 5.7s (direct)

# 3. Long document (chunked processing)
python -m TTS_ka large_document.txt --lang en
# Strategy: chunked generation, 6 workers
# ⚡ Completed in 12.4s (chunked)

# 4. Clipboard workflow (daily usage)
python -m TTS_ka clipboard --lang ka
# OPTIMIZED MODE - Georgian
# Processing: 45 words, 287 characters
# ⚡ Completed in 4.1s

🌍 Language Support

Language	Code	Voice Quality	Speed	Example
Georgian 🇬🇪	`ka`	Premium Neural	Fast	`--lang ka`
Russian 🇷🇺	`ru`	High Quality	Very Fast	`--lang ru`
English 🇬🇧	`en`	Premium Neural	Maximum	`--lang en`

Voice Details

Georgian: ka-GE-EkaNeural - Premium female voice
Russian: ru-RU-SvetlanaNeural - High-quality female voice
English: en-GB-SoniaNeural - British English neural voice

⚙️ Advanced Usage

Custom Optimization

# Manual chunking for very long texts
python -m TTS_ka book_chapter.txt --chunk-seconds 45 --parallel 4 --lang en

# Maximum parallelization (for powerful systems)
python -m TTS_ka large_text.txt --parallel 8 --lang ru

# Batch processing (no audio playback)  
python -m TTS_ka document.txt --no-play --lang ka

# Legacy mode (disable auto-optimization)
python -m TTS_ka "text" --no-turbo --lang en

Workflow Integration

# Create alias for daily use
alias speak='python -m TTS_ka clipboard --lang en'

# Windows batch file (speak.bat)
@echo off
python -m TTS_ka clipboard --lang en

# Read web articles (with browser copy)
# 1. Copy article text
# 2. Run: python -m TTS_ka clipboard --lang en

🔧 Installation & Requirements

System Requirements

Python: 3.6+ (3.8+ recommended)
OS: Windows, macOS, Linux
Memory: 256MB+ available RAM
Network: Internet connection for voice synthesis

Dependencies

Required:

# Quote the requirement so the shell does not treat ">" as a redirect
pip install "edge-tts>=6.1.9"      # Core TTS engine
pip install "pydub>=0.25.1"        # Audio processing
pip install "tqdm>=4.65.0"         # Progress bars
pip install "pyperclip>=1.8.2"     # Clipboard support

System Requirements:

FFmpeg: Required for audio processing
- Windows: Download from ffmpeg.org
- macOS: brew install ffmpeg
- Ubuntu: sudo apt install ffmpeg

Complete Installation

# Method 1: PyPI installation (simplest)
pip install TTS_ka

# Method 2: Development installation
git clone https://github.com/DavidTbilisi/TTS.git
cd TTS
pip install -e .

# Method 3: Manual dependencies
pip install edge-tts pydub tqdm pyperclip

# Verify installation
python -m TTS_ka "Installation successful!" --lang en

🎮 AutoHotkey Integration (Windows)

Quick Setup

Install AutoHotkey v2
Create tts_hotkeys.ahk:

; Ultra-fast TTS hotkeys
!e::  ; Alt+E - English
{
    Run("cmd /k python -m TTS_ka clipboard --lang en")
}

!r::  ; Alt+R - Russian  
{
    Run("cmd /k python -m TTS_ka clipboard --lang ru")
}

!x::  ; Alt+X - Georgian
{
    Run("cmd /k python -m TTS_ka clipboard --lang ka")
}

Double-click to run, then:
- Copy text → Alt+E for English
- Copy text → Alt+R for Russian
- Copy text → Alt+X for Georgian

Daily Workflow

Browse web → Copy interesting text
Press Alt+E → Instant speech
Continue browsing while listening

🔍 Troubleshooting

Common Issues

1. "No module named 'edge_tts'"

pip install "edge-tts>=6.1.9"

2. "FFmpeg not found"

# Windows: Download and add to PATH
# macOS: brew install ffmpeg  
# Linux: sudo apt install ffmpeg

3. Slow generation

# Auto-optimization is enabled by default
python -m TTS_ka "text" --lang en

# Reduce parallel workers if network issues
python -m TTS_ka "text" --parallel 2 --lang en

# Use legacy mode only if needed
python -m TTS_ka "text" --no-turbo --lang en

4. Empty clipboard

# Ensure text is copied first
# Then run: python -m TTS_ka clipboard --lang en

Performance Optimization

For Maximum Speed:

# Explicit settings for long texts (auto-optimization picks values when these are omitted)
python -m TTS_ka clipboard --chunk-seconds 30 --parallel 6 --lang en

For System with Limited Resources:

# Use fewer workers and longer chunks (fewer concurrent requests)
python -m TTS_ka text --parallel 2 --chunk-seconds 60 --lang en

📊 Performance Benchmarks

Text Length vs Generation Time

Words	Direct Mode	Turbo Mode	Chunked (6 workers)
10-50	2-4s	1-3s	2-4s
100-300	8-12s	5-8s	4-6s
500-1000	18-25s	12-15s	8-12s
1000+	30-45s	18-25s	10-18s

Optimal Settings by Text Length

# Short text (< 100 words): Direct generation (auto-optimized)
python -m TTS_ka "short text" --lang en

# Medium text (100-500 words): Auto-optimized mode
python -m TTS_ka medium_text.txt --lang en  

# Long text (500+ words): Chunked processing (auto-detected)
python -m TTS_ka long_text.txt --chunk-seconds 30 --parallel 6 --lang en
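
The length thresholds above can be turned into a small lookup. This sketch only mirrors the guidance in this README; `suggest_settings` is a hypothetical helper, and treating exactly 500 words as "long" is an assumption.

```python
def suggest_settings(text: str):
    """Return (strategy, extra_flags) based on word count."""
    words = len(text.split())
    if words < 100:
        return "direct", []  # short text: defaults are enough
    if words < 500:
        return "auto", []  # medium text: rely on auto-optimization
    # long text: explicit chunking and workers, as in the example above
    return "chunked", ["--chunk-seconds", "30", "--parallel", "6"]


strategy, flags = suggest_settings("word " * 800)
print(strategy, flags)
```

Since auto-optimization is on by default, the explicit flags mostly matter when you want repeatable settings across runs.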

🚀 Examples & Use Cases

Daily Workflows

1. Article Reading

# Copy web article → instant speech
python -m TTS_ka clipboard --lang en

2. Document Processing

# Process research papers, books, etc.
python -m TTS_ka research_paper.pdf.txt --lang en

3. Language Learning

# Practice pronunciation with different languages
python -m TTS_ka "სწავლობდი ქართულს" --lang ka
python -m TTS_ka "Learning Russian язык" --lang ru

4. Accessibility

# Screen reader alternative
python -m TTS_ka clipboard --no-play --lang en > audio_file.mp3

Batch Processing

# Process multiple files
for file in *.txt; do
    python -m TTS_ka "$file" --no-play --lang en
done

# Windows batch processing (inside a .bat file, write %%f instead of %f)
for %f in (*.txt) do python -m TTS_ka "%f" --no-play --lang en

🛠️ Advanced Configuration

Environment Variables

# Set default language
export TTS_DEFAULT_LANG=ka

# Set default mode  
export TTS_DEFAULT_MODE=turbo

# Custom output directory
export TTS_OUTPUT_DIR=/path/to/audio/files
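
A wrapper script can read the same variables with fallbacks. The sketch below is an illustration only: `defaults_from_env` is a hypothetical helper, and the fallback values (`en`, `turbo`, current directory) are assumptions, not documented defaults.

```python
import os


def defaults_from_env(environ=None):
    """Collect TTS_* settings from the environment with assumed fallbacks."""
    env = os.environ if environ is None else environ
    return {
        "lang": env.get("TTS_DEFAULT_LANG", "en"),
        "mode": env.get("TTS_DEFAULT_MODE", "turbo"),
        "output_dir": env.get("TTS_OUTPUT_DIR", os.getcwd()),
    }


print(defaults_from_env({"TTS_DEFAULT_LANG": "ka"})["lang"])
```

Passing a plain dictionary instead of `os.environ` makes the helper easy to test without touching the real environment.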

Configuration File

Create ~/.tts_config.json:

{
    "default_lang": "en",
    "turbo_mode": true,
    "chunk_seconds": 30,
    "parallel_workers": 6,
    "auto_play": true
}
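
For scripts that want to reuse this file, one way to read it is to merge it over a set of defaults. This is a sketch under stated assumptions: `load_config` is hypothetical, the default values simply repeat the example above, and unknown keys are ignored by choice.

```python
import json
from pathlib import Path

# Defaults copied from the example config above
DEFAULTS = {
    "default_lang": "en",
    "turbo_mode": True,
    "chunk_seconds": 30,
    "parallel_workers": 6,
    "auto_play": True,
}


def load_config(path=None):
    """Return DEFAULTS overridden by values from a JSON config file, if present."""
    path = Path(path) if path is not None else Path.home() / ".tts_config.json"
    config = dict(DEFAULTS)
    if path.is_file():
        with path.open(encoding="utf-8") as fh:
            user = json.load(fh)
        # Keep only known keys so a typo cannot add an unexpected setting
        config.update({k: v for k, v in user.items() if k in DEFAULTS})
    return config
```

A missing file is not an error here; the function simply returns the defaults.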

🔌 API Integration

Python Script Integration

#!/usr/bin/env python3
import subprocess
import sys

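def text_to_speech(text, lang="en", turbo=True):
    """Convert text to speech by invoking the TTS_ka CLI."""
    # sys.executable runs TTS_ka with the same interpreter as this script
    cmd = [sys.executable, "-m", "TTS_ka", text, "--lang", lang]
    if not turbo:
        # Auto-optimization is on by default; --no-turbo switches it off
        cmd.append("--no-turbo")

    # check=True raises CalledProcessError if the CLI exits with an error
    subprocess.run(cmd, check=True)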

# Usage
text_to_speech("Hello world!", "en")
text_to_speech("გამარჯობა!", "ka")

Web Integration

# URL to speech (with curl + TTS_ka)
curl -s "https://example.com/article" | \
python -m TTS_ka /dev/stdin --lang en

📱 Mobile & Remote Usage

SSH/Remote Usage

# Generate audio on remote server
ssh user@server "python -m TTS_ka 'Remote generation' --no-play"

# Download and play locally
scp user@server:data.mp3 ./remote_audio.mp3

Docker Usage

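FROM python:3.9
# Install FFmpeg first and drop the apt cache to keep the image small
RUN apt-get update && apt-get install -y --no-install-recommends ffmpeg \
    && rm -rf /var/lib/apt/lists/*
RUN pip install --no-cache-dir TTS_ka
ENTRYPOINT ["python", "-m", "TTS_ka"]

# Docker usage: build the image, then run it
docker build -t tts_container .
docker run tts_container "Hello Docker!" --lang en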

🎯 Tips & Best Practices

Performance Tips

Auto-optimization is enabled by default - no flags needed!
Use clipboard workflow for fastest daily usage
Chunk long texts with --chunk-seconds 30
Optimize workers with --parallel 4-6 for most systems
Pre-install FFmpeg for best audio processing

Quality Tips

Georgian text: Use --lang ka for best quality
Mixed languages: Process separately for optimal results
Technical text: Use shorter chunks (--chunk-seconds 20)
Clean input: Remove extra whitespace and formatting
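
For the "clean input" tip, a small pre-processing step is usually enough. The sketch below collapses runs of spaces and tabs and squeezes long gaps between paragraphs; `clean_text` is a hypothetical helper, not part of TTS_ka.

```python
import re


def clean_text(raw: str) -> str:
    """Collapse extra whitespace while keeping paragraph breaks."""
    # Squeeze spaces and tabs inside each line and trim the line ends
    lines = [re.sub(r"[ \t]+", " ", line).strip() for line in raw.splitlines()]
    text = "\n".join(lines)
    # Reduce three or more newlines to a single blank line
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text.strip()


print(clean_text("  Hello   world \n\n\n\nNext\tline  "))
```

Paragraph breaks are kept on purpose, since they can act as natural split points for chunking.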

Workflow Tips

Create aliases for frequent commands
Use hotkeys (AutoHotkey on Windows)
Batch process large document collections
Test settings with small text first

📄 File Format Support

Supported Input Formats

Text files: .txt, .md, .rst
Code files: .py, .js, .html (extracts text)
Clipboard: Any copied text
Direct input: Command-line strings

Output Format

Audio: MP3 (high quality, compressed)
Bitrate: 128kbps (optimal size/quality balance)
Sample Rate: 24kHz (neural voice quality)
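
The bitrate above gives a quick way to estimate output size: bits per second times duration, divided by eight. The sketch assumes a constant 128 kbps stream and a speaking rate of 150 words per minute; the speaking rate is an illustrative assumption and varies by voice and language.

```python
def estimate_duration_s(word_count: int, words_per_minute: float = 150.0) -> float:
    """Rough speech duration in seconds (assumed speaking rate)."""
    return word_count / words_per_minute * 60.0


def estimate_mp3_size_mb(duration_s: float, bitrate_kbps: float = 128.0) -> float:
    """Approximate MP3 size in megabytes for a constant-bitrate file."""
    bits = bitrate_kbps * 1000.0 * duration_s
    return bits / 8.0 / 1_000_000.0


duration = estimate_duration_s(1000)
print(f"{duration:.0f} s of audio, about {estimate_mp3_size_mb(duration):.1f} MB")
```

Under these assumptions a 1000-word text is roughly 400 seconds of audio and a little over 6 MB on disk.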

🔄 Updates & Maintenance

Keeping Updated

# Update to latest version
pip install --upgrade TTS_ka

# Check current version  
python -m TTS_ka --version

# Update dependencies
pip install --upgrade edge-tts pydub tqdm pyperclip

Health Check

# Test installation
python -m TTS_ka "System check" --lang en

# Verify FFmpeg  
ffmpeg -version

# Check Python version
python --version  # Should be 3.6+
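
The same checks can be run from Python using only the standard library. This is a minimal sketch: `health_check` is a hypothetical helper, and it only verifies that the pieces are present, not that synthesis works.

```python
import importlib.util
import shutil
import sys


def health_check(min_python=(3, 6)):
    """Report whether the basic requirements from this README are present."""
    return {
        "python_ok": sys.version_info >= min_python,
        # shutil.which returns None when ffmpeg is not on PATH
        "ffmpeg_found": shutil.which("ffmpeg") is not None,
        # find_spec returns None when the package is not installed
        "edge_tts_installed": importlib.util.find_spec("edge_tts") is not None,
    }


for name, ok in health_check().items():
    print(f"{name}: {'OK' if ok else 'MISSING'}")
```

A `MISSING` line points to the matching entry in the Troubleshooting section.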

🤝 Contributing

We welcome contributions! See our GitHub repository for:

Bug reports and feature requests
Code contributions and pull requests
Documentation improvements
Language support additions

Development Setup

git clone https://github.com/DavidTbilisi/TTS.git
cd TTS
pip install -e ".[dev]"
pytest  # Run tests

📞 Support

Getting Help

Documentation: Use --help-full for comprehensive help
Issues: Report bugs on GitHub Issues
Discussions: Join GitHub Discussions

Quick Diagnostics

# Check system compatibility  
python -m TTS_ka --help-full

# Test with minimal command
python -m TTS_ka "test" --lang en

# Verify FFmpeg installation
ffmpeg -version

📜 License & Credits

License: MIT License - see LICENSE file

Credits:

Edge-TTS: Microsoft's edge-tts library for voice synthesis
PyDub: Audio processing and manipulation
FFmpeg: Audio encoding and format conversion

Author: David Chincharashvili (davidchincharashvili@gmail.com)

⭐ Star this project on GitHub if you find it useful!
🐛 Report issues to help improve the tool
🤝 Contribute to make it even better
