Lingo2 - A Whisper.cpp-based Speech-To-Text for Voice input into Linux

Clay-Ferguson/lingo2

Lingo 2.0 🗣️

Local speech-to-text powered by whisper.cpp. No cloud APIs, no costs, complete privacy.

Lingo 2.0 contains both a Web App and a GTK app, and both use Whisper. The Web App is almost identical to the original Lingo, hosted under this same GitHub account (by Clay Ferguson), except that the original Lingo uses the browser-based Speech API for voice input rather than Whisper.

For browser-based speech input I recommend Lingo rather than Lingo 2.0: if you're already in a browser, there's no reason to use Whisper.

GTK App Screenshot

Web App Screenshot

Projects

This mono-repo contains two applications that provide different ways to use whisper.cpp for voice input:

| Project | Description |
|---------|-------------|
| web-app | Browser-based TTS/STT with a FastAPI backend. Access via http://localhost:8009 |
| gtk-app | Linux desktop app for system-wide voice typing. Speaks into any focused application |

Both projects share the same whisper.cpp engine located in whisper-model/.

🔧 First, Set Up Whisper AI

setup-whisper.sh (project root)

One-time setup that:

  1. Clones the whisper.cpp repository
  2. Builds the whisper-cli binary using cmake
  3. Downloads the base.en model (~150MB)
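
The three steps above can be sketched in Python as a list of commands. This is an illustration only, not the contents of setup-whisper.sh: the checkout location, the repository URL, and the helper names `setup_commands` and `run_setup` are assumptions, and the cmake invocations follow the usual whisper.cpp build instructions.

```python
import subprocess

# Assumed checkout location and upstream URL; the real script may differ.
REPO_DIR = "whisper-model/whisper.cpp"
REPO_URL = "https://github.com/ggml-org/whisper.cpp"


def setup_commands(model="base.en"):
    """Return (argv, working_directory) pairs mirroring the three setup steps."""
    return [
        # 1. Clone the whisper.cpp repository
        (["git", "clone", REPO_URL, REPO_DIR], "."),
        # 2. Build the whisper-cli binary using cmake
        (["cmake", "-B", "build"], REPO_DIR),
        (["cmake", "--build", "build", "--config", "Release"], REPO_DIR),
        # 3. Download the requested model
        (["sh", "./models/download-ggml-model.sh", model], REPO_DIR),
    ]


def run_setup(model="base.en"):
    """Run each command in order, stopping at the first failure."""
    for argv, cwd in setup_commands(model):
        subprocess.run(argv, cwd=cwd, check=True)
```

Calling `run_setup()` needs network access and the build tools listed under Requirements; `setup_commands()` alone just returns the plan, which is handy for a dry run.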

🧠 Upgrading the AI Model

This project uses the base.en model by default, which offers a good balance of speed and accuracy. If you need better accuracy (at the cost of speed) or faster performance (at the cost of accuracy), you can switch to a different model.

Available Models

| Model     | Size   | Speed   | Accuracy | Best For                                    |
|-----------|--------|---------|----------|---------------------------------------------|
| tiny.en   | ~75MB  | Fastest | Decent   | Quick testing, low-powered devices          |
| base.en   | ~150MB | Fast    | Good     | Default - good balance                      |
| small.en  | ~500MB | Medium  | Better   | Improved accuracy without too much slowdown |
| medium.en | ~1.5GB | Slower  | Great    | High accuracy needs                         |
| large     | ~3GB   | Slowest | Best     | Maximum accuracy, multilingual              |

Note: The .en suffix means English-only models, which are smaller and faster. The large model is multilingual (no .en variant).
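
As a rough aid for choosing, the trade-off in the table can be expressed as a small lookup. This is a sketch: the sizes are the approximate figures from the table, and `MODELS` and `best_model_within` are hypothetical names, not part of either app.

```python
# Approximate download sizes in MB, taken from the table above,
# ordered from least to most accurate.
MODELS = [
    ("tiny.en", 75),
    ("base.en", 150),
    ("small.en", 500),
    ("medium.en", 1500),
    ("large", 3000),
]


def best_model_within(budget_mb):
    """Return the most accurate model whose size fits the budget, or None."""
    choice = None
    for name, size_mb in MODELS:
        if size_mb <= budget_mb:
            choice = name  # later entries are more accurate, so keep overwriting
    return choice
```

For example, `best_model_within(600)` picks `small.en`, while a budget below 75 MB returns `None`. Speed is not modelled here; larger models are also slower to run.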

How to Switch Models

Switching models means editing the setup script and the Python model path in each app, then re-running setup:

1. setup-whisper.sh (line ~91)

Change the model name in the download command:

# Change from:
./models/download-ggml-model.sh base.en

# To (for example, small.en):
./models/download-ggml-model.sh small.en

2. Set Python Variable

Both 'gtk-app' and 'web-app' define the same variable, which tells each app which Whisper model to load.

# Change from:
WHISPER_MODEL = WHISPER_DIR / "whisper.cpp" / "models" / "ggml-base.en.bin"

# To (for example, small.en):
WHISPER_MODEL = WHISPER_DIR / "whisper.cpp" / "models" / "ggml-small.en.bin"
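
One way to make this a one-word change is to build the path from a model name. The following is a minimal sketch, not the apps' actual code: `MODEL_NAME` and `model_path` are hypothetical, and the value of `WHISPER_DIR` is assumed from the whisper-model/ directory mentioned earlier.

```python
from pathlib import Path

# Assumed location; the real apps define WHISPER_DIR themselves.
WHISPER_DIR = Path("whisper-model")

# Hypothetical single setting to edit when switching models.
MODEL_NAME = "base.en"


def model_path(name, whisper_dir=WHISPER_DIR):
    """Return the ggml model file path for a model name such as 'small.en'."""
    return whisper_dir / "whisper.cpp" / "models" / f"ggml-{name}.bin"


# Same value the original assignment produces for the default model.
WHISPER_MODEL = model_path(MODEL_NAME)
```

With this shape, switching to small.en means changing only `MODEL_NAME`, and the file name always matches the `ggml-<model>.bin` pattern used above.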

3. Re-run whisper setup and restart

# Download the new model
./setup-whisper.sh

Then restart whichever app you use.

Tip: You can keep multiple models downloaded. Just change the WHISPER_MODEL path (for example in whisper_server.py) to switch between them without re-downloading.
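
To see which models are already on disk before switching, a short helper can scan the models directory. This is a sketch under the assumption that downloaded models are stored as `ggml-<model>.bin`, as in the paths above; `downloaded_models` is a hypothetical helper, not part of the project.

```python
from pathlib import Path


def downloaded_models(models_dir):
    """Return model names (e.g. 'base.en') for every ggml-*.bin file found."""
    names = []
    for path in sorted(Path(models_dir).glob("ggml-*.bin")):
        # Strip the 'ggml-' prefix and the '.bin' suffix to recover the name.
        names.append(path.name[len("ggml-"):-len(".bin")])
    return names
```

For example, `downloaded_models("whisper-model/whisper.cpp/models")` would list the names you can point WHISPER_MODEL at without running the download again, assuming that directory layout.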

Quick Start

  1. Build whisper.cpp and download the model:

    ./setup-whisper.sh

  2. Run whichever app you prefer:

    # Web app (browser-based)
    cd web-app && ./run.sh
    
    # GTK app (Linux desktop)
    cd gtk-app && ./run.sh

Requirements

  • Linux (Ubuntu/Debian tested) or macOS
  • Python 3 with venv support
  • ffmpeg
  • Build tools (cmake, git, build-essential)

See each project's README for additional dependencies.
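
Before running setup, the command-line requirements can be checked with a few lines of Python. This is a sketch only: `REQUIRED_TOOLS`, `missing_tools`, and `has_venv` are hypothetical names, and the check covers only the tools listed above, not each project's additional dependencies.

```python
import importlib.util
import shutil

# Command-line tools named in the Requirements list.
REQUIRED_TOOLS = ["ffmpeg", "cmake", "git"]


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


def has_venv():
    """Report whether this Python installation ships the venv module."""
    return importlib.util.find_spec("venv") is not None
```

An empty list from `missing_tools()` and `True` from `has_venv()` suggest the basics are in place. The `which` parameter exists only so the lookup can be replaced in tests.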
