Convert LaTeX formula images from clipboard to LaTeX code in Logseq using various OCR providers like Hugging Face Transformers, Google Gemini, or a local Pix2Text server.
- Formula OCR: Convert images of LaTeX formulas into editable LaTeX code.
- Table OCR: Convert images of tables into Markdown tables.
- Multiple OCR Providers: Choose from several backends:
- Google Gemini: High-quality formula and table recognition.
- OpenAI Compatible: Connect to any OpenAI-compatible API (e.g., Local LLMs, Groq, OpenRouter).
- Pix2Text (Local): A private, offline-first OCR server.
- Hugging Face API: Cloud-based processing using the Nougat model.
- Docker (Self-hosted): Run the Nougat OCR model in a local Docker container.
/display-formula-ocr: Insert LaTeX code on a new line/inline-formula-ocr: Insert LaTeX code within a paragraph/table-ocr: Insert a Markdown table from an image. Currently works best with the Gemini provider.
Notes:
- The image in the clipboard must be a LaTex formula image
- Initial use may be slow due to model loading
- With the free Hugging Face plan you can make about 30k calls per month
- The Google Gemini API has a free tier with usage limits. Check the official pricing page for details.
-
Manual + Gemini (Recomended)
- Requirements: Google Gemini API Key
- Download the zip file from releases and unzip it.
- Enable developer mode:
Logseq > Settings > Advanced > Developer mode - Import Plugin:
Logseq > Plugins > Load unpacked pluginand point to the unzipped folder. - Go to plugin settings, select "Gemini" as the OCR Provider.
- Paste your Google Gemini API Key in the API Key setting field.
-
Manual + OpenAI Compatible
- Requirements: An OpenAI-compatible API (e.g., OpenAI, Groq, Local LLM)
- Download the zip file from releases and unzip it.
- Enable developer mode:
Logseq > Settings > Advanced > Developer mode - Import Plugin:
Logseq > Plugins > Load unpacked pluginand point to the unzipped folder. - Go to plugin settings, select "OpenAI Compatible" as the OCR Provider.
- Enter your API Key in the API Key field.
- Enter your API Endpoint in the API Endpoint field (e.g.,
https://api.openai.com/v1orhttp://localhost:11434/v1). - (Optional) Set the Model Name (default:
gpt-4o).
-
Manual + Pix2Text (Offline)
- Install Pix2Text Python package
- Start the server, eg.
p2t serve -l en -H 0.0.0.0 -p 8503 - Download the zip file from releases and unzip it.
- Enable developer mode:
Logseq > Settings > Advanced > Developer mode - Import Plugin:
Logseq > Plugins > Load unpacked pluginand point to the unzipped folder. - In the plugin settings, select "Local" as the OCR Provider and set the API Endpoint to the appropriate IP address and port (default is http://0.0.0.0:8503)
-
Manual + Hugging Face
- Requirements: Node.js, Yarn, Parcel, Hugging Face User Access Token
- Clone repo:
git clone https://github.com/olmobaldoni/logseq-formula-ocr-plugin.git - Install dependencies:
cd logseq-formula-ocr-plugin && yarn && yarn build - Enable developer mode:
Logseq > Settings > Advanced > Developer mode - Import Plugin:
Logseq > Plugins > Load unpacked pluginand point to the cloned repo
-
Marketplace + Hugging Face
- Requirements: Hugging Face User Access Token
- Search for
LaTeX Formula OCRin the Logseq marketplace and install directly
-
Marketplace + Docker
- Requirements: Docker
- Search for
LaTeX Formula OCRin the Logseq marketplace and install directly - Pull image:
docker pull olmobaldoni/nougat-ocr-api:latest - Run container:
docker run -d -p 80:80 olmobaldoni/nougat-ocr-api:latest
Note: For more information on how to use the other local API visit: https://github.com/olmobaldoni/LaTex-Formula-OCR-API
Hugging Face API may truncate responses (see Issuee #2 and Issue #487)
Note: Docker or Local(Pix2Text) method recommended for full functionality
This plugin is based on nougat-latex-base, a fine-tuning of facebook/nougat-base with im2latex-100k, and made by NormXU.
Pix2Text: Used for the local OCR server.
Google Gemini: Used as one of the OCR providers.
In addition, this plugin was also inspired by xxchan and its plugin logseq-ocr
MIT


