TrackitOcr - An Invoice Parser (Elixir + Mistral Document AI)

This project demonstrates how to use Mistral’s Document AI QnA API to extract structured data from uploaded documents, specifically invoices, using Elixir.

It serves as a learning and experimentation project to try out:

  • How to interact with Mistral’s Document AI API.
  • How to use structured output in chat completions.

🧰 Tech Stack

  • Elixir (~> 1.18)
  • Mistral Document AI API (/files and chat/completions endpoints)

⚙️ How It Works

  1. File Upload

    The document (e.g. an invoice PDF) is first uploaded to Mistral’s AI blob store via the /files endpoint.

    This returns a document_url that is referenced in the next step.

  2. Chat Completion Request

    The document_url is then passed to the chat/completions API along with a short instruction message and a JSON schema defining the structured response format.

    Example schema fields:

    • amount_to_pay_cents
    • invoice_date
    • invoice_number
    • currency
    • reason_for_payment
    • issuer

  3. Structured Output

    The API returns a JSON object matching the schema.
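
The request in step 2 and the decoding in step 3 could look roughly like the sketch below. It is an illustration under stated assumptions, not the project's actual code: the module name InvoiceRequest, the model name, the schema name "invoice", the instruction text, the JSON types chosen for each field, and the placeholder URL are all hypothetical. It builds the request body only (no HTTP call) and relies on the JSON module that ships with Elixir 1.18.

```elixir
defmodule InvoiceRequest do
  @moduledoc "Hypothetical helper: builds a chat/completions body and decodes the reply."

  # Field names come from the README; the JSON types are assumptions.
  @fields %{
    "amount_to_pay_cents" => "integer",
    "invoice_date" => "string",
    "invoice_number" => "string",
    "currency" => "string",
    "reason_for_payment" => "string",
    "issuer" => "string"
  }

  # JSON schema describing the structured response.
  def schema do
    %{
      "type" => "object",
      "properties" => Map.new(@fields, fn {name, type} -> {name, %{"type" => type}} end),
      "required" => @fields |> Map.keys() |> Enum.sort(),
      "additionalProperties" => false
    }
  end

  # Request body: a short instruction, the uploaded document, and the schema.
  # The default model name is an assumption.
  def body(document_url, model \\ "mistral-small-latest") do
    %{
      "model" => model,
      "messages" => [
        %{
          "role" => "user",
          "content" => [
            %{"type" => "text", "text" => "Extract the invoice details."},
            %{"type" => "document_url", "document_url" => document_url}
          ]
        }
      ],
      "response_format" => %{
        "type" => "json_schema",
        "json_schema" => %{"name" => "invoice", "strict" => true, "schema" => schema()}
      }
    }
  end

  # The structured output arrives as a JSON string in the first choice's message.
  def parse_response(%{"choices" => [%{"message" => %{"content" => content}} | _]})
      when is_binary(content) do
    JSON.decode(content)
  end

  def parse_response(_other), do: {:error, :unexpected_response}
end

# Placeholder URL; in practice this is the value obtained from the upload step.
InvoiceRequest.body("https://example.com/signed-url-from-upload")
|> JSON.encode!()
|> IO.puts()
```

Keeping the body as a plain map makes it easy to inspect in iex before sending it with whichever HTTP client the project uses.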


🚀 Running Locally

Prerequisites

  • Elixir ~> 1.18
  • mix toolchain
  • A valid Mistral API key

Setup

git clone git@github.com:acramatte/trackit-ocr.git
cd trackit-ocr
mix deps.get

Environment variables

Export your API key:

export MISTRAL_API_KEY="your_api_key_here"
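
Inside Elixir, the exported key could be read and turned into a bearer-token header along these lines. This is a minimal sketch: the module name MistralAuth and its return shape are hypothetical and not taken from the repository.

```elixir
defmodule MistralAuth do
  @moduledoc "Hypothetical helper: builds the Authorization header from the environment."

  # `env` defaults to the process environment; passing a map makes this easy to test.
  def headers(env \\ System.get_env()) do
    case Map.fetch(env, "MISTRAL_API_KEY") do
      {:ok, key} when key != "" -> {:ok, [{"authorization", "Bearer " <> key}]}
      _ -> {:error, :missing_api_key}
    end
  end
end

# Report whether the key is available without printing the secret itself.
case MistralAuth.headers() do
  {:ok, _headers} -> IO.puts("API key found")
  {:error, :missing_api_key} -> IO.puts("MISTRAL_API_KEY is not set")
end
```

Returning a tagged tuple instead of raising lets the caller decide how to report a missing key.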

Run

You can run the module or script directly via iex:

iex -S mix

Then call the function, passing it the path of the file to upload:

Trackit.processPDF("../2025-11-04_My_Invoice_080000463123.pdf")
