PyFacture is a Python project that automates expense management from receipts. It uses image processing and Optical Character Recognition (OCR) to extract information from a photo of a receipt, such as the purchased products, their prices, and the date of purchase.
- Image Processing: Enhances receipt images for better OCR accuracy.
- Optical Character Recognition (OCR): Extracts text from receipt images using Tesseract or the Llama 3.2-Vision model (run through Ollama).
- Data Extraction: Analyzes OCR text to identify products, prices, and dates.
- Excel File Management: Creates and updates Excel files to store extracted data.
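To make the data extraction step more concrete, here is a minimal sketch of how OCR text could be scanned for a date and for product/price lines. It is not PyFacture's actual parser: the `DD/MM/YYYY` date format, the "name followed by price" line layout, the `parse_receipt_text` function, and the sample text are all assumptions chosen for illustration.

```python
import re
from datetime import datetime

# Assumed formats: real receipts vary, so these patterns are illustrative only.
DATE_RE = re.compile(r"\b\d{2}/\d{2}/\d{4}\b")                    # e.g. 12/03/2024
ITEM_RE = re.compile(r"^(?P<name>.+?)\s+(?P<price>\d+[.,]\d{2})\s*$")


def parse_receipt_text(text):
    """Return (purchase_date, items) from raw OCR text.

    items is a list of (product_name, price) tuples.
    """
    purchase_date = None
    items = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        # Take the first valid date found as the purchase date.
        if purchase_date is None:
            date_match = DATE_RE.search(line)
            if date_match:
                try:
                    purchase_date = datetime.strptime(
                        date_match.group(0), "%d/%m/%Y"
                    ).date()
                    continue
                except ValueError:
                    pass  # looked like a date but was not a valid one
        item_match = ITEM_RE.match(line)
        if item_match:
            name = item_match.group("name").strip()
            if name.lower().startswith("total"):
                continue  # skip the total line, it is not a product
            # Accept both "1,29" and "1.29" as decimal notation.
            price = float(item_match.group("price").replace(",", "."))
            items.append((name, price))
    return purchase_date, items


# Hypothetical OCR output used only to exercise the function.
sample_text = """SUPERMARKET
12/03/2024
Milk 1L      1,29
Bread        0.95
TOTAL        2,24
"""

purchase_date, items = parse_receipt_text(sample_text)
print(purchase_date, items)
```

Real OCR output is noisier than this sample, so a production parser would likely need more tolerant patterns and some validation of the extracted values.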
git clone https://github.com/Y1D1R/PyFacture.git
cd PyFacture
Install the required Python packages using pip:
pip install -r requirements.txt
PyFacture relies on Tesseract OCR for text extraction.
Follow the instructions below based on your operating system.
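Once Tesseract is installed, one quick way to confirm that it is reachable is to look for the executable on your `PATH`. The snippet below is a small sketch using only the standard library. The helper name `find_executable` is hypothetical, and it assumes the binary is called `tesseract`, which is the usual name but may differ on some setups.

```python
import shutil


def find_executable(name="tesseract"):
    """Return the full path of `name` if it is on PATH, otherwise None."""
    return shutil.which(name)


path = find_executable()
print(path if path else "tesseract was not found on PATH")
```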
Once you have Ollama installed, install the Llama 3.2-Vision model (6 GB):
ollama run llama3.2-vision
More information here: https://sebastian-petrus.medium.com/build-a-local-ollama-ocr-application-using-llama-3-2-vision-bfc3014e3ad6
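For reference, a locally running Ollama server can also be queried from Python through its HTTP API. The sketch below shows one possible way to send a receipt image to the `llama3.2-vision` model. It is not necessarily how PyFacture calls the model: the prompt text and the function names `build_payload` and `ocr_with_ollama` are assumptions, and the URL assumes Ollama is listening on its default local port.

```python
import base64
import json
import urllib.request

# Default address of a local Ollama server (assumed unchanged).
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_payload(image_bytes,
                  prompt="Extract all text from this receipt.",
                  model="llama3.2-vision"):
    """Build the JSON body for a non-streaming generate request."""
    return {
        "model": model,
        "prompt": prompt,
        # Images are sent as base64-encoded strings.
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,
    }


def ocr_with_ollama(image_path):
    """Send one image to the local Ollama server and return the model's text."""
    with open(image_path, "rb") as f:
        payload = build_payload(f.read())
    request = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]
```

Calling `ocr_with_ollama("data/input/receipt.jpg")` (a hypothetical file name) requires the Ollama server to be running and the model to be downloaded already.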
Place your receipt images in the "data/input/" directory.
Ensure that the images are clear, well-lit, and free from distortions for optimal OCR results.
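To check which files will be picked up before running the application, you can list the images in the input folder. This is a small sketch, not part of PyFacture itself: the helper name `list_receipt_images` and the set of accepted extensions are assumptions, so adjust them to the formats you actually use.

```python
from pathlib import Path

# Assumed set of image extensions; PyFacture may accept others.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def list_receipt_images(input_dir="data/input"):
    """Return the image files found in input_dir, sorted by path."""
    folder = Path(input_dir)
    if not folder.is_dir():
        return []  # nothing to list if the folder does not exist
    return sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


for image_path in list_receipt_images():
    print(image_path)
```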
Execute the main script, then choose the method from the menu to process the receipts and extract data:
python pyfacture/main.py



