An automated tool for converting documents (PDF, Word, Excel, etc.) to Markdown.
- Batch file processing
- GPU acceleration (auto-detects CUDA)
- Detailed processing reports
- Outputs saved to `./documents`
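The CUDA auto-detection mentioned above could look roughly like the sketch below. It assumes PyTorch is the library that exposes the GPU, which this README does not confirm; `pick_device` is a hypothetical helper name, not a function from this project.

```python
def pick_device() -> str:
    """Return "cuda" when a usable NVIDIA GPU is reported, otherwise "cpu"."""
    try:
        # Assumption: PyTorch is installed; fall back to CPU if it is not.
        import torch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


print(f"Using device: {pick_device()}")
```

Falling back to `"cpu"` when the import fails keeps the check safe on machines that meet only the minimum requirements.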
For a more automated setup, you can use the provided startup scripts. These scripts handle virtual environment creation, dependency installation, and application launch.
For Linux/macOS users:
- Make the script executable (if you haven't already done so, or if you downloaded it): `chmod +x start.sh`
- Run the script: `./start.sh`
For Windows users:
- Run the script by double-clicking `start.bat`, or by typing the following in your command prompt: `start.bat`
These scripts will:
- Check for Python 3.
- Create a Python virtual environment named `venv` if it doesn't already exist.
- Activate the virtual environment.
- Install (or update) all necessary dependencies from `requirements.txt`.
- Launch the `Docling-webui.py` application.
- Attempt to open your default web browser to http://localhost:7860.
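For readers who want to see what these steps amount to, here is a minimal Python sketch of the commands such a script might run. It is not the contents of `start.sh` or `start.bat`; `venv_python` and `planned_commands` are hypothetical helpers. Instead of activating the environment, the sketch calls the interpreter inside `venv` directly, which has the same effect for these commands.

```python
import os
import sys
from pathlib import Path


def venv_python(venv_dir: Path) -> Path:
    """Path of the Python interpreter inside a virtual environment."""
    if os.name == "nt":  # Windows layout
        return venv_dir / "Scripts" / "python.exe"
    return venv_dir / "bin" / "python"


def planned_commands(venv_dir: Path = Path("venv")) -> list[list[str]]:
    """List the commands to run, in order, without executing anything."""
    py = str(venv_python(venv_dir))
    commands = []
    if not venv_dir.exists():
        # Create the virtual environment only when it is missing.
        commands.append([sys.executable, "-m", "venv", str(venv_dir)])
    # Install (or update) dependencies, then launch the application.
    commands.append([py, "-m", "pip", "install", "-r", "requirements.txt"])
    commands.append([py, "Docling-webui.py"])
    return commands


for command in planned_commands():
    print(" ".join(command))
```

Each entry can be passed to `subprocess.run(command, check=True)` to execute it. Building the list first makes it easy to inspect what would happen before anything is installed.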
Note: The application might sometimes start on a different port if 7860 is occupied. Please check the console output from the script for the exact URL (e.g., http://localhost:XXXX) and open it manually if the browser doesn't point to the correct address.
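To check in advance which port is likely to be used, a small probe like the one below can help. It assumes the fallback tries the next higher ports in order, which this README does not state; `first_free_port` is a hypothetical helper, not part of the application.

```python
import socket


def first_free_port(start: int = 7860, attempts: int = 20,
                    host: str = "127.0.0.1") -> int:
    """Return the first port in [start, start + attempts) that can be bound."""
    for port in range(start, start + attempts):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            try:
                sock.bind((host, port))
            except OSError:
                continue  # port is occupied, try the next one
            return port
    raise RuntimeError(f"no free port found starting at {start}")


print(f"Try http://localhost:{first_free_port()}")
```

The console output of the startup script remains the authoritative source for the actual URL.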
If you prefer to set up and run the application manually:
- `pip install -r requirements.txt` (Ensure you are in a Python 3 environment or virtual environment before running this.)
- `python Docling-webui.py`

Supported input formats:
Documents: .pdf, .docx, .pptx, .xlsx
Text: .html, .md, .asciidoc
Images: .jpg, .png, .jpeg, .gif
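When preparing a batch, it can be useful to filter a folder down to the extensions listed above. The sketch below shows one way to do that; `SUPPORTED` and `classify` are hypothetical names, and the application's own filtering logic may differ.

```python
from pathlib import Path

# Extensions copied from the list above, grouped by category.
SUPPORTED = {
    "documents": {".pdf", ".docx", ".pptx", ".xlsx"},
    "text": {".html", ".md", ".asciidoc"},
    "images": {".jpg", ".png", ".jpeg", ".gif"},
}


def classify(path):
    """Return the category for a file name, or None if it is unsupported."""
    suffix = Path(path).suffix.lower()  # case-insensitive match
    for category, extensions in SUPPORTED.items():
        if suffix in extensions:
            return category
    return None


for name in ["Report.PDF", "notes.md", "scan.jpeg", "archive.zip"]:
    print(name, "->", classify(name))
```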
System requirements:
Minimum: 4 GB RAM + CPU
Recommended: NVIDIA GPU (CUDA 11+)
@techreport{Docling,
author = {Deep Search Team},
month = {8},
title = {Docling Technical Report},
url = {https://arxiv.org/abs/2408.09869},
eprint = {2408.09869},
doi = {10.48550/arXiv.2408.09869},
version = {1.0.0},
year = {2024}
}

MIT License

