- Open `index.html` in a browser or serve it locally.
- Use the terminal box commands: `ingest`, `diag`, `ctx`, and free-form questions.
- Commit/push the repo to GitHub and enable Pages for the repository.
- To reduce first-load latency, keep only the library at `libs/webllm/` and let the model load from the CDN (or connect a server, see below).
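Opening `index.html` directly via `file://` can break module loading and fetches in some browsers, so serving the repo locally is the safer option. A minimal sketch using Python's standard library (the port number is arbitrary; any static file server works):

```shell
# From the repository root: serve the site at http://localhost:8080
# (python3 and port 8080 are assumptions, not project requirements)
python3 -m http.server 8080
```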
- Edit `data/profile.md` with your real bio/skills/projects/contacts.
- In the terminal, run `ingest` to index the profile and persist it in the browser.
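A hedged sketch of what `data/profile.md` might look like; the section headings below are illustrative placeholders, not a schema the ingester is known to require:

```markdown
# Bio
Short paragraph about who you are and what you do.

# Skills
- Python, TypeScript
- LLM tooling, retrieval

# Projects
- **my-project** - one-line description and a link.

# Contacts
- Email: you@example.com
```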
Run a small API that keeps a GGUF model in memory for fast replies.
- Install dependencies: `pip install -r server/requirements.txt`
- Download a GGUF model (e.g., TinyLlama 1.1B Q4_K_M).
- Run: `MODEL_PATH=/path/to/model.gguf uvicorn server.app:app --host 0.0.0.0 --port 8000`
- Open the site with `?server=http://localhost:8000` appended to the URL.
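Once uvicorn is running, it can help to sanity-check the server before pointing the site at it. This sketch assumes the app is FastAPI-based (FastAPI apps expose `/openapi.json` by default); the base URL matches the run command above:

```python
# Sanity-check the local GGUF server before loading the site with ?server=...
# Assumption: the ASGI app is FastAPI-based, so /openapi.json exists by default.
import urllib.error
import urllib.request


def server_alive(base="http://localhost:8000", timeout=5):
    """Return True if the server answers on /openapi.json, else False."""
    try:
        with urllib.request.urlopen(base + "/openapi.json", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False


if __name__ == "__main__":
    print("server up:", server_alive())
```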
- If you placed model files under `models/...` using Git LFS, GitHub Pages serves the LFS pointer files instead of the actual binaries, and WebLLM fails to load them. Either remove LFS tracking for those files or let the loader use the CDN/remote model.
- If the browser shows "library missing", ensure `libs/webllm/index.js` exists in the repo, or open the site with `?online=1` after forcing a hard refresh.