
feat: LLM-powered manuals assistant#6

Open
derekparent wants to merge 3 commits into main from feature/llm-manuals-assistant

Conversation

@derekparent
Owner

Summary

  • Adds conversational troubleshooting assistant for CAT engine manuals using Claude Sonnet 4.5 + existing FTS5 RAG search
  • Streams responses via SSE with source citations to specific manual pages
  • Graceful degradation: falls back to FTS5 search results when API is unavailable
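The fallback behaviour in the last bullet can be sketched roughly as follows (`answer_query`, `fts5_search`, and `llm_available` are illustrative names, not the PR's actual API — the real logic lives in src/services/chat_service.py):

```python
# Sketch of the graceful-degradation path: plain search results when
# the LLM is unavailable, a streamed chat answer when it is.

def fts5_search(query):
    # Stand-in for the existing FTS5 manual search.
    return [{"page": 42, "snippet": f"...{query}..."}]

def llm_available(api_key):
    return bool(api_key)

def answer_query(query, api_key=None):
    if not llm_available(api_key):
        # Degrade gracefully: return raw search hits instead of failing.
        return {"mode": "search", "results": fts5_search(query)}
    return {"mode": "chat", "results": "streamed LLM answer with citations"}
```

The key design point is that the search path is always available, so a missing or revoked API key never takes the manuals feature down entirely.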

New Files

  • src/services/llm_service.py — Anthropic SDK wrapper (sync, retry, streaming, cost tracking)
  • src/services/chat_service.py — RAG pipeline orchestration, conversation history, token budget
  • src/prompts/manuals_assistant.py — System prompt, context formatting, message building
  • src/routes/chat.py — Chat endpoints with SSE streaming
  • templates/manuals/chat.html — Mobile-first chat UI (XSS-safe DOM rendering)
  • tests/test_chat.py — 21 unit + integration tests
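The "conversation history, token budget" responsibility listed for chat_service.py could work along these lines (a sketch; `trim_history` and the 4-characters-per-token estimate are assumptions, not the PR's code):

```python
def trim_history(messages, budget=4000, tokens=lambda m: len(m["content"]) // 4):
    # Walk the history newest-first, keeping messages until the rough
    # token estimate would exceed the budget, then restore order.
    kept, used = [], 0
    for msg in reversed(messages):
        cost = tokens(msg)
        if used + cost > budget:
            break
        kept.append(msg)
        used += cost
    return list(reversed(kept))
```

Trimming from the oldest end keeps the most recent turns, which is usually what a troubleshooting conversation needs.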

Modified Files

  • src/config.py — Anthropic API config vars
  • src/models.py — ChatSession model for conversation persistence
  • src/services/manuals_service.py — get_context_for_llm() for RAG context retrieval
  • src/app.py — Blueprint registration + LLM service init
  • requirements.txt — anthropic SDK

Test plan

  • 21 tests passing (prompt building, LLM service, RAG context, routes, model)
  • Set ANTHROPIC_API_KEY and run flask db upgrade
  • Visit /manuals/chat and test with real queries
  • Verify SSE streaming works on mobile
  • Verify graceful fallback when API key is missing

🤖 Generated with Claude Code

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

@chatgpt-codex-connector (bot) left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b313fd3253


Comment on lines +681 to +684
    if (c.indexOf('csrf_token=') === 0) return c.substring(11);
  }
  return '';
}


P1: Inject a real CSRF token for chat POST requests

This request path is CSRF-protected, but the client code can only read a token from a meta[name="csrf-token"] tag or a csrf_token cookie, and this template does not render either value. In non-testing configs (WTF_CSRF_ENABLED=True), getCSRFToken() falls through to '', so /manuals/chat/api/message will be rejected with 400 and the chat UI cannot send messages.


Comment on lines +163 to +165
if not api_key:
    logger.warning("ANTHROPIC_API_KEY not set — chat assistant disabled")
    return None


P2: Reset singleton when chat service is disabled

When ANTHROPIC_API_KEY is missing, this branch returns None but leaves the module-level _service untouched. If the process previously initialized a valid client (e.g., prior app factory call), get_llm_service() will keep returning the stale instance, so chat remains enabled with old credentials instead of being gracefully disabled.


Replace duplicate FTS5 query in get_context_for_llm() with thin wrapper
around search_manuals(). Both search UI and LLM now share one search
path with equipment filters, authority boost, and tag-aware ranking.

- Rewrite system prompt for collaborative guide role (triage, not oracle)
- Add format_search_results() and format_page_content() context formats
- Add detect_equipment() auto-detection from query text
- Add equipment filter dropdown to chat UI
- Add get_pages_content() for future deep-dive phase
- Expand tests from 21 to 40 covering all new functions

Co-authored-by: Cursor <cursoragent@cursor.com>
@derekparent
Owner Author

Search-Integrated Chat Redesign

Rewired the LLM chat to use the proven search_manuals() instead of its own duplicate FTS5 query. Both the search UI and LLM now share one search path.

Changes

  • get_context_for_llm() — Replaced 80-line duplicate FTS5 query with thin wrapper around search_manuals(). Gets equipment filters, authority boost, phrase boost, and tag-aware ranking for free.
  • System prompt — Rewritten for collaborative guide role: triage search results, suggest directions, let the engineer drive.
  • format_search_results() — Numbered list in <search_results> tags with snippet, authority, doc_type. LLM groups by topic naturally.
  • format_page_content() — Full page text in <page_content> tags for future deep-dive phase (built + tested, not yet wired).
  • detect_equipment() — Auto-detects 3516/C18/C32/C4.4 from query text. Dropdown wins if set, auto-detect is fallback.
  • Equipment dropdown — Added to chat UI, wired into fetch body.
  • Tests — 40 (up from 21): covers all new functions, verifies get_context_for_llm delegates to search_manuals().
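The precedence rule described for detect_equipment() might look roughly like this (the regex approach and the `KNOWN_MODELS` constant are assumptions; only the model names and the dropdown-wins rule come from the PR):

```python
import re

KNOWN_MODELS = ["3516", "C18", "C32", "C4.4"]

def detect_equipment(query, dropdown=None):
    # Explicit dropdown selection wins; auto-detection is the fallback.
    if dropdown:
        return dropdown
    for model in KNOWN_MODELS:
        # Word-boundary match so "C18" doesn't hit "C180"; re.escape
        # handles the literal dot in "C4.4".
        if re.search(rf"\b{re.escape(model)}\b", query, re.IGNORECASE):
            return model
    return None
```

Returning None when nothing matches lets the caller fall back to an unfiltered search rather than guessing an engine model.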

Test validation

Query "valve lash" + equipment "3516" should now return identical top results in both search UI and chat assistant.

Conversational chat input (e.g., "What is the valve lash procedure
for the 3516?") returned 0 results because FTS5's implicit AND required
ALL words, including stop words, to appear on one page. Added
_extract_search_query(), which strips stop words and joins >3 content
words with OR. BM25 ranking naturally scores multi-match pages higher.

Co-authored-by: Cursor <cursoragent@cursor.com>
