Out-of-cycle release today for Simple Chat
🐛 Fixes
- Scoping issue when selecting "All" in chat: search included personal and public documents but omitted group documents
- Video indexer logic improvements
  - #527
  - The API key only worked with the trial service, not the paid service
  - Removed API key authentication
  - Updated config guidance to show how to set up managed identity permissions from the App Service to the Video Indexer service
✨ Adds
New File Type Support
Added support for `.xml`, `.yaml`/`.yml`, `.doc`, `.docm`, and `.log` file types
Multi-Modal Vision Analysis for Images
Implemented comprehensive image upload support with AI-powered analysis:
Features:
- Base64 Conversion & Inline Display: Uploaded images are converted to base64 and displayed inline in the chat (like AI-generated images) instead of as file links
- Automatic Chunking: Large images (>1.5MB) are automatically split across multiple Cosmos DB documents to avoid the 2MB document limit, then seamlessly reassembled on retrieval
- Dual Text Extraction:
  - Document Intelligence OCR: Extracts all visible text from the image
  - GPT-4o Vision Analysis: Provides AI-generated description, object detection, contextual analysis, and text interpretation
- Info Button: User-uploaded images display an info button that reveals extracted text and vision analysis in a formatted, scrollable drawer
- Token-Efficient Chat History: Image context (OCR + vision analysis) is included in the chat history as system messages so the AI can answer questions about uploaded images, but base64 image data is explicitly excluded to prevent token waste
  - OCR + vision analysis: ~625 tokens per image ✅
  - Full base64 data: ~350K tokens per 1 MB image ❌ (prevented)
- Settings Control: Multi-modal vision can be enabled/disabled in admin settings with model selection (GPT-4o, GPT-4o-mini, o-series, GPT-5, etc.)
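The automatic chunking feature above can be sketched in Python. The document shape, field names, and 1.5 MB threshold handling below are illustrative assumptions, not Simple Chat's actual schema:

```python
# Hypothetical sketch: split oversized base64 image data into ordered
# sub-documents that each stay under Cosmos DB's 2 MB item limit, then
# reassemble them on retrieval. Names are illustrative only.

CHUNK_SIZE = 1_500_000  # characters of base64 text per Cosmos document

def split_image_for_cosmos(message_id: str, base64_data: str) -> list[dict]:
    """Split oversized base64 image data into ordered chunk documents."""
    chunks = [base64_data[i:i + CHUNK_SIZE]
              for i in range(0, len(base64_data), CHUNK_SIZE)]
    return [
        {
            "id": f"{message_id}_chunk_{n}",
            "parent_message_id": message_id,
            "chunk_index": n,
            "total_chunks": len(chunks),
            "data": chunk,
        }
        for n, chunk in enumerate(chunks)
    ]

def reassemble_image(docs: list[dict]) -> str:
    """Rebuild the original base64 string from its chunk documents."""
    ordered = sorted(docs, key=lambda d: d["chunk_index"])
    return "".join(d["data"] for d in ordered)
```

Sorting by `chunk_index` on retrieval makes reassembly order-independent, so the chunk documents can be fetched in any order.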
Technical Implementation:
- Images stored with `role: 'image'` and `metadata.is_user_upload: true` flag
- Stores `extracted_text` (OCR), `vision_analysis` (AI insights), and `filename` metadata
- Backend automatically includes image context in conversation history for AI reasoning
- Runtime safety checks prevent base64 data leakage into chat history
- Debug logging tracks image context addition and character counts
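The token-efficient history and runtime safety check might look like this minimal sketch; the field names and message format are assumptions, not the actual Simple Chat implementation:

```python
# Illustrative sketch: image context (OCR + vision analysis) becomes a
# system message, while raw base64 payloads are explicitly kept out of
# the chat history. Field names are assumed, not the real schema.

def build_image_context_message(image_doc: dict) -> dict:
    """Summarize an uploaded image for the chat history without its pixels."""
    text = image_doc.get("extracted_text", "")
    vision = image_doc.get("vision_analysis", "")
    content = (f"[Uploaded image: {image_doc.get('filename', 'unknown')}]\n"
               f"OCR text: {text}\n"
               f"Vision analysis: {vision}")
    # Runtime safety check: never let base64 data leak into chat history.
    if "base64," in content:
        raise ValueError("base64 image data must not enter chat history")
    return {"role": "system", "content": content}
```

Keeping only the ~625-token summary rather than the ~350K-token base64 payload is what makes follow-up questions about the image affordable.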
User Experience:
- Upload an image → Displays as thumbnail in chat
- Click info button → View formatted OCR text and AI vision analysis
- Ask questions about the image → AI uses extracted context to respond accurately
📦 Chunking Strategy for New File Types
See microsoft/simplechat#98: Update README with chunking strategy
DOC / DOCM
- Processed with the Python package `docx2txt`
- Chunked by ~400 words, approximating an A4 page
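The ~400-word chunking can be sketched as follows; text extraction via `docx2txt` is stubbed out here so the chunking logic stands alone, and this is not the actual processing function:

```python
# Minimal sketch of ~400-word chunking (roughly one A4 page of text).
# In the app the input would come from docx2txt.process(path); here we
# chunk a plain string so the logic is self-contained.

def chunk_by_words(text: str, words_per_chunk: int = 400) -> list[str]:
    """Split text into chunks of at most `words_per_chunk` words."""
    words = text.split()
    return [" ".join(words[i:i + words_per_chunk])
            for i in range(0, len(words), words_per_chunk)]
```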
XML
- Uses `RecursiveCharacterTextSplitter` with XML-aware separators
- Structure-preserving chunking:
  - Separators prioritized: `\n\n` → `\n` → `>` (end of XML tags) → space → character
  - Splits at logical boundaries to maintain tag integrity
- Chunked by 4000 characters
- Goal: Preserve XML structure by splitting at tag boundaries rather than mid-element, ensuring chunks are more semantically meaningful for LLM processing
- See `process_xml`
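A simplified pure-Python sketch of the prioritized-separator idea (the actual implementation uses LangChain's `RecursiveCharacterTextSplitter`, which additionally merges small pieces and supports overlap). The same approach applies to YAML with `-` in place of `>`:

```python
# Try the highest-priority separator first; only fall back to finer
# separators when a piece is still larger than the chunk size.
# Simplified sketch, not the real RecursiveCharacterTextSplitter.

XML_SEPARATORS = ["\n\n", "\n", ">", " ", ""]  # ">" keeps tag boundaries intact

def recursive_split(text: str, separators: list[str],
                    chunk_size: int = 4000) -> list[str]:
    if len(text) <= chunk_size:
        return [text] if text else []
    sep, *rest = separators
    if sep == "":
        # Last resort: hard character split.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    pieces = [p + sep for p in text.split(sep)]
    pieces[-1] = pieces[-1][:-len(sep)]  # last piece has no trailing separator
    chunks, current = [], ""
    for piece in pieces:
        if len(piece) > chunk_size:
            # Piece is still too big: flush and recurse with finer separators.
            if current:
                chunks.append(current)
                current = ""
            chunks.extend(recursive_split(piece, rest, chunk_size))
        elif len(current) + len(piece) > chunk_size:
            chunks.append(current)
            current = piece
        else:
            current += piece
    if current:
        chunks.append(current)
    return chunks
```

Because splits happen at the highest-priority separator that fits, chunks tend to end at blank lines or tag closings rather than mid-element.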
YAML / YML
- Uses `RecursiveCharacterTextSplitter` with YAML-aware separators
- Structure-preserving chunking:
  - Separators prioritized: `\n\n` → `\n` → `-` (YAML list items) → space → character
  - Splits at logical boundaries to maintain YAML structure
- Chunked by 4000 characters
- Goal: Preserve YAML hierarchy and list structures by splitting at section boundaries and list items rather than mid-key or mid-value
- See `process_yaml`
LOG
- Processed using line-based chunking to maintain log record integrity
- Never splits mid-line to preserve complete log entries
- Line-Level Chunking:
  - Split file by lines using `splitlines(keepends=True)` to preserve line endings
  - Accumulate complete lines until reaching a target word count of ≈1000 words
  - When adding the next line would exceed the target AND the chunk already has content:
    - Finalize the current chunk
    - Start a new chunk with the current line
  - If a single line exceeds the target, it gets its own chunk to prevent infinite loops
  - Emit chunks with complete log records
- Goal: Provide substantial log context (≈1000 words) while ensuring no log entry is split across chunks
- See `process_log`
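The line-level algorithm above can be sketched as follows (illustrative, not the actual `process_log`):

```python
# Line-based log chunking: accumulate whole lines until ~1000 words,
# never splitting mid-line. A single oversized line simply becomes its
# own chunk, which also prevents infinite loops.

def chunk_log(text: str, target_words: int = 1000) -> list[str]:
    chunks, current, current_words = [], [], 0
    for line in text.splitlines(keepends=True):
        words = len(line.split())
        if current and current_words + words > target_words:
            chunks.append("".join(current))  # finalize the current chunk
            current, current_words = [], 0
        current.append(line)  # start (or continue) a chunk with this line
        current_words += words
    if current:
        chunks.append("".join(current))
    return chunks
```

Since lines are only ever appended whole, every log record survives intact in exactly one chunk.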
📝 Version Updates
- Incremented version to `0.229.098`