End-to-end LLMOps workshop using Azure AI Foundry to build a RAG-enabled chatbot with vector search and RBAC authentication.
flowchart TB
subgraph Client["π₯οΈ Client"]
UI["Web Chat UI<br/>index.html"]
end
subgraph Backend["βοΈ Flask Backend"]
APP["app.py<br/>RBAC Auth"]
end
subgraph Azure["βοΈ Azure Cloud"]
subgraph AIFoundry["Azure AI Foundry"]
FOUNDRY["foundry-llmops-demo<br/>(AIServices)"]
PROJECT["proj-llmops-demo"]
end
subgraph OpenAI["Azure OpenAI"]
GPT["gpt-4o<br/>Chat Completion"]
EMB["text-embedding-3-large<br/>Embeddings"]
FILTER["Content Filters<br/>Hate/Sexual/Violence"]
end
subgraph Search["Azure AI Search"]
INDEX["walle-products<br/>Vector Index"]
DOCS["9 Documents<br/>txt, md, pdf"]
end
end
subgraph LLMOps["π LLMOps Modules"]
EVAL["02-evaluation/<br/>Groundedness, Fluency"]
SAFETY["03-content-safety/<br/>Jailbreak Testing"]
end
subgraph DataFolder["π data/"]
TXT["*.txt<br/>Product Specs"]
MD["*.md<br/>Policies"]
PDF["*.pdf<br/>FAQs"]
end
UI -->|"1. User Question"| APP
APP -->|"2. Embed Query"| EMB
EMB -->|"3. Vector"| INDEX
INDEX -->|"4. Top 3 Docs"| APP
APP -->|"5. Context + Question"| GPT
GPT -->|"6. Answer"| APP
APP -->|"7. Response"| UI
TXT --> INDEX
MD --> INDEX
PDF --> INDEX
FOUNDRY -.->|"Connection"| OpenAI
FOUNDRY -.->|"Connection"| Search
EVAL -.->|"Quality Metrics"| GPT
SAFETY -.->|"Test Filters"| FILTER
sequenceDiagram
participant U as π€ User
participant F as π Flask App
participant E as π§ Embeddings
participant S as π AI Search
participant G as π¬ GPT-4o
U->>F: "What's the return policy?"
F->>E: Generate embedding
E-->>F: [0.123, -0.456, ...]
F->>S: Vector search (top 3)
S-->>F: Return Policy, Warranty, FAQ
F->>G: System + Context + Question
G-->>F: "Wall-E offers 30-day returns..."
F-->>U: Formatted response
Build a complete RAG (Retrieval-Augmented Generation) chatbot for "Wall-E Electronics":
| Module | Topic | Key Concepts | Azure Services | Difficulty |
|---|---|---|---|---|
| 1 | Environment Setup | SDK auth, RBAC, workspace config | Azure CLI, DefaultAzureCredential | Beginner |
| 2 | Deploy Azure Infrastructure | IaC, resource provisioning | AI Foundry, OpenAI, AI Search | Beginner |
| 3 | Create Vector Index | Embeddings, vector search, chunking | Azure OpenAI, AI Search | Intermediate |
| 4 | Run RAG Chatbot | Retrieval, prompt engineering, context | Flask, GPT-4o, Vector Store | Intermediate |
| 5 | Test & Explore | Query testing, response quality | Web UI, Azure Portal | Beginner |
| 6 | Run Evaluation | Groundedness, fluency metrics | Azure AI Evaluation SDK | Intermediate |
| 7 | Content Safety | Jailbreak testing, content filters | Azure OpenAI Content Filters | Intermediate |
Total Duration: ~120 minutes
- β Deploy Azure AI resources with RBAC (no API keys)
- β Create vector embeddings from documents (txt, md, pdf)
- β Build a semantic search index with Azure AI Search
- β Implement RAG pattern with GPT-4o
- β Build a production-ready chat interface
- β Evaluate RAG quality with groundedness & fluency metrics
- β Test content safety and prompt injection protection
This workshop uses RBAC (Role-Based Access Control) β no API keys required.
Your Azure CLI credentials are used automatically via DefaultAzureCredential:
Cognitive Services OpenAI Userβ Call Azure OpenAI APIsSearch Index Data Contributorβ Read/write search indicesSearch Service Contributorβ Manage search service
- Azure subscription with Contributor access
- Azure CLI v2.50+
- Python 3.10+
- VS Code with Python extension
# Clone the repository
git clone https://github.com/ritwickmicrosoft/llmops-workshop-demo.git
cd llmops-workshop-demo
# Create Python virtual environment
python -m venv .venv
.\.venv\Scripts\Activate.ps1
pip install -r requirements.txt
# Login to Azure
az login# Set variables
$env:AZURE_RESOURCE_GROUP = "rg-llmops-demo"
$env:AZURE_LOCATION = "eastus"
# Create resource group
az group create --name $env:AZURE_RESOURCE_GROUP --location $env:AZURE_LOCATION
# Create Azure OpenAI
az cognitiveservices account create `
--name "aoai-llmops-demo" `
--resource-group $env:AZURE_RESOURCE_GROUP `
--location $env:AZURE_LOCATION `
--kind OpenAI `
--sku S0 `
--custom-domain "aoai-llmops-demo"
# Deploy models
az cognitiveservices account deployment create `
--name "aoai-llmops-demo" `
--resource-group $env:AZURE_RESOURCE_GROUP `
--deployment-name "gpt-4o" `
--model-name "gpt-4o" `
--model-version "2024-11-20" `
--model-format OpenAI `
--sku-capacity 10 `
--sku-name Standard
az cognitiveservices account deployment create `
--name "aoai-llmops-demo" `
--resource-group $env:AZURE_RESOURCE_GROUP `
--deployment-name "text-embedding-3-large" `
--model-name "text-embedding-3-large" `
--model-version "1" `
--model-format OpenAI `
--sku-capacity 10 `
--sku-name Standard
# Create Azure AI Search
az search service create `
--name "search-llmops-demo" `
--resource-group $env:AZURE_RESOURCE_GROUP `
--location $env:AZURE_LOCATION `
--sku Basic
# Assign RBAC roles (wait 2-3 min after for propagation)
$myId = (az ad signed-in-user show --query id -o tsv)
az role assignment create --assignee $myId --role "Cognitive Services OpenAI User" `
--scope $(az cognitiveservices account show --name aoai-llmops-demo --resource-group $env:AZURE_RESOURCE_GROUP --query id -o tsv)
az role assignment create --assignee $myId --role "Search Index Data Contributor" `
--scope $(az search service show --name search-llmops-demo --resource-group $env:AZURE_RESOURCE_GROUP --query id -o tsv)# Update .env with your resource endpoints
Copy-Item .env.example .env
# Edit .env with your endpoints
# Create search index with sample documents
cd 01-rag-chatbot
python create_search_index.pycd ../04-frontend
python app.py
# Open http://localhost:5000Open LLMOps_Workshop_Playbook.html in your browser for detailed step-by-step instructions.
llmops-workshop/
βββ data/ # Sample documents (txt, md, pdf)
β βββ laptop-pro-15.txt # Product specs
β βββ smartwatch-x200.txt # Product specs
β βββ nc500-headphones.txt # Product specs
β βββ tablet-s10.txt # Product specs
β βββ return-policy.md # Policy document
β βββ warranty-policy.md # Policy document
β βββ shipping-policy.md # Policy document
β βββ troubleshooting-guide.md # Support document
β βββ faq.pdf # PDF document
βββ 01-rag-chatbot/ # RAG Chatbot Module
β βββ create_search_index.py # Reads data/ folder, vectorizes, indexes
βββ 02-evaluation/ # Evaluation Module
β βββ eval_dataset.jsonl # Test dataset (Q&A pairs)
β βββ run_evaluation.py # Run quality evaluation
β βββ eval_results/ # Generated reports (HTML + JSON)
βββ 03-content-safety/ # Content Safety Module
β βββ content_filter_config.json # Filter configuration
β βββ test_content_safety.py # Test content filters
β βββ test_results/ # Generated reports (HTML + JSON)
βββ 04-frontend/ # Web Chat Interface
β βββ app.py # Flask backend (RBAC)
β βββ index.html # Dark-themed chat UI
β βββ requirements.txt # Frontend dependencies
βββ infra/ # Infrastructure as Code
β βββ main.bicep # Main Bicep template
β βββ modules/core.bicep # Core resources
βββ .env.example # Environment template
βββ requirements.txt # Python dependencies
βββ LLMOps_Workshop_Playbook.html # Interactive step-by-step guide
βββ README.md # This file
graph LR
subgraph RG["Resource Group: rg-llmops-demo"]
A["π€ Azure AI Foundry<br/>foundry-llmops-demo"]
B["π§ Azure OpenAI<br/>aoai-llmops-eastus"]
C["π Azure AI Search<br/>search-llmops-dev-*"]
end
A -->|"RBAC"| B
A -->|"RBAC"| C
| Resource | Name | Purpose |
|---|---|---|
| Azure AI Foundry | foundry-llmops-demo |
Unified AI platform (AIServices) |
| Azure OpenAI | aoai-llmops-eastus |
LLM (gpt-4o) + embeddings |
| Azure AI Search | search-llmops-dev-* |
Vector store for RAG |
The data/ folder contains 9 Wall-E Electronics documents in multiple formats:
| Format | Files | Description |
|---|---|---|
.txt |
4 files | Product specifications (Laptop, Watch, Headphones, Tablet) |
.md |
4 files | Policies & support (Returns, Warranty, Shipping, Troubleshooting) |
.pdf |
1 file | FAQ document |
The create_search_index.py script automatically:
- Reads all files from
data/folder - Extracts text from .txt, .md, and .pdf files
- Generates vector embeddings using Azure OpenAI
- Uploads to Azure AI Search with semantic and vector search
The evaluation script (02-evaluation/run_evaluation.py) tests RAG quality using Azure AI Evaluation SDK:
| Metric | Description | Target |
|---|---|---|
| Groundedness | Is the response supported by the retrieved context? | β₯4.0 |
| Fluency | Is the response grammatically correct and natural? | β₯4.0 |
| Score | Rating | Action |
|---|---|---|
| 4.0-5.0 | β Excellent | Production-ready |
| 3.0-4.0 | ~ Good | Minor improvements needed |
| 2.0-3.0 | β Needs Work | Improve prompts or retrieval |
| 1.0-2.0 | β Poor | Major rework required |
============================================================
Evaluation Results
============================================================
Aggregate Metrics:
----------------------------------------
β Groundedness 2.60/5.0
~ Fluency 3.00/5.0
π Recommendations:
- Consider improving groundedness: current score 2.60
- Consider improving fluency: current score 3.00
Note: Low groundedness scores in demo are expected because the
contextfield ineval_dataset.jsonlonly contains document titles, not full text. In production with actual RAG retrieval, scores improve significantly.
$env:AZURE_OPENAI_ENDPOINT = "https://aoai-llmops-eastus.openai.azure.com/"
python 02-evaluation/run_evaluation.pyThe content safety script (03-content-safety/test_content_safety.py) tests protection against harmful content and prompt injection.
| Category | Default Severity | Description |
|---|---|---|
| Hate Speech | Medium | Blocked automatically |
| Sexual Content | Medium | Blocked automatically |
| Violence | Medium | Blocked automatically |
| Self-Harm | Medium | Blocked automatically |
| Jailbreak/Prompt Injection | Not enabled | Requires custom filter config |
| Category | Tests | Description |
|---|---|---|
baseline |
2 | Normal product queries |
prompt_injection |
3 | Jailbreak attempts (DAN, role-play) |
boundary |
3 | Off-topic, competitor, PII requests |
============================================================
Content Safety Testing Complete!
============================================================
Total Tests: 8
β Passed: 8
β Failed: 0
Pass Rate: 100.0%
Filter Blocked: 0
Model Refused: 8 (handled via system prompt)
Note: Jailbreak attempts are handled by the system prompt, not default content filters. The model correctly refuses malicious requests. For production, consider enabling Prompt Shields for additional protection.
python 03-content-safety/test_content_safety.pyGenerates HTML report in 03-content-safety/test_results/.
Delete all resources when done:
az group delete --name rg-llmops-demo --yes --no-waitMIT License
LLMOps Workshop β February 2026