Skip to content

LLMOps Workshop - RAG Chatbot with Azure AI Foundry - Step-by-Step Hands-On Lab

Notifications You must be signed in to change notification settings

ritwickmicrosoft/llmops-workshop-demo

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

19 Commits
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

LLMOps Workshop - Azure AI Foundry

Azure AI Foundry License

End-to-end LLMOps workshop using Azure AI Foundry to build a RAG-enabled chatbot with vector search and RBAC authentication.

�️ Architecture

flowchart TB
    subgraph Client["πŸ–₯️ Client"]
        UI["Web Chat UI<br/>index.html"]
    end

    subgraph Backend["βš™οΈ Flask Backend"]
        APP["app.py<br/>RBAC Auth"]
    end

    subgraph Azure["☁️ Azure Cloud"]
        subgraph AIFoundry["Azure AI Foundry"]
            FOUNDRY["foundry-llmops-demo<br/>(AIServices)"]
            PROJECT["proj-llmops-demo"]
        end
        
        subgraph OpenAI["Azure OpenAI"]
            GPT["gpt-4o<br/>Chat Completion"]
            EMB["text-embedding-3-large<br/>Embeddings"]
            FILTER["Content Filters<br/>Hate/Sexual/Violence"]
        end
        
        subgraph Search["Azure AI Search"]
            INDEX["walle-products<br/>Vector Index"]
            DOCS["9 Documents<br/>txt, md, pdf"]
        end
    end

    subgraph LLMOps["πŸ“Š LLMOps Modules"]
        EVAL["02-evaluation/<br/>Groundedness, Fluency"]
        SAFETY["03-content-safety/<br/>Jailbreak Testing"]
    end

    subgraph DataFolder["πŸ“ data/"]
        TXT["*.txt<br/>Product Specs"]
        MD["*.md<br/>Policies"]
        PDF["*.pdf<br/>FAQs"]
    end

    UI -->|"1. User Question"| APP
    APP -->|"2. Embed Query"| EMB
    EMB -->|"3. Vector"| INDEX
    INDEX -->|"4. Top 3 Docs"| APP
    APP -->|"5. Context + Question"| GPT
    GPT -->|"6. Answer"| APP
    APP -->|"7. Response"| UI
    
    TXT --> INDEX
    MD --> INDEX
    PDF --> INDEX
    
    FOUNDRY -.->|"Connection"| OpenAI
    FOUNDRY -.->|"Connection"| Search
    
    EVAL -.->|"Quality Metrics"| GPT
    SAFETY -.->|"Test Filters"| FILTER
Loading

πŸ”„ RAG Flow Diagram

sequenceDiagram
    participant U as πŸ‘€ User
    participant F as 🌐 Flask App
    participant E as 🧠 Embeddings
    participant S as πŸ” AI Search
    participant G as πŸ’¬ GPT-4o

    U->>F: "What's the return policy?"
    F->>E: Generate embedding
    E-->>F: [0.123, -0.456, ...]
    F->>S: Vector search (top 3)
    S-->>F: Return Policy, Warranty, FAQ
    F->>G: System + Context + Question
    G-->>F: "Wall-E offers 30-day returns..."
    F-->>U: Formatted response
Loading

🎯 What You'll Learn

Build a complete RAG (Retrieval-Augmented Generation) chatbot for "Wall-E Electronics":

Module Topic Key Concepts Azure Services Difficulty
1 Environment Setup SDK auth, RBAC, workspace config Azure CLI, DefaultAzureCredential Beginner
2 Deploy Azure Infrastructure IaC, resource provisioning AI Foundry, OpenAI, AI Search Beginner
3 Create Vector Index Embeddings, vector search, chunking Azure OpenAI, AI Search Intermediate
4 Run RAG Chatbot Retrieval, prompt engineering, context Flask, GPT-4o, Vector Store Intermediate
5 Test & Explore Query testing, response quality Web UI, Azure Portal Beginner
6 Run Evaluation Groundedness, fluency metrics Azure AI Evaluation SDK Intermediate
7 Content Safety Jailbreak testing, content filters Azure OpenAI Content Filters Intermediate

Total Duration: ~120 minutes

🧠 Skills You'll Gain

  • βœ… Deploy Azure AI resources with RBAC (no API keys)
  • βœ… Create vector embeddings from documents (txt, md, pdf)
  • βœ… Build a semantic search index with Azure AI Search
  • βœ… Implement RAG pattern with GPT-4o
  • βœ… Build a production-ready chat interface
  • βœ… Evaluate RAG quality with groundedness & fluency metrics
  • βœ… Test content safety and prompt injection protection

πŸ” Authentication

This workshop uses RBAC (Role-Based Access Control) β€” no API keys required.

Your Azure CLI credentials are used automatically via DefaultAzureCredential:

  • Cognitive Services OpenAI User β€” Call Azure OpenAI APIs
  • Search Index Data Contributor β€” Read/write search indices
  • Search Service Contributor β€” Manage search service

πŸ“‹ Prerequisites

πŸš€ Quick Start

1. Clone and Setup

# Clone the repository
git clone https://github.com/ritwickmicrosoft/llmops-workshop-demo.git
cd llmops-workshop-demo

# Create Python virtual environment
python -m venv .venv
.\.venv\Scripts\Activate.ps1
pip install -r requirements.txt

# Login to Azure
az login

2. Deploy Azure Resources

# Set variables
$env:AZURE_RESOURCE_GROUP = "rg-llmops-demo"
$env:AZURE_LOCATION = "eastus"

# Create resource group
az group create --name $env:AZURE_RESOURCE_GROUP --location $env:AZURE_LOCATION

# Create Azure OpenAI
az cognitiveservices account create `
  --name "aoai-llmops-demo" `
  --resource-group $env:AZURE_RESOURCE_GROUP `
  --location $env:AZURE_LOCATION `
  --kind OpenAI `
  --sku S0 `
  --custom-domain "aoai-llmops-demo"

# Deploy models
az cognitiveservices account deployment create `
  --name "aoai-llmops-demo" `
  --resource-group $env:AZURE_RESOURCE_GROUP `
  --deployment-name "gpt-4o" `
  --model-name "gpt-4o" `
  --model-version "2024-11-20" `
  --model-format OpenAI `
  --sku-capacity 10 `
  --sku-name Standard

az cognitiveservices account deployment create `
  --name "aoai-llmops-demo" `
  --resource-group $env:AZURE_RESOURCE_GROUP `
  --deployment-name "text-embedding-3-large" `
  --model-name "text-embedding-3-large" `
  --model-version "1" `
  --model-format OpenAI `
  --sku-capacity 10 `
  --sku-name Standard

# Create Azure AI Search
az search service create `
  --name "search-llmops-demo" `
  --resource-group $env:AZURE_RESOURCE_GROUP `
  --location $env:AZURE_LOCATION `
  --sku Basic

# Assign RBAC roles (wait 2-3 min after for propagation)
$myId = (az ad signed-in-user show --query id -o tsv)
az role assignment create --assignee $myId --role "Cognitive Services OpenAI User" `
  --scope $(az cognitiveservices account show --name aoai-llmops-demo --resource-group $env:AZURE_RESOURCE_GROUP --query id -o tsv)
az role assignment create --assignee $myId --role "Search Index Data Contributor" `
  --scope $(az search service show --name search-llmops-demo --resource-group $env:AZURE_RESOURCE_GROUP --query id -o tsv)

3. Create Vector Index

# Update .env with your resource endpoints
Copy-Item .env.example .env
# Edit .env with your endpoints

# Create search index with sample documents
cd 01-rag-chatbot
python create_search_index.py

4. Run Chatbot

cd ../04-frontend
python app.py
# Open http://localhost:5000

5. Open Workshop Playbook

Open LLMOps_Workshop_Playbook.html in your browser for detailed step-by-step instructions.

πŸ“ Project Structure

llmops-workshop/
β”œβ”€β”€ data/                           # Sample documents (txt, md, pdf)
β”‚   β”œβ”€β”€ laptop-pro-15.txt           # Product specs
β”‚   β”œβ”€β”€ smartwatch-x200.txt         # Product specs
β”‚   β”œβ”€β”€ nc500-headphones.txt        # Product specs
β”‚   β”œβ”€β”€ tablet-s10.txt              # Product specs
β”‚   β”œβ”€β”€ return-policy.md            # Policy document
β”‚   β”œβ”€β”€ warranty-policy.md          # Policy document
β”‚   β”œβ”€β”€ shipping-policy.md          # Policy document
β”‚   β”œβ”€β”€ troubleshooting-guide.md    # Support document
β”‚   └── faq.pdf                     # PDF document
β”œβ”€β”€ 01-rag-chatbot/                 # RAG Chatbot Module
β”‚   └── create_search_index.py      # Reads data/ folder, vectorizes, indexes
β”œβ”€β”€ 02-evaluation/                  # Evaluation Module
β”‚   β”œβ”€β”€ eval_dataset.jsonl          # Test dataset (Q&A pairs)
β”‚   β”œβ”€β”€ run_evaluation.py           # Run quality evaluation
β”‚   └── eval_results/               # Generated reports (HTML + JSON)
β”œβ”€β”€ 03-content-safety/              # Content Safety Module
β”‚   β”œβ”€β”€ content_filter_config.json  # Filter configuration
β”‚   β”œβ”€β”€ test_content_safety.py      # Test content filters
β”‚   └── test_results/               # Generated reports (HTML + JSON)
β”œβ”€β”€ 04-frontend/                    # Web Chat Interface
β”‚   β”œβ”€β”€ app.py                      # Flask backend (RBAC)
β”‚   β”œβ”€β”€ index.html                  # Dark-themed chat UI
β”‚   └── requirements.txt            # Frontend dependencies
β”œβ”€β”€ infra/                          # Infrastructure as Code
β”‚   β”œβ”€β”€ main.bicep                  # Main Bicep template
β”‚   └── modules/core.bicep          # Core resources
β”œβ”€β”€ .env.example                    # Environment template
β”œβ”€β”€ requirements.txt                # Python dependencies
β”œβ”€β”€ LLMOps_Workshop_Playbook.html   # Interactive step-by-step guide
└── README.md                       # This file

πŸ› οΈ Azure Resources

graph LR
    subgraph RG["Resource Group: rg-llmops-demo"]
        A["πŸ€– Azure AI Foundry<br/>foundry-llmops-demo"]
        B["🧠 Azure OpenAI<br/>aoai-llmops-eastus"]
        C["πŸ” Azure AI Search<br/>search-llmops-dev-*"]
    end
    
    A -->|"RBAC"| B
    A -->|"RBAC"| C
Loading
Resource Name Purpose
Azure AI Foundry foundry-llmops-demo Unified AI platform (AIServices)
Azure OpenAI aoai-llmops-eastus LLM (gpt-4o) + embeddings
Azure AI Search search-llmops-dev-* Vector store for RAG

πŸ“„ Sample Documents

The data/ folder contains 9 Wall-E Electronics documents in multiple formats:

Format Files Description
.txt 4 files Product specifications (Laptop, Watch, Headphones, Tablet)
.md 4 files Policies & support (Returns, Warranty, Shipping, Troubleshooting)
.pdf 1 file FAQ document

The create_search_index.py script automatically:

  1. Reads all files from data/ folder
  2. Extracts text from .txt, .md, and .pdf files
  3. Generates vector embeddings using Azure OpenAI
  4. Uploads to Azure AI Search with semantic and vector search

πŸ“Š Evaluation Metrics

The evaluation script (02-evaluation/run_evaluation.py) tests RAG quality using Azure AI Evaluation SDK:

Metrics (1-5 Scale)

Metric Description Target
Groundedness Is the response supported by the retrieved context? β‰₯4.0
Fluency Is the response grammatically correct and natural? β‰₯4.0

Scoring Standards

Score Rating Action
4.0-5.0 βœ“ Excellent Production-ready
3.0-4.0 ~ Good Minor improvements needed
2.0-3.0 ⚠ Needs Work Improve prompts or retrieval
1.0-2.0 βœ— Poor Major rework required

Sample Results

============================================================
  Evaluation Results
============================================================
  Aggregate Metrics:
  ----------------------------------------
  βœ— Groundedness    2.60/5.0
  ~ Fluency         3.00/5.0

  πŸ“Š Recommendations:
  - Consider improving groundedness: current score 2.60
  - Consider improving fluency: current score 3.00

Note: Low groundedness scores in demo are expected because the context field in eval_dataset.jsonl only contains document titles, not full text. In production with actual RAG retrieval, scores improve significantly.

Run Evaluation

$env:AZURE_OPENAI_ENDPOINT = "https://aoai-llmops-eastus.openai.azure.com/"
python 02-evaluation/run_evaluation.py

πŸ›‘οΈ Content Safety Testing

The content safety script (03-content-safety/test_content_safety.py) tests protection against harmful content and prompt injection.

Azure OpenAI Default Filters

Category Default Severity Description
Hate Speech Medium Blocked automatically
Sexual Content Medium Blocked automatically
Violence Medium Blocked automatically
Self-Harm Medium Blocked automatically
Jailbreak/Prompt Injection Not enabled Requires custom filter config

Test Categories

Category Tests Description
baseline 2 Normal product queries
prompt_injection 3 Jailbreak attempts (DAN, role-play)
boundary 3 Off-topic, competitor, PII requests

Sample Results

============================================================
  Content Safety Testing Complete!
============================================================
  Total Tests: 8
  βœ“ Passed: 8
  βœ— Failed: 0
  Pass Rate: 100.0%

  Filter Blocked: 0
  Model Refused: 8 (handled via system prompt)

Note: Jailbreak attempts are handled by the system prompt, not default content filters. The model correctly refuses malicious requests. For production, consider enabling Prompt Shields for additional protection.

Run Content Safety Tests

python 03-content-safety/test_content_safety.py

Generates HTML report in 03-content-safety/test_results/.

🧹 Cleanup

Delete all resources when done:

az group delete --name rg-llmops-demo --yes --no-wait

πŸ“š Resources

πŸ“„ License

MIT License


LLMOps Workshop β€” February 2026

About

LLMOps Workshop - RAG Chatbot with Azure AI Foundry - Step-by-Step Hands-On Lab

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published