RAGlaw is a Retrieval-Augmented Generation (RAG) system designed to interpret and respond to questions based on the Polish Penal Code. It uses the GPT-4o Mini language model to generate contextually relevant answers, with information retrieved from a vector database. The solution is fully containerized and deployed on Azure Kubernetes Service (AKS) for scalability and ease of maintenance.
The 'rag-api' is an API written in FastAPI that serves as the backend service for RAG communication. It exposes a single POST endpoint that retrieves the relevant context from the Milvus database, feeds it into the LLM, and returns the LLM's answer based on that context.
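The retrieve-then-generate flow described above can be sketched in plain Python. This is a minimal illustration, not the project's actual code: `answer_question`, `fake_retrieve`, `fake_generate`, the prompt wording, and the `top_k` parameter are all hypothetical, and the real service would call Milvus and the LLM where the stand-in functions are used.

```python
from typing import Callable, List


def answer_question(
    question: str,
    retrieve: Callable[[str, int], List[str]],
    generate: Callable[[str], str],
    top_k: int = 3,
) -> str:
    """Retrieve context for the question, build a prompt, and return the LLM answer."""
    # Step 1: fetch the most relevant chunks (in RAGlaw this would query Milvus).
    chunks = retrieve(question, top_k)
    # Step 2: place the retrieved chunks into the prompt as context.
    context = "\n\n".join(chunks)
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )
    # Step 3: let the language model answer based on that context.
    return generate(prompt)


# Stand-ins for the vector search and the LLM call, for illustration only.
def fake_retrieve(question: str, top_k: int) -> List[str]:
    return ["(placeholder chunk text)"][:top_k]


def fake_generate(prompt: str) -> str:
    return f"(model answer based on {len(prompt)} prompt characters)"


print(answer_question("example question", fake_retrieve, fake_generate))
```

Passing the retriever and generator in as callables keeps the flow testable without a running Milvus instance or an LLM API key.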
The 'frontend' is a Streamlit application that communicates with the rag-api and provides a user-friendly interface for chatting with the RAG model.
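A client such as the frontend might talk to the rag-api roughly as follows. This is a sketch under assumptions: the URL `http://rag-api:8000/ask`, the `question` field name, and the JSON response shape are hypothetical and should be replaced with the values the rag-api actually uses.

```python
import json
import urllib.request

# Hypothetical values: the real service URL, path, and field names may differ.
RAG_API_URL = "http://rag-api:8000/ask"


def build_request(question: str, url: str = RAG_API_URL) -> urllib.request.Request:
    """Build a JSON POST request carrying the user's question."""
    body = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(question: str, url: str = RAG_API_URL, timeout: float = 30.0) -> dict:
    """Send the question to the rag-api and return the decoded JSON response."""
    with urllib.request.urlopen(build_request(question, url), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Building the request does not contact the server; only ask() does.
request = build_request("example question")
print(request.get_method(), request.full_url)
```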
The 'ingesting' component is an Argo Workflows workflow that chunks the Polish Penal Code, embeds the chunks, and stores the embeddings in the Milvus vector database.
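The chunking step can be illustrated with a small example. The workflow's actual chunking strategy is not described here, so the fixed-size character chunks with overlap below are an assumption, and `chunk_text`, `chunk_size`, and `overlap` are hypothetical names.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> List[str]:
    """Split text into chunks of at most chunk_size characters.

    Consecutive chunks share `overlap` characters so that a sentence cut at
    a chunk boundary still appears whole in one of the two chunks.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current chunk reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


print(chunk_text("abcdefghij", chunk_size=4, overlap=1))
# → ['abcd', 'defg', 'ghij']
```

Each chunk would then be embedded and written to Milvus together with its vector; those two steps depend on the embedding model and collection schema and are left out of this sketch.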
The application is designed to run in Kubernetes pods.
The Azure Kubernetes Service (AKS) cluster is deployed using Terraform.
Terraform Deployment
- Navigate to the terraform directory.
- Create a terraform.tfvars file and assign values:
resource_group_name = "law-rag-model-rg"
resource_group_location = "West US"
aks_name = "law-rag-model-aks"
dns_prefix = "law-rag-model-dns"
node_count = 1
vm_size = "Standard_B2s"
os_disk_size_gb = 32
subscription_id = "<YOUR_SUBSCRIPTION_ID>"
- Initialize Terraform:
terraform init
- Apply the Terraform configuration to deploy the AKS cluster:
terraform apply
Application Deployment
- Navigate to the deploy directory.
- Use Helm to deploy the application:
helm install <release-name> ./helm-chart
- Alternatively, apply the Kubernetes YAML files:
kubectl apply -f sample.yaml -n <namespace>

