Cloud-Agnostic Auto-Scaling DuckDB Query Platform
Tavana is a production-ready, auto-scaling SQL query engine for data lakes. It combines DuckDB's blazing-fast analytics performance with Kubernetes' orchestration capabilities to provide a serverless-like experience for querying petabytes of data stored in S3, Azure Data Lake, or Google Cloud Storage.
Key Innovation: Adaptive auto-scaling that combines both horizontal (more pods) and vertical (bigger pods) scaling, with intelligent query queuing that never rejects queries and proactively signals for capacity expansion.
Traditional data lake query engines face a classic dilemma:
- Over-provision resources → Waste money on idle infrastructure
- Under-provision resources → Queries fail or time out under load
Tavana solves this with:
- Adaptive Scaling: Automatically scales workers based on actual query workload (not hardcoded thresholds)
- Smart Queuing: FIFO queue that understands Kubernetes capacity and signals for scale-up proactively
- Resource Awareness: Detects cluster limits and operates in two modes (Scaling vs Saturation)
- Zero Rejections: Queries never get rejected; they wait in queue until resources are available
- ✅ PostgreSQL Wire Protocol - Works with any PostgreSQL client (psql, DBeaver, Tableau, Python, JDBC)
- ✅ SSL/TLS Support - Optional encrypted connections with self-signed or custom certificates
- ✅ DuckDB Engine - 10-100x faster than traditional engines for analytical queries
- ✅ S3-Compatible Storage - Query Parquet/CSV/JSON files from S3, ADLS Gen2, GCS, MinIO
- ✅ Delta Lake Support - Native `delta_scan()` for Delta Lake tables with Azure Workload Identity
- ✅ Streaming Results - Memory-efficient row-by-row streaming for large result sets
- ✅ True Streaming Cursors - Server-side cursors with O(n) performance (no re-scanning on FETCH)
- ✅ Horizontal Pod Autoscaler (HPA) - Scales worker count from 1-20 based on queue depth and wait times
- ✅ Vertical Pod Autoscaler (VPA) - Dynamically resizes worker memory (512MB-12GB) based on query needs
- ✅ Pre-Assignment Sizing - Sizes workers BEFORE query execution to prevent OOM
- ✅ Elastic Growth - Grows worker memory DURING execution if utilization is high
- ✅ Smart Shrink - Reduces idle worker memory back to minimum
- ✅ FIFO Fairness - First-in, first-out query execution
- ✅ Capacity-Aware Admission - Checks available memory across all workers
- ✅ Proactive HPA Signaling - Signals Kubernetes to add workers when queue grows
- ✅ Resource Ceiling Detection - Knows when cluster is at max capacity
- ✅ Never Rejects - Blocks caller until capacity is available (no 429 errors)
- ✅ GitOps Ready - Terraform + Helm + ArgoCD deployment
- ✅ Multi-Cloud - Azure, AWS, GCP, on-premise Kubernetes
- ✅ Observable - Prometheus metrics + Grafana dashboards
- ✅ Secure - Pod security policies, network policies, workload identity
- ✅ CI/CD - Automated builds, tests, security scans, image signing
┌──────────────────────────────────────────────────────────────────────┐
│                             CLIENT LAYER                             │
│   Tableau / PowerBI / DBeaver / psql / Python / JDBC (PG Protocol)   │
└──────────────────────────────────┬───────────────────────────────────┘
                                   │ TCP 5432 (PostgreSQL Wire Protocol)
                                   ▼
┌──────────────────────────────────────────────────────────────────────┐
│                 KUBERNETES CLUSTER (tavana namespace)                │
│                                                                      │
│  ┌────────────────────────────────────────────────────────────────┐  │
│  │                     GATEWAY (2+ replicas, HA)                  │  │
│  │                                                                │  │
│  │   PG Wire Handler     QueryQueue (FIFO)     SmartScaler        │  │
│  │   • Auth              • Admission           • Scale workers    │  │
│  │   • Parse SQL         • Capacity check      • Resize memory    │  │
│  │   • Route queries     • FIFO dispatch       • K8s API          │  │
│  └───────────────────────────────┬────────────────────────────────┘  │
│                                  │ gRPC (ExecuteQuery)                │
│  ┌───────────────────────────────▼────────────────────────────────┐  │
│  │             WORKERS (2-20 pods, auto-scaled by HPA)            │  │
│  │                                                                │  │
│  │   Worker 1 ... Worker N, each running:                         │  │
│  │   • DuckDB  • S3 Reader  • Extensions  • Streaming             │  │
│  │   Memory: 512MB-12GB per worker (resized by VPA)               │  │
│  └───────────────────────────────┬────────────────────────────────┘  │
└──────────────────────────────────┼───────────────────────────────────┘
                                   │ S3 API (HTTPS)
                                   ▼
┌──────────────────────────────────────────────────────────────────────┐
│                     OBJECT STORAGE (S3-compatible)                   │
│                                                                      │
│            AWS S3 / Azure ADLS Gen2 / Google GCS / MinIO             │
│            • Parquet files  • CSV files  • JSON files                │
└──────────────────────────────────────────────────────────────────────┘
The Gateway is the query entry point and orchestration layer.
Responsibilities:
- PostgreSQL Wire Protocol: Accepts connections from any PostgreSQL client
- Query Routing: Analyzes SQL and estimates resource requirements
- Queue Management: Manages the FIFO query queue with capacity awareness
- Worker Pool: Tracks worker availability and health
- Auto-Scaling: Triggers HPA (add workers) and VPA (resize workers)
- Metrics & Observability: Exposes Prometheus metrics
Key Modules:
- `pg_wire.rs`: PostgreSQL wire protocol implementation
- `cursors.rs`: Server-side cursor management (DECLARE/FETCH/CLOSE)
- `query_queue.rs`: Smart FIFO queue with capacity awareness
- `smart_scaler.rs`: HPA + VPA scaling logic
- `worker_pool.rs`: Worker discovery and load balancing
- `worker_client.rs`: gRPC client for worker communication (including cursor APIs)
- `data_sizer.rs`: Query cost estimation (S3 HEAD requests)
Technology:
- Language: Rust (for performance and safety)
- Concurrency: Tokio async runtime (handles thousands of concurrent connections)
- Protocol: PostgreSQL wire protocol v3.0
- gRPC: Communicates with workers via gRPC
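To make the concurrency model concrete, here is a minimal sketch of a Tokio accept loop for the wire-protocol listener. The bind address and the connection hand-off are illustrative assumptions, not the actual gateway source:

```rust
use tokio::net::TcpListener;

// Minimal sketch (assumed, not the real gateway code): accept PostgreSQL
// clients on port 5432 and spawn one lightweight task per connection.
#[tokio::main]
async fn main() -> std::io::Result<()> {
    let listener = TcpListener::bind("0.0.0.0:5432").await?;
    loop {
        let (socket, peer) = listener.accept().await?;
        tokio::spawn(async move {
            // Hypothetical hand-off point: the real code would run the
            // PostgreSQL wire-protocol state machine (pg_wire.rs) here.
            println!("client connected from {peer}");
            drop(socket);
        });
    }
}
```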
Workers execute DuckDB queries and stream results back to clients.
Responsibilities:
- Query Execution: Runs DuckDB queries on Parquet/CSV/JSON files
- S3 Integration: Reads data from object storage via DuckDB's httpfs extension
- Result Streaming: Streams results row-by-row to prevent memory exhaustion
- Extension Management: Pre-loads DuckDB extensions (httpfs, parquet)
- Resource Reporting: Reports memory/CPU usage to Gateway
Key Features:
- In-Memory DuckDB: Each worker runs an in-memory DuckDB instance
- Extension Pre-Installation: DuckDB extensions are pre-downloaded at build time
- True Streaming Cursors: Holds DuckDB Arrow iterators for server-side cursor support
- Memory Safety: Uses Rust's ownership model to prevent memory leaks
- Graceful Shutdown: Completes running queries before termination
Technology:
- Language: Rust with DuckDB C API bindings
- Database: DuckDB (in-memory, columnar, OLAP-optimized)
- Storage: Reads from S3 via DuckDB's httpfs extension
- gRPC Server: Receives query requests from Gateway
- Cursor Manager: Manages server-side cursors with idle cleanup
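As a rough illustration of the execution path, the sketch below uses the `duckdb` Rust crate to run a query and iterate Arrow RecordBatches. The real worker wraps this in a gRPC streaming response; the SQL string and the printing are placeholders:

```rust
use duckdb::{Connection, Result};

// Sketch only: execute a query in an in-memory DuckDB and walk the
// lazily produced Arrow batches instead of materializing all rows.
fn execute_streaming(sql: &str) -> Result<()> {
    let conn = Connection::open_in_memory()?;
    // Extensions are pre-downloaded at image build time, so LOAD is cheap.
    conn.execute_batch("LOAD httpfs; LOAD parquet;")?;
    let mut stmt = conn.prepare(sql)?;
    // query_arrow yields RecordBatches lazily; this is the iterator that
    // server-side cursors hold on to between FETCH calls.
    for batch in stmt.query_arrow([])? {
        println!("streamed a batch of {} rows", batch.num_rows());
    }
    Ok(())
}
```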
The QueryQueue is the brain of admission control and capacity management.
How It Works:
- Query Arrives: Client sends SQL via PostgreSQL protocol
- Cost Estimation: Gateway estimates memory needed (via S3 HEAD requests on Parquet files)
- Capacity Check: Compute total available memory across all workers: `Available = Σ (worker.memory_limit - worker.memory_used)`
- Admission Decision:
  - If capacity is available → Dispatch the query immediately to a worker
  - If no capacity:
    - Below the resource ceiling → Signal HPA to add workers, queue the query
    - At the resource ceiling → Queue the query, return an estimated wait time
- FIFO Dispatch: As workers complete queries and free capacity, dispatch queued queries in order
Key Behaviors:
- Never Rejects: Queries wait in queue until capacity is available (no 429 errors)
- Proactive Scaling: Signals HPA when queue depth > 0 (don't wait for timeout)
- Fair Scheduling: First-in, first-out (no query starvation)
- Capacity-Aware: Understands total cluster capacity and current utilization
Metrics Exposed:
- `tavana_query_queue_depth`: Number of queries waiting
- `tavana_query_queue_wait_seconds`: Time queries wait before execution
- `tavana_available_capacity_mb`: Total unused memory across workers
- `tavana_resource_ceiling_mb`: Maximum available cluster memory
- `tavana_operation_mode`: Current mode (0=Scaling, 1=Saturation)
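For reference, a minimal sketch of registering the first two of these metrics with the `prometheus` crate; the metric names match the list above, and the wiring into the queue itself is omitted:

```rust
use prometheus::{register_histogram, register_int_gauge, Histogram, IntGauge};

// Illustrative sketch: register queue metrics under the names listed above.
fn register_queue_metrics() -> (IntGauge, Histogram) {
    let depth = register_int_gauge!(
        "tavana_query_queue_depth",
        "Number of queries waiting"
    )
    .expect("gauge registration");
    let wait = register_histogram!(
        "tavana_query_queue_wait_seconds",
        "Time queries wait before execution"
    )
    .expect("histogram registration");
    (depth, wait)
}
```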
The SmartScaler orchestrates both horizontal and vertical scaling.
Horizontal Pod Autoscaler (HPA):
Scales the number of worker pods based on:
- Queue Depth: If `queue_depth > 0`, scale up
- Wait Time: If `avg_wait_time > 30s`, scale up
- Worker Utilization: If `cpu > 70%` or `memory > 80%`, scale up
- Idle Time: If `queue_depth == 0` for 5+ minutes, scale down
Vertical Pod Autoscaler (VPA):
Dynamically resizes worker memory:
- Pre-Assignment Sizing: Before dispatching a query, resize the worker to fit the estimated data size: `new_memory = max(estimated_data_size * 1.5, current_memory)`
- Elastic Growth: During execution, if `memory_used > 80%` of the limit, grow memory: `new_memory = min(current_memory * 1.5, 12GB)`
- Smart Shrink: After a query completes, if the worker is idle for 10+ minutes, shrink to the minimum: `new_memory = 512MB`
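The three rules reduce to a few pure functions. A compact sketch, with constants and names assumed for illustration (the real scaler applies the results through the Kubernetes API):

```rust
const MIN_MEMORY_MB: u64 = 512; // minimum worker size
const MAX_MEMORY_MB: u64 = 12 * 1024; // VPA ceiling (12GB)

/// Pre-assignment sizing: fit the estimated data before dispatch.
fn pre_assignment_size(estimated_data_mb: u64, current_mb: u64) -> u64 {
    ((estimated_data_mb as f64 * 1.5) as u64)
        .max(current_mb)
        .min(MAX_MEMORY_MB)
}

/// Elastic growth: grow 1.5x once usage crosses 80% of the limit.
fn elastic_growth(current_mb: u64, used_mb: u64) -> Option<u64> {
    if used_mb as f64 > current_mb as f64 * 0.8 {
        Some(((current_mb as f64 * 1.5) as u64).min(MAX_MEMORY_MB))
    } else {
        None // within budget, no resize
    }
}

/// Smart shrink: an idle worker returns to the 512MB floor.
fn smart_shrink() -> u64 {
    MIN_MEMORY_MB
}
```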
Kubernetes Integration:
- Uses K8s API to scale Deployments (HPA)
- Uses K8s API to update Pod resources (VPA, requires K8s 1.27+)
- Monitors Node allocatable memory (resource ceiling detection)
Scaling Policies:
Scale Up:
• Trigger: queue_depth > 0 OR wait_time > 30s OR utilization > 70%
• Cooldown: 60 seconds
• Max increment: 4 pods at once

Scale Down:

• Trigger: queue_depth == 0 AND idle_time > 5min AND utilization < 30%
• Cooldown: 300 seconds (5 minutes)
• Max decrement: 50% of current pods

Example scale-up timeline:

Time 0s: 2 workers, queue empty, system idle
├─ Worker-1: 512MB, 0% used
└─ Worker-2: 512MB, 0% used

Time 10s: 10 large queries arrive (each needs 2GB)
├─ Query 1-2: Dispatched to Worker-1 & Worker-2
├─ Queries 3-10: Enter queue (no capacity)
└─ QueryQueue signals HPA: "Need more workers!"

Time 20s: HPA creates 4 new workers
├─ Worker-3, Worker-4, Worker-5, Worker-6: Starting...
└─ Queue depth: 8

Time 40s: New workers ready, VPA resizes them to 2GB each
├─ Queries 3-6: Dispatched to Worker-3,4,5,6
└─ Queue depth: 4

Time 60s: More workers added, all queries dispatched
├─ 6 workers total, all executing queries
└─ Queue depth: 0

Time 300s: All queries complete, workers idle
└─ SmartScaler: "Workers idle for 5 minutes, scale down"

Time 360s: HPA removes 4 workers, back to 2
└─ System at steady state

Example saturation-mode timeline:

Cluster: 64GB total node memory, 60GB allocatable

Time 0s: 20 workers (max limit), each using 3GB = 60GB total
└─ System at resource ceiling

Time 10s: New query arrives, needs 2GB
├─ Available capacity: 0GB
├─ At ceiling: Cannot add more workers
├─ QueryQueue: Switch to "Saturation Mode"
└─ Action: Queue query, return estimated wait time (2 minutes)

Time 30s: Worker-5 completes its query, frees 2GB
├─ Available capacity: 2GB
└─ QueryQueue: Dispatch queued query to Worker-5

Time 31s: Query starts executing on Worker-5
└─ Queue back to 0
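The README does not spell out how the two-minute estimate is derived; one plausible FIFO-based sketch, assuming a rolling average of recent query durations, would be:

```rust
// Hypothetical wait estimate (not the documented algorithm): in FIFO order,
// each "wave" of completions frees roughly `active_workers` slots at once.
fn estimated_wait_secs(queue_position: usize, active_workers: usize, avg_query_secs: f64) -> f64 {
    let waves = (queue_position as f64 / active_workers.max(1) as f64).ceil();
    waves * avg_query_secs
}
```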
Example query:

SELECT * FROM read_parquet('s3://my-bucket/sales/*.parquet')
WHERE year = 2024;

Step 1: Client Connection
psql -h gateway.tavana.svc -p 5432 -U tavana
  │
  └─> Gateway: PostgreSQL wire protocol handler accepts connection

Step 2: Query Parsing & Estimation

Gateway (QueryRouter):
  │
  ├─> Parse SQL: "SELECT * FROM read_parquet('s3://...')"
  ├─> Extract S3 path: "s3://my-bucket/sales/*.parquet"
  ├─> Call DataSizer: Estimate data size
  │     │
  │     └─> S3 HEAD requests to get Parquet file sizes
  │           ├─ sales/2024-01.parquet: 500MB
  │           ├─ sales/2024-02.parquet: 480MB
  │           └─ Total: 980MB
  │
  └─> Estimated memory needed: 980MB * 1.5 = 1.47GB

Step 3: Admission Control (QueryQueue)

QueryQueue:
  │
  ├─> Check available capacity across all workers:
  │     Worker-1: 4GB limit, 1GB used → 3GB available
  │     Worker-2: 4GB limit, 3GB used → 1GB available
  │     Total available: 4GB
  │
  ├─> Query needs 1.47GB, available 4GB → Can admit
  │
  └─> Dispatch to Worker-1 (most available capacity)

Step 4: VPA Pre-Assignment Sizing

SmartScaler:
  │
  ├─> Worker-1 current memory: 4GB
  ├─> Query needs: 1.47GB
  ├─> Calculation: 1.47GB < 4GB * 0.8 → No resize needed
  │
  └─> Dispatch query to Worker-1 without resize

Step 5: Query Execution (Worker)

Worker-1 (DuckDB):
  │
  ├─> Receive gRPC request: ExecuteQuery
  ├─> Parse SQL and create DuckDB query plan
  ├─> DuckDB httpfs extension: Open S3 connection
  ├─> Read Parquet files:
  │     ├─ Stream sales/2024-01.parquet (500MB)
  │     ├─ Apply WHERE filter: year = 2024
  │     └─ Stream sales/2024-02.parquet (480MB)
  │
  ├─> Memory usage peaks at 1.2GB (within 4GB limit)
  │
  └─> Stream results back to Gateway via gRPC

Step 6: Result Streaming

Gateway → Client:
  │
  ├─> Receive gRPC stream from Worker-1
  ├─> Convert Arrow format → PostgreSQL row format
  ├─> Stream rows to client over TCP socket
  │     Row 1: {id: 1, product: "Widget", amount: 99.99, ...}
  │     Row 2: {id: 2, product: "Gadget", amount: 149.99, ...}
  │     ...
  │
  └─> Client receives results incrementally (no buffering)

Step 7: Cleanup

Query completes:
  │
  ├─> Worker-1: Release DuckDB resources, memory drops to 1GB
  ├─> Gateway: Mark Worker-1 as available (3GB free)
  ├─> QueryQueue: Check if any queued queries can now dispatch
  │
  └─> Metrics:
        ├─ tavana_query_duration_seconds: 5.2s
        ├─ tavana_data_scanned_bytes: 980MB
        └─ tavana_active_queries: 0
Tavana supports PostgreSQL-compatible server-side cursors with true streaming - meaning data is scanned only once, and each FETCH operation advances an iterator rather than re-executing the query.
| Approach | FETCH 1 | FETCH 2 | FETCH 3 | FETCH N | Complexity |
|---|---|---|---|---|---|
| LIMIT/OFFSET | Scan 1000 rows | Scan 2000 rows | Scan 3000 rows | Scan N×1000 rows | O(n²) |
| True Streaming | Scan 1000 rows | Advance iterator | Advance iterator | Just advance | O(n) |
┌──────────────┐   DECLARE    ┌──────────────┐  DeclareCursor  ┌──────────────┐
│    Client    │─────────────▶│   Gateway    │────────────────▶│    Worker    │
│  (Tableau)   │              │  (pg_wire)   │                 │   (DuckDB)   │
└──────────────┘              └──────────────┘                 └──────────────┘
                              Store CursorState:               Execute SQL once,
                              - worker_id (affinity)           buffer Arrow
                              - uses_true_streaming=true       RecordBatches

┌──────────────┐   FETCH N    ┌──────────────┐   FetchCursor   ┌──────────────┐
│    Client    │─────────────▶│   Gateway    │────────────────▶│    Worker    │
│  (Tableau)   │◀─────────────│  (pg_wire)   │◀────────────────│   (DuckDB)   │
└──────────────┘  stream rows └──────────────┘  return N rows  └──────────────┘
                                                               Advance iterator
                                                               (NO RE-SCAN!)
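On the worker side, the cursor table can be pictured as in the sketch below: each open cursor owns a live Arrow iterator, so FETCH advances it instead of re-running the query. Types and batch granularity here are assumptions for illustration, not the actual cursor_manager.rs:

```rust
use std::collections::HashMap;
use std::time::Instant;

use arrow::record_batch::RecordBatch;

// Illustrative cursor state: the held iterator is what makes FETCH O(n).
struct CursorState {
    batches: Box<dyn Iterator<Item = RecordBatch> + Send>,
    last_used: Instant, // idle-timeout cleanup keys off this
}

struct CursorManager {
    cursors: HashMap<String, CursorState>, // cursor name -> state
}

impl CursorManager {
    /// FETCH: advance the stored iterator; nothing is re-scanned.
    /// (Real code slices to an exact row count; batches suffice here.)
    fn fetch(&mut self, name: &str, n_batches: usize) -> Vec<RecordBatch> {
        match self.cursors.get_mut(name) {
            Some(cursor) => {
                cursor.last_used = Instant::now();
                cursor.batches.by_ref().take(n_batches).collect()
            }
            None => Vec::new(),
        }
    }

    /// CLOSE: drop the iterator and free buffered state.
    fn close(&mut self, name: &str) {
        self.cursors.remove(name);
    }
}
```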
-- Efficient scrolling through 1 million rows
BEGIN;
-- Declare cursor (executes query ONCE, buffers results on worker)
DECLARE sales_cursor CURSOR FOR
SELECT * FROM delta_scan('az://bucket/sales/');
-- Fetch first batch (reads from buffer)
FETCH 2000 FROM sales_cursor;
-- Fetch next batch (advances iterator - no re-scan!)
FETCH 2000 FROM sales_cursor;
-- Continue scrolling...
FETCH 2000 FROM sales_cursor;
-- Done
CLOSE sales_cursor;
COMMIT;

| Setting | Description | Default |
|---|---|---|
| `MAX_CURSORS` | Max concurrent cursors per worker | 100 |
| `CURSOR_IDLE_TIMEOUT_SECS` | Auto-close idle cursors after | 300 (5 min) |
| `CURSOR_CLEANUP_INTERVAL_SECS` | Cleanup check interval | 60 (1 min) |
tavana_active_cursors # Currently open cursors across all workers
tavana_cursor_fetch_count # Total FETCH operations
tavana_cursor_rows_fetched # Total rows fetched via cursors
# Deploy everything: AKS cluster, Tavana, monitoring
./deploy.sh --subscription-id YOUR_AZURE_SUB_ID --env dev

# 1. Create Kind cluster
kind create cluster --name tavana
# 2. Pull images
docker pull angelerator/tavana-gateway:v1.0.0
docker pull angelerator/tavana-worker:v1.0.0
# 3. Load into Kind
kind load docker-image angelerator/tavana-gateway:v1.0.0 --name tavana
kind load docker-image angelerator/tavana-worker:v1.0.0 --name tavana
# 4. Install with Helm
helm install tavana ./helm/tavana \
--set global.imageTag=v1.0.0 \
--namespace tavana \
--create-namespace
# 5. Port forward
kubectl port-forward -n tavana svc/tavana-gateway 5432:5432
# 6. Connect
PGPASSWORD=tavana psql -h localhost -p 5432 -U tavana -d tavana

# 1. Deploy Azure infrastructure
cd terraform/azure/examples/quickstart
cp terraform.tfvars.example terraform.tfvars
# Edit terraform.tfvars with your values
terraform init
terraform apply
# 2. Get kubectl credentials
az aks get-credentials \
--resource-group tavana-dev-rg \
--name tavana-dev-aks
# 3. Install Tavana with Helm
helm install tavana oci://ghcr.io/angelerator/charts/tavana \
--version 1.0.0 \
--namespace tavana \
--create-namespace \
--values ./helm/tavana/values-azure.yaml
# 4. Test connection
kubectl port-forward -n tavana svc/tavana-gateway 5432:5432
psql -h localhost -p 5432 -U tavana

-- Query a single Parquet file
SELECT * FROM read_parquet('s3://my-bucket/data.parquet') LIMIT 10;
-- Query with wildcard (multiple files)
SELECT * FROM read_parquet('s3://my-bucket/sales/*.parquet');
-- Query with partitioning
SELECT * FROM read_parquet('s3://my-bucket/sales/year=*/month=*/*.parquet');

-- Query Delta Lake table from Azure
SELECT * FROM delta_scan('az://container/path/to/delta_table/') LIMIT 100;
-- Delta Lake with Azure blob storage URL
SELECT * FROM delta_scan('abfss://container@account.dfs.core.windows.net/table/');
-- Time travel (query historical version)
SELECT * FROM delta_scan('az://container/table/', version := 5);
-- Efficient scrolling with cursors
BEGIN;
DECLARE data_cursor CURSOR FOR
SELECT * FROM delta_scan('az://container/large_table/');
FETCH 5000 FROM data_cursor; -- First batch
FETCH 5000 FROM data_cursor; -- Next batch (no re-scan!)
CLOSE data_cursor;
COMMIT;

-- Time-series aggregation
SELECT
date_trunc('month', order_date) as month,
COUNT(*) as order_count,
SUM(total_amount) as revenue,
AVG(total_amount) as avg_order_value
FROM read_parquet('s3://my-bucket/orders/**/*.parquet')
WHERE order_date >= '2024-01-01'
GROUP BY 1
ORDER BY 1;
-- Top N analysis
SELECT
product_category,
SUM(quantity) as total_sold,
SUM(revenue) as total_revenue
FROM read_parquet('s3://my-bucket/sales/**/*.parquet')
GROUP BY 1
ORDER BY total_revenue DESC
LIMIT 10;

-- Join multiple datasets
SELECT
c.customer_name,
COUNT(o.order_id) as order_count,
SUM(o.total_amount) as lifetime_value
FROM read_parquet('s3://my-bucket/customers/*.parquet') c
JOIN read_parquet('s3://my-bucket/orders/*.parquet') o
ON c.customer_id = o.customer_id
GROUP BY 1
ORDER BY lifetime_value DESC
LIMIT 100;

-- Window functions
SELECT
customer_id,
order_date,
total_amount,
SUM(total_amount) OVER (
PARTITION BY customer_id
ORDER BY order_date
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) as running_total
FROM read_parquet('s3://my-bucket/orders/*.parquet')
ORDER BY customer_id, order_date;

tavana_query_queue_depth # Current number of queued queries
tavana_query_queue_wait_seconds # Histogram of wait times
tavana_query_queue_rejected_total # Count of rejected queries (should be 0)
tavana_available_capacity_mb # Available memory across all workers
tavana_resource_ceiling_mb # Max available cluster memory
tavana_operation_mode # 0=Scaling, 1=Saturation
tavana_query_duration_seconds # Histogram of query execution times
tavana_active_queries # Currently executing queries
tavana_query_errors_total # Failed queries by error type
tavana_data_scanned_bytes # Amount of data read from S3
tavana_worker_memory_bytes # Per-worker memory usage
tavana_worker_cpu_usage_percent # Per-worker CPU usage
tavana_worker_count # Current number of worker pods
tavana_worker_pool_size_changes_total # HPA scale events
tavana_hpa_scale_up_signal # HPA scale-up triggers
tavana_vpa_resize_events_total # VPA resize events
tavana_scaling_duration_seconds # Time to add/resize workers
Import pre-built dashboards from k8s/monitoring/:
- Tavana Overview: Query throughput, latency, error rate
- Auto-Scaling: HPA/VPA activity, worker count, resource utilization
- Queue Analytics: Queue depth, wait times, capacity utilization
- Worker Performance: Per-worker metrics, query distribution
securityContext:
runAsNonRoot: true
runAsUser: 1000
fsGroup: 1000
readOnlyRootFilesystem: true
allowPrivilegeEscalation: false
capabilities:
drop:
- ALL

# Default deny all traffic
kind: NetworkPolicy
metadata:
name: default-deny-all
# Allow Gateway β Worker
kind: NetworkPolicy
metadata:
name: allow-gateway-to-worker
spec:
podSelector:
matchLabels:
app: tavana-worker
ingress:
- from:
- podSelector:
matchLabels:
app: tavana-gateway
ports:
- protocol: TCP
port: 50053

Azure:
serviceAccount:
annotations:
azure.workload.identity/client-id: YOUR_CLIENT_ID
azure.workload.identity/tenant-id: YOUR_TENANT_ID

AWS:
serviceAccount:
annotations:
eks.amazonaws.com/role-arn: arn:aws:iam::ACCOUNT:role/tavana-role

graph TB
subgraph "Client Layer"
Client[SQL Client<br/>psql, DBeaver, Tableau]
end
subgraph "Kubernetes Cluster"
subgraph "Gateway Pods (2+ replicas)"
PG[PostgreSQL<br/>Wire Protocol]
QR[Query Router]
QQ[QueryQueue<br/>FIFO]
DS[DataSizer<br/>Cost Estimator]
SS[SmartScaler<br/>HPA + VPA]
WP[Worker Pool<br/>Manager]
PG --> QR
QR --> DS
QR --> QQ
QQ --> SS
QQ --> WP
SS --> K8S[Kubernetes API]
end
subgraph "Worker Pods (2-20, auto-scaled)"
W1[Worker 1<br/>DuckDB<br/>512MB-12GB]
W2[Worker 2<br/>DuckDB<br/>512MB-12GB]
WN[Worker N<br/>DuckDB<br/>512MB-12GB]
end
subgraph "Monitoring"
PROM[Prometheus]
GRAF[Grafana]
end
end
subgraph "Object Storage"
S3[S3/ADLS/GCS<br/>Parquet/CSV/JSON]
end
Client -.->|1. SQL Query| PG
DS -->|2. S3 HEAD Request| S3
WP -->|3. gRPC ExecuteQuery| W1
WP -->|3. gRPC ExecuteQuery| W2
WP -->|3. gRPC ExecuteQuery| WN
W1 -->|4. Read Data| S3
W2 -->|4. Read Data| S3
WN -->|4. Read Data| S3
W1 -.->|5. Stream Results| PG
W2 -.->|5. Stream Results| PG
WN -.->|5. Stream Results| PG
PG -.->|6. Return Rows| Client
SS -->|Scale Workers| K8S
K8S -->|Create/Resize| W1
K8S -->|Create/Resize| W2
K8S -->|Create/Resize| WN
PG -->|Metrics| PROM
W1 -->|Metrics| PROM
W2 -->|Metrics| PROM
WN -->|Metrics| PROM
PROM -->|Query| GRAF
style PG fill:#e1f5ff
style QQ fill:#fff3cd
style SS fill:#d4edda
style W1 fill:#f8d7da
style W2 fill:#f8d7da
style WN fill:#f8d7da
sequenceDiagram
participant C as Client
participant G as Gateway
participant DS as DataSizer
participant QQ as QueryQueue
participant SS as SmartScaler
participant K8S as Kubernetes
participant W as Worker
participant S3 as Object Storage
C->>G: SQL Query (PostgreSQL protocol)
activate G
G->>G: Parse SQL
Note over G: Extract S3 paths
G->>DS: Estimate query cost
activate DS
DS->>S3: S3 HEAD requests (file sizes)
S3-->>DS: File metadata (980MB)
DS-->>G: Estimated memory: 1.47GB
deactivate DS
G->>QQ: Enqueue(query, cost=1.47GB)
activate QQ
QQ->>QQ: Check available capacity
Note over QQ: Worker-1: 3GB free<br/>Worker-2: 1GB free<br/>Total: 4GB
alt Capacity Available
QQ->>QQ: Admission granted
QQ->>SS: Check if VPA resize needed
activate SS
SS->>K8S: Check worker resources
K8S-->>SS: Worker-1: 4GB limit
SS-->>QQ: No resize needed (1.47GB < 80% of 4GB)
deactivate SS
QQ->>W: Dispatch to Worker-1
deactivate QQ
activate W
W->>W: Initialize DuckDB
W->>S3: Read Parquet files
activate S3
S3-->>W: Stream data (980MB)
deactivate S3
W->>W: Execute query
Note over W: Memory peaks at 1.2GB
W->>G: Stream results (gRPC)
deactivate W
G->>C: Stream rows (PostgreSQL protocol)
deactivate G
else No Capacity
QQ->>SS: Signal scale-up needed
activate SS
SS->>K8S: Create 2 new worker pods
activate K8S
K8S-->>SS: Workers starting...
deactivate K8S
deactivate SS
Note over QQ: Query waits in queue (FIFO)
QQ-->>C: Blocking wait (transparent to client)
Note over K8S: Workers become ready (30-60s)
K8S->>QQ: Worker-3 ready
K8S->>QQ: Worker-4 ready
QQ->>W: Dispatch to Worker-3
Note over W,S3: Query executes...
W->>G: Stream results
G->>C: Return rows
end
flowchart TD
Start([Query Arrives]) --> Parse[Parse SQL and<br/>Extract S3 Paths]
Parse --> Estimate[Estimate Query Cost<br/>via DataSizer]
Estimate --> Check{Available<br/>Capacity?}
Check -->|Yes: cost within limit| Dispatch[Dispatch to Worker]
Check -->|No: cost exceeds limit| CheckCeiling{At Resource<br/>Ceiling?}
CheckCeiling -->|No: Can Scale| Signal[Signal HPA<br/>Scale-Up]
Signal --> Queue[Add to FIFO Queue]
Queue --> Wait[Block and Wait<br/>for Capacity]
Wait --> Recheck{Capacity<br/>Available?}
Recheck -->|No| Wait
Recheck -->|Yes| Dispatch
CheckCeiling -->|Yes: Max Capacity| Saturation[Enter Saturation Mode]
Saturation --> EstWait[Estimate Wait Time<br/>Based on Active Queries]
EstWait --> QueueSat[Add to FIFO Queue]
QueueSat --> WaitSat[Block and Wait<br/>Return Est Time]
WaitSat --> RecheckSat{Capacity<br/>Freed?}
RecheckSat -->|No| WaitSat
RecheckSat -->|Yes| Dispatch
Dispatch --> VPACheck{VPA Resize<br/>Needed?}
VPACheck -->|Yes: cost exceeds 80 percent| Resize[Resize Worker<br/>Memory]
VPACheck -->|No| Execute[Execute Query<br/>on Worker]
Resize --> Execute
Execute --> Stream[Stream Results<br/>to Client]
Stream --> Complete[Mark Worker<br/>Available]
Complete --> TryDispatch[Try Dispatch<br/>Queued Queries]
TryDispatch --> End([Done])
style Check fill:#fff3cd
style CheckCeiling fill:#f8d7da
style Dispatch fill:#d4edda
style Queue fill:#e1f5ff
style Signal fill:#d1ecf1
flowchart TD
Start([Scaling Loop<br/>Every 30s]) --> CheckMode{Current<br/>Mode?}
CheckMode -->|Scaling Mode| ScalingChecks[Check Scale Triggers]
CheckMode -->|Saturation Mode| SaturationChecks[Conservative Scaling]
ScalingChecks --> QueueDepth{Queue<br/>Depth greater than 0?}
QueueDepth -->|Yes| ScaleUp[Scale Up<br/>Add 2 to 4 Workers]
QueueDepth -->|No| WaitTime{Avg Wait<br/>Time over 30s?}
WaitTime -->|Yes| ScaleUp
WaitTime -->|No| CPUHigh{CPU over 70 percent<br/>or Mem over 80 percent?}
CPUHigh -->|Yes| ScaleUp
CPUHigh -->|No| IdleCheck{Queue Empty<br/>for 5+ min?}
IdleCheck -->|Yes| ActiveQ{Active<br/>Queries?}
ActiveQ -->|No| Utilization{Worker<br/>Util under 30 percent?}
Utilization -->|Yes| ScaleDown[Scale Down<br/>Remove 50 percent Workers<br/>Min: 2]
Utilization -->|No| NoAction[No Action]
ActiveQ -->|Yes| NoAction
IdleCheck -->|No| NoAction
SaturationChecks --> SatQueueDepth{Queue<br/>Very Deep?<br/>Over 20}
SatQueueDepth -->|Yes| TryScaleUp{Can Add<br/>Workers?}
TryScaleUp -->|Yes: Below ceiling| ScaleUpSat[Scale Up<br/>Add 1 Worker<br/>Conservative]
TryScaleUp -->|No: At ceiling| NoActionSat[No Action<br/>Wait for Capacity]
SatQueueDepth -->|No| LongIdle{Idle<br/>over 10 min?}
LongIdle -->|Yes| ScaleDownSat[Scale Down<br/>Remove 1 Worker<br/>Slow and Safe]
LongIdle -->|No| NoActionSat
ScaleUp --> UpdateMetrics[Update Metrics<br/>hpa_scale_up_signal]
ScaleDown --> UpdateMetricsDown[Update Metrics<br/>hpa_scale_down_total]
ScaleUpSat --> UpdateMetrics
ScaleDownSat --> UpdateMetricsDown
NoAction --> End([Sleep 30s])
NoActionSat --> End
UpdateMetrics --> Cooldown[Cooldown 60s<br/>Prevent Flapping]
UpdateMetricsDown --> CooldownDown[Cooldown 300s<br/>5 min]
Cooldown --> End
CooldownDown --> End
style ScaleUp fill:#d4edda
style ScaleDown fill:#f8d7da
style NoAction fill:#e2e3e5
style QueueDepth fill:#fff3cd
style CheckMode fill:#d1ecf1
flowchart TD
Start([Query Dispatch]) --> EstCost[Estimated Cost<br/>from DataSizer]
EstCost --> PreCheck{Pre-Assignment<br/>Check}
PreCheck --> CalcNeeded[Calculate Needed Memory<br/>needed equals cost times 1.5]
CalcNeeded --> GetCurrent[Get Worker Current Limit]
GetCurrent --> Compare{needed greater than<br/>current times 0.8?}
Compare -->|Yes| PreResize[Pre-Assignment Resize]
PreResize --> CalcNew[Calculate new size<br/>max of needed and current]
CalcNew --> CapNew{new greater than<br/>12GB max?}
CapNew -->|new > 12GB| SetMax[Set to 12GB max]
CapNew -->|new <= 12GB| SetNew[Set to new value]
SetMax --> ApplyPre[Apply VPA Resize<br/>K8s API]
SetNew --> ApplyPre
ApplyPre --> WaitReady[Wait for Pod Ready<br/>10-30s]
WaitReady --> Execute[Execute Query]
Compare -->|No| Execute
Execute --> Monitor[Monitor Execution]
Monitor --> ElasticCheck{During Execution<br/>Memory Usage?}
ElasticCheck --> CheckUtil{Util greater than<br/>80 percent limit?}
CheckUtil -->|Yes| ElasticGrow[Elastic Growth<br/>new equals current times 1.5]
ElasticGrow --> CapElastic{new greater than<br/>12GB max?}
CapElastic -->|Yes| ElasticMax[Set to 12GB max]
CapElastic -->|No| ElasticSet[Set to new value]
ElasticMax --> ApplyElastic[Apply VPA Resize<br/>In-Place if K8s 1.27+]
ElasticSet --> ApplyElastic
ApplyElastic --> ContinueExec[Continue Execution]
ContinueExec --> Complete
CheckUtil -->|No| Complete[Query Completes]
Complete --> FreeCheck[Mark Worker Available]
FreeCheck --> IdleTimer[Start Idle Timer]
IdleTimer --> IdleCheck{Idle for<br/>10+ min?}
IdleCheck -->|No| WaitIdle[Wait...]
WaitIdle --> IdleCheck
IdleCheck -->|Yes| NoQueries{No active<br/>queries?}
NoQueries -->|Yes| Shrink[Smart Shrink<br/>new equals 512MB min]
NoQueries -->|No| NoShrink[No Shrink<br/>Keep Current]
Shrink --> ApplyShrink[Apply VPA Resize]
ApplyShrink --> End([Ready for Next Query])
NoShrink --> End
style PreResize fill:#d1ecf1
style ElasticGrow fill:#fff3cd
style Shrink fill:#f8d7da
style Execute fill:#d4edda
stateDiagram-v2
[*] --> Initializing
Initializing --> FetchingNodeInfo: Start Ceiling Updater
FetchingNodeInfo --> CalculatingCeiling: Query K8s Nodes
CalculatingCeiling --> EvaluatingUtilization: Calculate Total<br/>Allocatable Memory
EvaluatingUtilization --> ScalingMode: Utilization under 80 percent
EvaluatingUtilization --> ApproachingCeiling: Utilization 80 to 90 percent
EvaluatingUtilization --> SaturationMode: Utilization over 90 percent
ScalingMode --> MonitoringScaling: Update Mode Metric<br/>operation_mode equals 0
MonitoringScaling --> FetchingNodeInfo: Wait 60s
ApproachingCeiling --> MonitoringApproaching: Log Warning<br/>Approaching Limits
MonitoringApproaching --> FetchingNodeInfo: Wait 30s
SaturationMode --> MonitoringSaturation: Update Mode Metric<br/>operation_mode equals 1<br/>Alert Operators
MonitoringSaturation --> FetchingNodeInfo: Wait 15s
state ScalingMode {
[*] --> Proactive
Proactive --> AggressiveHPA: Queue Depth over 0
AggressiveHPA --> [*]: Scale add 4 Workers
Proactive --> NormalVPA: Query Cost High
NormalVPA --> [*]: Resize Freely
}
state SaturationMode {
[*] --> Conservative
Conservative --> SlowHPA: Queue Depth over 20
SlowHPA --> [*]: Scale add 1 Worker<br/>If Possible
Conservative --> NoVPA: High Cost Query
NoVPA --> [*]: Reject or<br/>Warn User
}
note right of ScalingMode
Cluster has headroom
Scale aggressively
Low risk of OOM
end note
note right of SaturationMode
Cluster at max capacity
Be conservative
High risk of OOM
end note
sequenceDiagram
participant QR as Query Router
participant DS as DataSizer
participant S3 as Object Storage
participant Cache as Estimation Cache
QR->>QR: Parse SQL
Note over QR: Extract S3 path<br/>s3://bucket/sales/*.parquet
QR->>DS: Estimate cost for path
activate DS
DS->>Cache: Check cache
Cache-->>DS: Cache miss
DS->>DS: Expand wildcard
Note over DS: List all matching files
DS->>S3: S3 LIST objects<br/>bucket/sales/
activate S3
S3-->>DS: file1.parquet<br/>file2.parquet<br/>and more
deactivate S3
loop For each file
DS->>S3: S3 HEAD request
activate S3
S3-->>DS: Content-Length: 500MB
deactivate S3
DS->>DS: Accumulate size
end
DS->>DS: Calculate total
Note over DS: Total raw size: 2GB
DS->>DS: Apply estimation formula
Note over DS: Estimated memory equals<br/>raw_size times ratio times factor<br/>2GB times 0.3 times 1.5<br/>equals 900MB
DS->>Cache: Store result<br/>TTL: 5 min
DS-->>QR: Estimated cost: 900MB
deactivate DS
Note over Cache: Future queries to same<br/>path serve from cache
graph LR
A[Raw Data Size<br/>from S3] --> B[Compression Ratio<br/>Parquet: 0.2 to 0.4x<br/>CSV: 1.0x]
B --> C[Decompressed Size<br/>equals raw times ratio]
C --> D[Working Set<br/>plus columnar projections<br/>plus filters]
D --> E[Safety Factor<br/>times 1.5]
E --> F[Final Estimate]
style A fill:#e1f5ff
style C fill:#fff3cd
style E fill:#f8d7da
style F fill:#d4edda
Estimation Components:
- Raw Size: Sum of all file sizes from S3 (via HEAD requests)
- Compression Ratio:
- Parquet: 0.2-0.4x (highly compressed)
- CSV: 1.0x (uncompressed)
- JSON: 0.8x
- Working Set Adjustment:
  - `SELECT *`: Full decompressed size
  - Column projections: Proportion of selected columns
  - Filters: Estimated selectivity (conservative: assume 50%)
- Safety Factor: 1.5x to account for:
- Query intermediate results
- DuckDB internal buffers
- Hash tables for joins/aggregations
- Sort buffers
Example:
Query: SELECT customer_id, amount FROM read_parquet('s3://bucket/*.parquet')
WHERE date >= '2024-01-01'
Step 1: S3 HEAD requests
- file1.parquet: 1GB
- file2.parquet: 1GB
- Total raw: 2GB
Step 2: Apply compression ratio (Parquet: 0.3x)
- Decompressed: 2GB * 0.3 = 600MB
Step 3: Column projection (2 of 10 columns)
- Selected data: 600MB * 0.2 = 120MB
Step 4: Filter selectivity (assume 50% of rows)
- After filter: 120MB * 0.5 = 60MB
Step 5: Safety factor (1.5x)
- Final estimate: 60MB * 1.5 = 90MB
Result: Assign worker with at least 90MB free memory
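The same arithmetic as a small function. Format ratios and parameter names follow the description above; treat this as a sketch rather than the actual data_sizer.rs:

```rust
// Illustrative sketch of the estimation formula described above.
enum FileFormat {
    Parquet,
    Csv,
    Json,
}

fn estimate_memory_mb(
    raw_bytes: u64,       // sum of S3 HEAD Content-Lengths
    format: FileFormat,   // drives the compression ratio
    column_fraction: f64, // selected columns / total columns
    selectivity: f64,     // filter selectivity (conservative: 0.5)
) -> u64 {
    let ratio = match format {
        FileFormat::Parquet => 0.3, // midpoint of the 0.2-0.4x range
        FileFormat::Csv => 1.0,
        FileFormat::Json => 0.8,
    };
    let working_set = raw_bytes as f64 * ratio * column_fraction * selectivity;
    let with_safety = working_set * 1.5; // safety factor
    (with_safety / (1024.0 * 1024.0)).ceil() as u64
}

// 2GB of Parquet, 2 of 10 columns, 50% selectivity -> ~90MB, matching the
// worked example: estimate_memory_mb(2 << 30, FileFormat::Parquet, 0.2, 0.5)
```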
stateDiagram-v2
[*] --> Empty
Empty --> Queued: Query Arrives<br/>No Capacity
Empty --> Bypassed: Query Arrives<br/>Has Capacity
Bypassed --> Executing: Dispatch Immediately
Bypassed --> [*]: Query Complete
Queued --> Waiting: Signal HPA<br/>Start Wait Timer
Waiting --> Waiting: More Queries Arrive<br/>FIFO Order
Waiting --> Dispatching: Capacity Available<br/>Worker Freed
Dispatching --> Executing: Dequeue and Execute<br/>First in Queue
Dispatching --> Waiting: More in Queue<br/>Process Next
Executing --> Complete: Query Finishes
Complete --> CheckQueue: Mark Worker Available
CheckQueue --> Dispatching: Queue Not Empty
CheckQueue --> Empty: Queue Empty
state Waiting {
[*] --> CheckTimeout
CheckTimeout --> WithinLimit: Wait under 5min
CheckTimeout --> Timeout: Wait over 5min
Timeout --> [*]: Return Error<br/>to Client
WithinLimit --> [*]
}
state Executing {
[*] --> MonitorMemory
MonitorMemory --> Normal: Memory under 80 percent
MonitorMemory --> NearLimit: Memory over 80 percent
NearLimit --> VPAResize: Trigger Elastic Growth
VPAResize --> MonitorMemory
Normal --> [*]
}
note right of Queued
Queue Metrics:
tavana_query_queue_depth
tavana_query_queue_wait_seconds
tavana_estimated_wait_ms
end note
note right of Executing
Execution Metrics:
tavana_active_queries
tavana_query_duration_seconds
tavana_worker_memory_bytes
end note
stateDiagram-v2
[*] --> Idle: System Startup<br/>2 Workers, No Load
Idle --> LightLoad: 1-5 Queries<br/>Queue Empty
LightLoad --> MediumLoad: 6-15 Queries<br/>Queue Starting
MediumLoad --> HeavyLoad: 16-50 Queries<br/>Queue Building<br/>HPA Scaling Up
HeavyLoad --> Saturated: 50+ Queries<br/>At Max Workers (20)<br/>Queue Deep
Saturated --> HeavyLoad: Load Decreasing<br/>Queue Draining
HeavyLoad --> MediumLoad: Queries Complete<br/>Queue Emptying
MediumLoad --> LightLoad: Cooldown Period<br/>HPA Stable
LightLoad --> Idle: Idle over 5min<br/>HPA Scale Down
state Idle {
Workers --> 2
QueueDepth --> 0
Mode --> Scaling
note right of Workers
Minimum configuration
Ready for burst
end note
}
state LightLoad {
Workers2 --> 2_to_4
QueueDepth2 --> 0_to_2
Mode2 --> Scaling
note right of Workers2
Normal operations
Quick response
end note
}
state MediumLoad {
Workers3 --> 5_to_10
QueueDepth3 --> 3_to_10
Mode3 --> Scaling
note right of Workers3
Active scaling
Queue managed
end note
}
state HeavyLoad {
Workers4 --> 11_to_20
QueueDepth4 --> 10_to_30
Mode4 --> Scaling
note right of Workers4
Peak load
All systems active
end note
}
state Saturated {
Workers5 --> 20_MAX
QueueDepth5 --> 30_plus
Mode5 --> Saturation
note right of Workers5
At capacity ceiling
Conservative mode
Wait times high
end note
}
graph TD
subgraph "Per-Worker Capacity"
W1[Worker Pod]
W1L[Memory Limit: 4GB]
W1U[Memory Used: 1GB]
W1A[Available: 3GB]
W1 --> W1L
W1 --> W1U
W1L --> W1A
W1U --> W1A
end
subgraph "Cluster Capacity"
N1[Node 1<br/>Allocatable 32GB]
N2[Node 2<br/>Allocatable 32GB]
TC[Total Ceiling<br/>64GB]
N1 --> TC
N2 --> TC
end
subgraph "Current Utilization"
W1A --> TotalAvail[Total Available<br/>Across All Workers]
W2A[Worker 2: 2GB] --> TotalAvail
W3A[Worker 3: 1GB] --> TotalAvail
WNA[Worker N: 500MB] --> TotalAvail
TotalAvail --> Calc[Sum Available equals 6.5GB]
end
subgraph "Admission Decision"
QC[Query Cost: 2GB]
Calc --> Compare{2GB within 6.5GB?}
Compare -->|Yes| Admit[Admit Query<br/>Dispatch to Worker 1]
Compare -->|No| CheckCeil{Can Scale?}
CheckCeil -->|Used under Ceiling| ScaleHPA[Signal HPA<br/>Add Workers]
CheckCeil -->|Used at Ceiling| QueueWait[Queue and Wait<br/>Saturation Mode]
end
style Admit fill:#d4edda
style ScaleHPA fill:#fff3cd
style QueueWait fill:#f8d7da
Capacity Check Algorithm:
// Simplified sketch of the admission decision (illustrative, not the exact source)
enum AdmissionDecision {
    Admit,
    QueueAndWait,
}

fn can_admit_query(query_cost_mb: u64) -> AdmissionDecision {
    // 1. Total available capacity across all workers
    let available: u64 = workers.iter()
        .map(|w| w.memory_limit - w.memory_used)
        .sum();

    // 2. Cluster utilization relative to the resource ceiling
    let used: u64 = workers.iter().map(|w| w.memory_used).sum();
    let ceiling: u64 = nodes.iter().map(|n| n.allocatable_memory).sum();
    let utilization = used as f64 / ceiling as f64;

    // 3. Admission decision
    if query_cost_mb <= available {
        AdmissionDecision::Admit // direct admission
    } else if utilization < 0.80 {
        signal_hpa_scale_up(); // headroom left: add workers, queue the query
        AdmissionDecision::QueueAndWait
    } else {
        set_operation_mode(OperationMode::Saturation); // at the ceiling
        AdmissionDecision::QueueAndWait
    }
}

- True streaming server-side cursors (O(n) FETCH performance)
- Delta Lake support with Azure Workload Identity
- Automatic Azure token refresh for long-running sessions
- Tableau Desktop compatibility
- AWS EKS Terraform module
- Query result caching (Redis)
- Enhanced monitoring (distributed tracing with OpenTelemetry)
- GCP GKE Terraform module
- Multi-tenancy (namespace isolation)
- Query optimizer (cost-based query planning)
- Catalog integration (Unity Catalog, Apache Iceberg)
- Query scheduling (cron jobs, event-driven)
- Data lineage tracking
tavana/
├── .github/
│   └── workflows/                  # CI/CD pipelines
│       ├── ci.yaml                 # Build, test, lint
│       ├── release.yaml            # Multi-arch Docker builds, Helm publishing
│       └── security.yaml           # Trivy scanning, cargo audit
│
├── crates/
│   ├── tavana-gateway/             # Gateway service (Rust)
│   │   ├── src/
│   │   │   ├── main.rs             # Entry point
│   │   │   ├── pg_wire.rs          # PostgreSQL wire protocol
│   │   │   ├── cursors.rs          # Server-side cursor support (DECLARE/FETCH/CLOSE)
│   │   │   ├── worker_client.rs    # gRPC client for workers (query + cursor APIs)
│   │   │   ├── query_queue.rs      # FIFO queue with capacity awareness
│   │   │   ├── smart_scaler.rs     # HPA + VPA logic
│   │   │   ├── worker_pool.rs      # Worker discovery & load balancing
│   │   │   ├── data_sizer.rs       # Query cost estimation
│   │   │   └── metrics.rs          # Prometheus metrics
│   │   └── Cargo.toml
│   │
│   ├── tavana-worker/              # Worker service (Rust)
│   │   ├── src/
│   │   │   ├── main.rs             # Entry point
│   │   │   ├── executor.rs         # DuckDB query execution
│   │   │   ├── cursor_manager.rs   # Server-side cursor state (Arrow iterators)
│   │   │   ├── grpc.rs             # gRPC service (ExecuteQuery, DeclareCursor, etc.)
│   │   │   └── streaming.rs        # Result streaming
│   │   └── Cargo.toml
│   │
│   └── tavana-common/              # Shared library
│       ├── src/
│       │   ├── proto/              # gRPC protobuf definitions
│       │   ├── auth.rs             # Authentication helpers
│       │   └── config.rs           # Configuration management
│       └── Cargo.toml
│
├── terraform/
│   └── azure/                      # Azure infrastructure module
│       ├── main.tf                 # AKS cluster, ACR, ADLS Gen2
│       ├── variables.tf
│       ├── outputs.tf
│       └── examples/
│           ├── quickstart/         # Minimal setup
│           └── enterprise/         # Production-grade with HA
│
├── helm/
│   └── tavana/                     # Helm chart
│       ├── Chart.yaml
│       ├── values.yaml             # Default values
│       ├── values-azure.yaml       # Azure-specific overrides
│       └── templates/
│           ├── gateway-deployment.yaml
│           ├── worker-deployment.yaml
│           ├── hpa.yaml            # Horizontal Pod Autoscaler
│           ├── rbac.yaml           # ServiceAccount, Role, RoleBinding
│           ├── networkpolicy.yaml
│           └── ingress.yaml
│
├── gitops-template/                # Customer GitOps repository template
│   ├── apps/
│   │   └── tavana.yaml             # ArgoCD Application
│   └── values/
│       └── overrides.yaml          # Customer-specific overrides
│
├── deploy.sh                       # One-click deployment script
├── Dockerfile.gateway              # Multi-stage build for Gateway
├── Dockerfile.worker               # Multi-stage build for Worker (with DuckDB extensions)
├── README.md                       # This file
└── docs/                           # Documentation
    └── CUSTOMER-DEPLOYMENT-GUIDE.md # Step-by-step deployment
| Variable | Description | Default |
|---|---|---|
| `RUST_LOG` | Log level (trace/debug/info/warn/error) | info |
| `PG_PORT` | PostgreSQL wire protocol port | 5432 |
| `HTTP_PORT` | Health check & metrics port | 8080 |
| `WORKER_SERVICE_NAME` | Kubernetes worker service name | worker |
| `WORKER_SERVICE_PORT` | Worker gRPC port | 50053 |
| `MAX_WORKERS` | Maximum worker pods (HPA limit) | 20 |
| `MIN_WORKERS` | Minimum worker pods (HPA limit) | 2 |
| `QUEUE_TIMEOUT_MS` | Max time a query can wait in queue | 300000 (5 min) |
| `TLS_ENABLED` | Enable TLS/SSL for PostgreSQL connections | false |
| `TAVANA_TLS_ENABLED` | Same as TLS_ENABLED (env var used internally) | false |
| `TAVANA_TLS_CERT_PATH` | Path to TLS certificate (PEM format) | - |
| `TAVANA_TLS_KEY_PATH` | Path to TLS private key (PEM format) | - |
| `TAVANA_TLS_SELF_SIGNED` | Generate self-signed certificate | true |
| `TAVANA_TLS_COMMON_NAME` | Common name for self-signed cert | tavana.local |
| Variable | Description | Default |
|---|---|---|
| `RUST_LOG` | Log level | info |
| `GRPC_PORT` | gRPC server port | 50053 |
| `S3_ENDPOINT` | S3-compatible endpoint (optional) | Auto-detected |
| `S3_BUCKET` | Default S3 bucket (optional) | - |
| `AWS_REGION` | AWS region | us-east-1 |
| `DUCKDB_EXTENSIONS_DIR` | DuckDB extensions path | /home/tavana/.duckdb/extensions |
See helm/tavana/values.yaml for all options.
Key Configurations:
gateway:
replicaCount: 2 # High availability
resources:
requests:
memory: "1Gi"
cpu: "500m"
limits:
memory: "4Gi"
cpu: "2000m"
worker:
replicaCount: 2 # Initial worker count
minReplicas: 1 # HPA minimum
maxReplicas: 20 # HPA maximum
resources:
requests:
memory: "2Gi"
cpu: "500m"
limits:
memory: "12Gi" # Max VPA can resize to
cpu: "3500m"
hpa:
enabled: true
targetCPUUtilizationPercentage: 70
targetMemoryUtilizationPercentage: 80
# TLS/SSL configuration
gateway:
tls:
enabled: true # Enable SSL support
selfSigned: true # Use self-signed cert (dev)
commonName: "tavana.example.com" # CN for self-signed
existingSecret: "" # Use existing K8s secret
existingSecretCertKey: "tls.crt" # Cert key in secret
existingSecretKeyKey: "tls.key" # Private key in secret
objectStorage:
endpoint: "" # e.g., s3.amazonaws.com or minio.local:9000
bucket: "" # Default bucket
region: "us-east-1"
pathStyle: false     # true for MinIO/ADLS

- Customer Deployment Guide - Step-by-step deployment for all scenarios
- Helm Chart Documentation - Helm chart reference
- Terraform Modules - Infrastructure as Code reference
- GitOps Template - ArgoCD setup guide
We welcome contributions! Please:
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
# Install Rust
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
# Install protoc
# macOS:
brew install protobuf
# Ubuntu/Debian:
sudo apt-get install -y protobuf-compiler libprotobuf-dev
# Clone repository
git clone https://github.com/Angelerator/Tavana.git
cd Tavana
# Build
cargo build
# Run tests
cargo test --all
# Run locally (requires Kubernetes)
cargo run --bin tavana-gateway

Apache License 2.0 - see LICENSE file for details.
- DuckDB - The world's fastest OLAP database engine
- Kubernetes - Container orchestration platform
- Rust - Systems programming language for safety and performance
- PostgreSQL - For the wire protocol specification
- Issues: https://github.com/Angelerator/Tavana/issues
- Discussions: https://github.com/Angelerator/Tavana/discussions
- Documentation: https://github.com/Angelerator/Tavana/tree/main/docs
Built with ❤️ in Rust