tsai00/DataInfraPilot

DataInfraPilot

DataInfraPilot is a web application that helps small data engineering teams easily deploy and manage selected data engineering tools on Kubernetes clusters running on cost-effective cloud providers.


Project Structure

DataInfraPilot/
├── backend/      # FastAPI backend API
├── frontend/     # React frontend web app
├── demo/         # Example demo project
├── docker-compose.yml
└── README.md

  • Backend: REST API for orchestration and management
  • Frontend: User interface for deployment and monitoring
  • Demo Project: Example pipeline and usage

Features

  • Kubernetes cluster provisioning
  • Deployment of selected data engineering tools
  • Cluster autoscaling through the Kubernetes Cluster Autoscaler
  • Automated SSL certificate provisioning through cert-manager

Supported Cloud Providers

  • Hetzner

Supported Applications

  • Apache Airflow
  • Apache Spark
  • Grafana

Technology Stack

| Part         | Main Technologies                                |
|--------------|--------------------------------------------------|
| Backend      | Python, FastAPI                                  |
| Frontend     | React, TypeScript, Vite, Tailwind CSS, shadcn-ui |
| Demo Project | Airflow, PostgreSQL, Python, Grafana             |

Running Locally

Manual (separate terminals)

Install the pre-commit Git hooks once per clone:

pre-commit install

Backend

cd backend
uv sync
source .venv/bin/activate
uvicorn src.api.main:app --reload

Frontend

cd frontend
npm install
npm run dev

With Docker Compose

docker-compose up

An SSH key pair is also required for server access. Create one under the ~/.ssh directory (which is mounted into the Docker container); you will be prompted for its path when creating a new cluster.
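A minimal sketch of creating such a key pair with OpenSSH's ssh-keygen. The file name datainfrapilot_ed25519, the Ed25519 key type, the comment string, and the empty passphrase are assumptions made for illustration; the project does not specify them, so adjust them to whatever the cluster creation form accepts.

```shell
# Hypothetical key location; any path under ~/.ssh should work.
KEY_PATH="${KEY_PATH:-$HOME/.ssh/datainfrapilot_ed25519}"

# Create the directory if it is missing, readable only by the current user.
mkdir -p -m 700 "$(dirname "$KEY_PATH")"

# Generate an Ed25519 key pair only if one is not already present.
# -N "" sets an empty passphrase (an assumption); omit it to be prompted.
if [ ! -f "$KEY_PATH" ]; then
  ssh-keygen -q -t ed25519 -f "$KEY_PATH" -N "" -C "datainfrapilot"
fi

# Print the public key's fingerprint to confirm the pair is readable.
ssh-keygen -l -f "$KEY_PATH.pub"
```

The private key is written to the path in KEY_PATH and the public key to the same path with a .pub suffix. The private key path is presumably the one to enter when the application prompts for it.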

Screenshots

Cluster provisioning

[Screenshots: cluster provisioning, 3 images]

Application deployment

[Screenshots: application deployment, 5 images]
