An example of how to build an agent in Tower, in this case a data-engineering-centric agent.

tower/data-engineering-agent-example

Data Engineering Agent

An AI-powered data engineering assistant with an interactive chat interface. This Tower app provides expert guidance on data pipelines, ETL/ELT workflows, database design, query optimization, and more.

Features

  • Interactive Chat Interface: Modern, responsive web UI with real-time streaming responses
  • Claude AI Integration: Powered by Claude Sonnet 4.5 for expert data engineering assistance
  • Code Execution: Write and execute Python code directly in the chat
  • Tower Integration: Deploy and orchestrate applications using Tower
  • Workspace Management: Persistent file system for building and testing code
  • Package Installation: Install Python packages on-demand
  • Comprehensive Expertise: Covers data warehousing, lakehouse architecture, orchestration, streaming, and more
  • Production-Ready Advice: Get practical, actionable recommendations with working code examples

Topics Covered

The agent can help with:

  • Data Pipelines: ETL/ELT design, orchestration (Airflow, Dagster, Prefect, dbt)
  • Database Design: Schema design, optimization, normalization, dimensional modeling
  • Query Optimization: SQL performance tuning, indexing strategies
  • Data Warehouses & Lakehouses: Snowflake, BigQuery, Databricks, Apache Iceberg
  • Stream Processing: Kafka, Flink, Spark Streaming
  • Data Quality: Testing strategies, validation, monitoring
  • Python Libraries: pandas, polars, PyArrow, DuckDB, SQLAlchemy
  • Cloud Platforms: AWS, GCP, Azure data services
  • Data Modeling: Kimball, Data Vault, normalized schemas

Agent Tools

The agent has tools that let it build and test solutions, not just describe them:

Code Execution Tools

  • write_python_file: Create Python files in a workspace
  • execute_python: Run Python code or scripts with timeout protection
  • read_file: Read file contents from the workspace
  • list_files: Browse workspace directory structure
  • install_package: Install Python packages via pip
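
As an illustration of what "timeout protection" for execute_python could look like, here is a minimal sketch built on the standard library. The function name, return shape, and 30-second default are assumptions for illustration; the actual implementation in tools.py may differ.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def execute_python(code: str, workspace: Path, timeout_s: float = 30.0) -> dict:
    """Run `code` in a child interpreter, stopping it after `timeout_s` seconds."""
    workspace = workspace.resolve()
    workspace.mkdir(parents=True, exist_ok=True)
    # Write the snippet to a temporary script inside the workspace.
    with tempfile.NamedTemporaryFile(
        "w", suffix=".py", dir=workspace, delete=False
    ) as handle:
        handle.write(code)
        script = Path(handle.name)
    try:
        result = subprocess.run(
            [sys.executable, str(script)],
            cwd=workspace,
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
        return {
            "ok": result.returncode == 0,
            "stdout": result.stdout,
            "stderr": result.stderr,
            "timed_out": False,
        }
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising TimeoutExpired.
        return {"ok": False, "stdout": "", "stderr": "", "timed_out": True}
    finally:
        # Remove the temporary script whether or not the run succeeded.
        script.unlink(missing_ok=True)
```

Running the code in a separate process, rather than with exec() inside the server, means a runaway script can be killed without taking the chat backend down with it.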

Tower Integration Tools (Python API)

  • tower_deploy: Deploy Tower applications from workspace using Tower Python API (creates TAR package and uploads)
  • tower_run: Execute Tower applications on the platform using Tower Python API
  • tower_list_apps: View all deployed Tower apps using Tower Python API

The agent uses the Tower Python API bindings for all Tower operations, so it needs no CLI commands. Deployment builds the TAR package in memory and uploads it through the API.

Example Interactions

Build and Test Code:

User: "Create a script that extracts data from a CSV and loads it into a database"
Agent: [Writes Python file, executes it, shows output]
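
For a request like this one, the script the agent writes might resemble the following sketch. It is only an illustration: it assumes a CSV with a header row, targets SQLite so that it needs nothing beyond the standard library, and stores every column as TEXT. The function name is hypothetical.

```python
import csv
import sqlite3
from pathlib import Path


def load_csv_to_sqlite(csv_path: Path, db_path: Path, table: str) -> int:
    """Load every row of a headered CSV into a SQLite table; return the row count."""
    with csv_path.open(newline="") as handle:
        reader = csv.reader(handle)
        header = next(reader)
        rows = [row for row in reader if row]  # skip blank lines
    # Identifiers cannot be bound as parameters, so quote them explicitly.
    quoted_table = '"' + table.replace('"', '""') + '"'
    columns = ", ".join('"' + c.replace('"', '""') + '" TEXT' for c in header)
    placeholders = ", ".join("?" for _ in header)
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(f"CREATE TABLE IF NOT EXISTS {quoted_table} ({columns})")
            conn.executemany(
                f"INSERT INTO {quoted_table} VALUES ({placeholders})", rows
            )
    finally:
        conn.close()
    return len(rows)
```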

Deploy to Tower:

User: "Create a Tower app that runs daily to sync data"
Agent: [Creates Towerfile, task.py, requirements.txt, deploys to Tower]

Install Dependencies:

User: "Use pandas to analyze this data"
Agent: [Installs pandas if needed, writes analysis code, runs it]

The agent uses Tower only for deploying and orchestrating applications. For quick tests and prototyping, it uses the execute_python tool.

Setup

Prerequisites

  • Python 3.11 or higher
  • Anthropic API key
  • Tower account with API access (the Tower Python package is included in dependencies)

Installation

  1. Navigate to the app directory:
cd data-engineering-agent
  2. Install dependencies:
uv pip install -e .

Configuration

Anthropic API Key: Set as a Tower secret for the agent to function:

# Add the secret to Tower
tower secrets create ANTHROPIC_API_KEY "your-api-key-here"

Tower API Authentication: The agent's tools use the Tower Python API, which authenticates via environment variables:

  • TOWER_API_KEY: Your Tower API key for deploying and running apps
  • TOWER_URL: Tower API URL (defaults to https://api.tower.dev)
  • TOWER_ENVIRONMENT: Environment to use (defaults to "default")

When running locally, set at least TOWER_API_KEY in your environment (the other two variables have defaults):

export TOWER_API_KEY="your-tower-api-key"

When deployed on Tower, the platform automatically provides authentication context.

Local Development

Run locally with Tower:

tower run --local

The app will start on http://localhost:50051

Deployment

Deploy to Tower

  1. Deploy the app:
tower deploy
  2. Enable external accessibility:

    • Go to the Tower UI
    • Navigate to the app settings
    • Toggle "External Accessibility" to ON
  3. Access your app:

    • A unique URL will be generated for your app
    • Visit the URL to start chatting with the agent
    • The app will automatically start a run when you visit

Environment Variables

The app uses the following environment variables:

  • ANTHROPIC_API_KEY (required): Your Anthropic API key for Claude access
  • TOWER_API_KEY (optional): Tower API key for deploying/running apps via agent tools
  • TOWER_URL (optional): Tower API URL (defaults to https://api.tower.dev)
  • TOWER_ENVIRONMENT (optional): Tower environment (defaults to "default")
  • PORT (optional): Port to run the server on (defaults to 50051)
  • TOWER__HOSTNAME (auto-set): The hostname assigned by Tower

Agent Deployment Capability

The agent itself can deploy Tower apps programmatically. When you ask it to create a Tower app, it works through these steps:

  1. Creates files: Writes Towerfile, task.py, requirements.txt, etc.
  2. Packages: Creates a TAR.GZ archive with MANIFEST
  3. Deploys: Uploads via Tower Python API using TOWER_API_KEY
  4. Returns status: Provides app name and version number

This allows the agent to build and deploy production apps entirely through the chat interface.
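
The packaging step can be sketched with the standard library alone. The MANIFEST layout below (a JSON list of file names) is a placeholder assumption, since Tower's real manifest format is not described here, and the upload call is left out rather than guessed at.

```python
import io
import json
import tarfile
from pathlib import Path


def build_package(app_dir: Path) -> bytes:
    """Return a gzip-compressed TAR of `app_dir`, built entirely in memory."""
    files = sorted(p for p in app_dir.rglob("*") if p.is_file())
    names = [p.relative_to(app_dir).as_posix() for p in files]
    # Hypothetical manifest layout; the real MANIFEST schema may differ.
    manifest = json.dumps({"files": names}).encode("utf-8")
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        for path, name in zip(files, names):
            archive.add(str(path), arcname=name)
        # Add the manifest from bytes, without writing it to disk first.
        info = tarfile.TarInfo(name="MANIFEST")
        info.size = len(manifest)
        archive.addfile(info, io.BytesIO(manifest))
    return buffer.getvalue()
```

Building the archive in a BytesIO buffer avoids leaving temporary package files in the workspace; the resulting bytes can be handed straight to whatever upload call the API client provides.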

Usage

Starting a Conversation

Once the app is running:

  1. Open the app URL in your browser
  2. You'll see a welcome screen with suggested topics
  3. Click a suggestion or type your own question
  4. The agent will respond with detailed, practical advice

Example Questions

Pipeline Design:

  • "How do I design a scalable ETL pipeline for processing millions of records daily?"
  • "What's the best way to orchestrate dependencies between data tasks?"

Query Optimization:

  • "My PostgreSQL query is slow. How can I optimize it?"
  • "What indexes should I create for a large fact table?"

Data Modeling:

  • "How should I model customer data using dimensional modeling?"
  • "What's the difference between star schema and snowflake schema?"

Data Quality:

  • "What are best practices for data quality testing in a data warehouse?"
  • "How do I implement data validation in my pipeline?"

Tips

  • Be specific about your requirements and constraints
  • Mention your tech stack (databases, tools, cloud providers)
  • Ask follow-up questions to dive deeper
  • Request code examples when helpful

Architecture

Backend

  • FastAPI: Web framework handling HTTP requests and serving static files
  • Anthropic SDK: Integration with Claude AI for intelligent responses
  • Tower Python API: Direct programmatic access to Tower platform for running and managing apps
  • Streaming API: Real-time response streaming for better UX
  • uvicorn: ASGI server for production deployment

Frontend

  • Vanilla JavaScript: No framework dependencies, lightweight and fast
  • Server-Sent Events (SSE): Streaming responses from the API
  • Responsive Design: Works on desktop and mobile devices
  • Dark Theme: Easy on the eyes for extended usage

API Endpoints

  • GET /: Serves the chat interface
  • GET /health: Health check endpoint
  • POST /api/chat: Chat endpoint with streaming support
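
To make the streaming behavior of POST /api/chat concrete, here is a sketch of how text chunks can be framed as Server-Sent Events. The JSON payload shape ("type", "text", and the final "done" event) is an assumption for illustration; the frontend in app.js may expect a different format.

```python
import json
from typing import Iterable, Iterator


def sse_events(chunks: Iterable[str]) -> Iterator[str]:
    """Wrap text chunks as Server-Sent Events frames."""
    for chunk in chunks:
        # JSON-encode so newlines inside a chunk cannot break the frame.
        payload = json.dumps({"type": "text", "text": chunk})
        yield f"data: {payload}\n\n"
    # Hypothetical end-of-stream marker so the client knows when to stop.
    yield 'data: {"type": "done"}\n\n'
```

In a FastAPI app, a generator like this could be returned as StreamingResponse(sse_events(...), media_type="text/event-stream"), with the chunks coming from the model's streamed output.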

Development

File Structure

data-engineering-agent/
├── Towerfile              # Tower app configuration
├── pyproject.toml         # Python dependencies
├── main.py                # FastAPI application
├── tools.py               # Tool definitions and executor
├── system_prompt.md       # Agent system prompt (easily editable)
├── README.md              # This file
└── static/
    ├── index.html         # Chat UI
    └── app.js             # Frontend logic

Making Changes

  1. Edit the files locally
  2. Test with tower run --local
  3. Deploy with tower deploy
  4. Changes will be live after redeployment

Customization

Modify the System Prompt: Edit system_prompt.md to customize the agent's behavior, expertise areas, and instructions. This file is loaded at startup, so changes require redeployment.
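
A sketch of the load-at-startup pattern this implies. The fallback prompt and the function name are assumptions for illustration; main.py may simply read the file directly.

```python
from pathlib import Path

# Hypothetical fallback, used only if the prompt file is missing or empty.
DEFAULT_PROMPT = "You are a data engineering assistant."


def load_system_prompt(path: Path = Path("system_prompt.md")) -> str:
    """Read the prompt file; fall back to a default if it is missing or empty."""
    try:
        text = path.read_text(encoding="utf-8").strip()
    except FileNotFoundError:
        return DEFAULT_PROMPT
    return text or DEFAULT_PROMPT


# Read once at import time, which is why edits need a restart or redeploy.
SYSTEM_PROMPT = load_system_prompt()
```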

Update the UI: Modify static/index.html and static/app.js to change the appearance or add features.

Change the Model: Update the model parameter in main.py to use different Claude models (e.g., Claude Opus for more complex reasoning).

Add or Modify Tools: Edit tools.py to add new capabilities or modify existing tool behavior.
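
Adding a tool usually means two pieces: a schema the model sees and a handler the executor calls. The sketch below uses the name / description / input_schema shape that the Anthropic Messages API accepts for tools; the word_count tool, the HANDLERS registry, and execute_tool are hypothetical names, and tools.py may be structured differently.

```python
from typing import Any, Callable

# Tool schema in the shape the Anthropic Messages API expects in `tools`.
WORD_COUNT_TOOL = {
    "name": "word_count",  # hypothetical example tool
    "description": "Count the words in a piece of text.",
    "input_schema": {
        "type": "object",
        "properties": {"text": {"type": "string"}},
        "required": ["text"],
    },
}


def word_count(text: str) -> str:
    """Handler for the example tool; results are returned as strings."""
    return str(len(text.split()))


# Maps tool names to the functions that implement them.
HANDLERS: dict[str, Callable[..., str]] = {"word_count": word_count}


def execute_tool(name: str, tool_input: dict[str, Any]) -> str:
    """Dispatch a tool call by name; report failures as text instead of raising."""
    handler = HANDLERS.get(name)
    if handler is None:
        return f"Error: unknown tool '{name}'"
    try:
        return handler(**tool_input)
    except Exception as exc:  # surface the error so the model can react to it
        return f"Error: {exc}"
```

Returning errors as strings rather than raising lets the failure be sent back to the model as a tool result, so it can correct its input and retry.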

Troubleshooting

API Key Issues

If you see "API key not configured":

  • Ensure ANTHROPIC_API_KEY is set as a Tower secret
  • Redeploy the app after adding the secret

App Not Accessible

If the external URL doesn't work:

  • Check that "External Accessibility" is enabled in app settings
  • Wait a few moments for the run to start
  • Check the Tower logs for any startup errors

Slow Responses

  • The first response may take a few seconds to begin
  • Subsequent responses should be faster with streaming
  • Consider using a different model if speed is critical

Contributing

This is a Tower Apps repository. To contribute:

  1. Make your changes in a new branch
  2. Test thoroughly with tower run --local
  3. Submit a pull request with a clear description

License

This app is part of the Tower Apps collection and follows the repository's license.

Support

For issues or questions:

  • Check the Tower documentation: https://docs.tower.dev
  • Review the Tower examples in the parent repository
  • Contact the Tower team for platform-specific issues
