Skip to content

Trujillofa/depotru_database

Repository files navigation

Business Data Analyzer πŸ“Š

Comprehensive business intelligence platform for hardware store operations with AI-powered natural language SQL queries, automated reporting, and interactive visualizations.

Python License Code style: black


✨ Features

πŸ€– AI-Powered Natural Language Queries

  • Ask questions in plain English - "What are my top 10 selling products?"
  • Vanna AI integration with support for:
    • OpenAI GPT-4
    • Grok (xAI) πŸ†•
    • Anthropic Claude
    • Ollama (local, private, free)
  • Auto-generated SQL from natural language
  • Web-based chat interface at http://localhost:8084

πŸ“ˆ Comprehensive Business Analytics

  • Financial metrics (revenue, profit, margins)
  • Customer segmentation (VIP, High Value, Frequent, Regular)
  • Product performance analytics
  • Category-level profitability
  • Inventory velocity tracking
  • Trend analysis and forecasting

🎨 Automated Visualizations

  • Professional PNG reports
  • Interactive dashboards (Streamlit)
  • Category distribution charts
  • Profit margin analysis
  • Revenue breakdowns

πŸ”’ Enterprise-Grade Security

  • Environment-based configuration
  • Secure credential management
  • No hardcoded passwords
  • .env file support

πŸš€ Quick Start

Option 1: Traditional Business Analyzer

# Install dependencies
pip install pymssql python-dotenv matplotlib numpy

# Configure environment
cp .env.example .env
# Edit .env with your database credentials

# Run analysis
python src/business_analyzer_combined.py

Option 2: AI-Powered Natural Language Queries 🌟

Two implementations available (see comparison):

A) Production-Ready Grok (Recommended) 🌟:

# Install
pip install vanna chromadb pyodbc openai waitress python-dotenv pandas

# Configure .env
echo "GROK_API_KEY=xai-your-key" >> .env

# Run
python src/vanna_grok.py
# β†’ http://localhost:8084
# Ask in Spanish: "Top 10 productos mΓ‘s vendidos"

✨ Features:

  • πŸ’° Beautiful number formatting (Colombian pesos: $123.456.789)
  • πŸ€– AI-powered insights (Grok analyzes results and gives recommendations)
  • πŸ‡ͺπŸ‡Έ Spanish-optimized (Colombian business context)
  • πŸ“Š Executive summaries with each query

B) Multi-Provider (Testing):

# Install
pip install vanna chromadb pyodbc openai

# Choose provider
export GROK_API_KEY='xai-your-key'        # Grok (xAI)
# OR export OPENAI_API_KEY='sk-...'       # OpenAI
# OR export ANTHROPIC_API_KEY='sk-ant-'   # Anthropic

# Run
python src/vanna_chat.py
# β†’ http://localhost:8084

Option 3: Interactive Web Dashboard

# Install Streamlit
pip install streamlit pandas plotly

# Run dashboard
streamlit run examples/streamlit_dashboard.py

πŸ“ Project Structure

coding_omarchy/
β”œβ”€β”€ README.md                          # ⭐ You are here
β”œβ”€β”€ .env.example                       # Environment configuration template
β”œβ”€β”€ .gitignore                         # Git exclusions
β”œβ”€β”€ .gitattributes                     # Git attributes
β”œβ”€β”€ requirements.txt                   # Python dependencies
β”‚
β”œβ”€β”€ src/                              # πŸ’» Source Code
β”‚   β”œβ”€β”€ __init__.py
β”‚   β”œβ”€β”€ business_analyzer_combined.py # Main analyzer (traditional)
β”‚   β”œβ”€β”€ vanna_grok.py                 # πŸ†• AI chat (Grok-optimized, Spanish)
β”‚   β”œβ”€β”€ vanna_chat.py                 # AI chat (multi-provider support)
β”‚   β”œβ”€β”€ config.py                     # Configuration management
β”‚   └── utils/                        # Utility functions
β”‚       └── __init__.py
β”‚
β”œβ”€β”€ tests/                            # πŸ§ͺ Tests
β”‚   β”œβ”€β”€ __init__.py
β”‚   β”œβ”€β”€ test_business_metrics.py      # Business logic tests
β”‚   └── test_metabase_connection.py   # Database connection tests
β”‚
β”œβ”€β”€ docs/                             # πŸ“š Documentation
β”‚   β”œβ”€β”€ START_HERE.md                 # ⭐ Start here!
β”‚   β”œβ”€β”€ ROADMAP.md                    # πŸ—ΊοΈ Strategic roadmap & what's next
β”‚   β”œβ”€β”€ VANNA_COMPARISON.md           # πŸ†• Vanna implementations comparison
β”‚   β”œβ”€β”€ VANNA_BEAUTIFUL_OUTPUT.md     # 🎨 Beautiful output examples
β”‚   β”œβ”€β”€ VANNA_SETUP.md                # Vanna AI setup guide
β”‚   β”œβ”€β”€ ARCHITECTURE.md               # Technical architecture
β”‚   β”œβ”€β”€ ANALYSIS_SUMMARY.md           # Executive summary
β”‚   β”œβ”€β”€ IMPROVEMENT_ANALYSIS.md       # Detailed analysis
β”‚   β”œβ”€β”€ QUICK_START_IMPROVEMENTS.md   # Fast-track guide
β”‚   β”œβ”€β”€ P0_FIXES_APPLIED.md           # Critical fixes
β”‚   β”œβ”€β”€ METABASE_TROUBLESHOOTING.md   # Metabase guide
β”‚   β”œβ”€β”€ SECURITY.md                   # Security guidelines
β”‚   β”œβ”€β”€ CONTRIBUTING.md               # Development guide
β”‚   └── setup_instructions.md         # Setup instructions
β”‚
β”œβ”€β”€ examples/                         # πŸ’‘ Examples
β”‚   β”œβ”€β”€ improvements_p0.py            # Critical bug fixes
β”‚   β”œβ”€β”€ pandas_approach.py            # Modern Pandas implementation
β”‚   └── streamlit_dashboard.py        # Web dashboard
β”‚
└── data/                            # πŸ“Š Data Files
    └── database_explained.json       # Database schema documentation

🎯 Choose Your Workflow

For Business Users (No Coding Required)

# Option A: Ask questions in plain English
python src/vanna_chat.py
# "What are my top customers this month?"

# Option B: Use Metabase (Docker)
docker run -d -p 3000:3000 metabase/metabase
# Point & click dashboards

For Data Analysts

# Interactive Streamlit dashboard
streamlit run examples/streamlit_dashboard.py
# Real-time filtering, interactive charts

For Developers

# Traditional script-based analysis
python src/business_analyzer_combined.py --limit 5000
# Or use Pandas approach (10-100x faster)
python examples/pandas_approach.py

πŸ€– Vanna AI - Natural Language SQL

Supported AI Providers

Provider Cost Speed Quality Setup Difficulty
OpenAI GPT-4 $$ Fast ⭐⭐⭐⭐⭐ Easy
Grok (xAI) πŸ†• $$ Fast ⭐⭐⭐⭐ Easy
Anthropic Claude $$ Fast ⭐⭐⭐⭐⭐ Easy
Ollama (Local) Free Medium ⭐⭐⭐ Medium

Example Questions You Can Ask

πŸ’¬ "What are my top 10 selling products?"
πŸ’¬ "Show me revenue by category this month"
πŸ’¬ "Which customers have the highest order values?"
πŸ’¬ "What's my profit margin by product?"
πŸ’¬ "Show me monthly revenue trends"
πŸ’¬ "Which products have low profit margins?"
πŸ’¬ "Compare this month's revenue to last month"
πŸ’¬ "Show me my best customers in the last 90 days"

Setup Vanna with Grok (New!)

# Install dependencies
pip install vanna chromadb pyodbc

# Set your Grok API key
export GROK_API_KEY='xai-your-grok-api-key'

# Edit src/vanna_chat.py
USE_GROK = True
USE_OPENAI = False
USE_OLLAMA = False
USE_ANTHROPIC = False

# Run
python src/vanna_chat.py

πŸ“Š Traditional Business Analyzer

Command Line Options

# Basic analysis (default 1000 records)
python src/business_analyzer_combined.py

# Analyze more records
python src/business_analyzer_combined.py --limit 5000

# Analyze specific date range
python src/business_analyzer_combined.py \
  --start-date 2025-01-01 \
  --end-date 2025-10-31

# Skip re-analysis, just regenerate visualizations
python src/business_analyzer_combined.py --skip-analysis

Output Files

All reports saved to ~/business_reports/ (configurable):

  • analysis_comprehensive_YYYY-MM-DD_to_YYYY-MM-DD.json
  • business_analysis_report_YYYYMMDD_HHMMSS.png

πŸ”¬ Analysis Scripts

The repository includes specialized analysis and reporting scripts in /scripts/:

Running Analysis Scripts

# SIKA product analysis
python scripts/analysis/sika_analysis.py

# General category analysis
python scripts/analysis/run_analysis.py

# Database investigation
python scripts/analysis/investigate_deposito.py

Generating Reports

# Generate English SIKA report
python scripts/reports/generate_sika_report.py

# Generate Spanish SIKA report
python scripts/reports/generate_sika_report_es.py

# Generate general analysis report
python scripts/reports/generate_report.py

All generated reports are saved to the /reports/ directory.


βš™οΈ Configuration

Environment Variables (.env file)

# Database Connection
DB_HOST=your-server-host
DB_PORT=1433
DB_USER=your-username
DB_PASSWORD=your-password
DB_NAME=SmartBusiness

# AI Providers (choose one)
OPENAI_API_KEY=sk-your-openai-key
GROK_API_KEY=xai-your-grok-key         # πŸ†• Grok support
ANTHROPIC_API_KEY=sk-ant-your-key

# Output Configuration
OUTPUT_DIR=~/business_reports
REPORT_DPI=300
DEFAULT_LIMIT=1000

See .env.example for all options.


πŸ”’ Security

IMPORTANT: Never commit credentials to version control!

βœ… Best Practices:

  • Use .env files (already in .gitignore)
  • Use environment variables in production
  • Rotate credentials regularly
  • Use least-privilege database accounts

πŸ“– See docs/SECURITY.md for detailed guidelines.


πŸ“š Documentation

New to this project? Start here:

  1. docs/START_HERE.md - Quick overview and path selection
  2. docs/VANNA_SETUP.md - AI natural language setup (includes Grok!)
  3. docs/ANALYSIS_SUMMARY.md - Executive summary

Detailed guides:


πŸ§ͺ Testing

Quick Testing

# Quick start - Run basic tests (no dependencies required)
python run_tests.py --quick

# Run all available tests
python run_tests.py

# Run with coverage report
python run_tests.py --cov

# Using pytest directly
pytest tests/ -v

# Run specific test file
pytest tests/test_basic.py -v

Multi-Version Testing with Anaconda πŸ†•

Test on Python 3.8, 3.9, 3.10, and 3.11:

# Create Conda environment
conda env create -f environment.yml
conda activate business-analyzer

# Run tests
pytest tests/ -v --cov=src

# Test on specific Python version
conda create -n test-py310 python=3.10 -y
conda activate test-py310
pip install -r requirements.txt
pytest tests/ -v

πŸ“– Full guide: docs/ANACONDA_TESTING.md


πŸ› οΈ Development

Code Style

# Format code
black src/ tests/ examples/

# Lint
flake8 src/ tests/

# Type checking
mypy src/

# Sort imports
isort src/ tests/ examples/

Running Examples

# Critical P0 fixes
python examples/improvements_p0.py

# Modern Pandas approach (10-100x faster)
python examples/pandas_approach.py

# Interactive Streamlit dashboard
streamlit run examples/streamlit_dashboard.py

AI Agent Development πŸ€–

For AI agents working on this project:

πŸ“˜ docs/AI_AGENT_INSTRUCTIONS.md - Complete guide for developing, debugging, refactoring, and fixing this project

πŸ€– .github/agents/business-data-analyzer.agent.md - Custom GitHub Copilot agent specialized for this project

Using the Custom Agent:

In GitHub Copilot Chat or comments, use @business-data-analyzer to get specialized help:

@business-data-analyzer help me format this currency in Colombian pesos
@business-data-analyzer review this code for security issues
@business-data-analyzer write tests for this business metric function

Quick reference:

  • Architecture: Vanna AI + Grok + MSSQL + Colombian formatting
  • Branch naming: claude/feature-name-SessionID (MUST start with claude/)
  • Security: Always use require_env(), never hardcode credentials
  • Formatting: Use format_number() for Colombian pesos ($1.234.567) and percentages (45,6%)
  • Testing: Add tests for every feature, maintain 80%+ coverage
  • Language: Spanish for user messages, English for code/docs

πŸ“‹ .github/QUICK_REFERENCE.md - Quick commands and checklists


πŸ“Š What It Analyzes

Category Metrics
Financial Revenue (with/without IVA), profit margins, gross profit, average order value
Customers Segmentation (VIP, High Value, Frequent, Regular), top customers, concentration
Products Top sellers, profitability, star products, underperformers
Categories Category revenue/margins, subcategories, risk assessment
Inventory Fast movers, slow movers, velocity analysis
Trends Monthly trends, seasonal patterns, category distribution

πŸ› Troubleshooting

"No valid database configuration found"

β†’ Check .env file exists and has correct credentials

"Matplotlib not available"

β†’ pip install matplotlib numpy

Vanna AI not connecting

β†’ Check API key is set: echo $OPENAI_API_KEY or echo $GROK_API_KEY

Metabase showing wrong data

β†’ See docs/METABASE_TROUBLESHOOTING.md


πŸš€ Performance

Approach Lines of Code Performance Use Case
Current Script 1,492 Baseline Works today
Pandas Approach 200 10-100x faster Best for developers
Streamlit 300 10x faster Best for teams
Metabase 0 (GUI) Fast Best for business users
Vanna AI πŸ†• 0 (Natural Language) Real-time Best for everyone

🀝 Contributing

Contributions welcome! Please:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Make your changes
  4. Run tests: pytest tests/
  5. Format code: black src/ tests/
  6. Submit a pull request

See docs/CONTRIBUTING.md for detailed guidelines.


πŸ“„ License

[Specify your license here]


πŸ™ Acknowledgments

  • Built for hardware store business intelligence
  • Designed for SmartBusiness ERP integration
  • Compatible with Magento e-commerce
  • Vanna AI integration for natural language SQL
  • Grok (xAI) support πŸ†•

πŸ“ž Support

  • πŸ“– Documentation: See docs/ directory
  • πŸ› Issues: Open a GitHub issue
  • πŸ’‘ Questions: See docs/START_HERE.md

🎯 Quick Links

I want to... Go here...
See what's next / roadmap πŸ—ΊοΈ docs/ROADMAP.md πŸ†•
Run tests πŸ§ͺ python run_tests.py --quick + docs/TESTING.md
Use Grok AI in Spanish πŸ†• src/vanna_grok.py + docs/VANNA_COMPARISON.md
See beautiful output examples 🎨 docs/VANNA_BEAUTIFUL_OUTPUT.md
Ask questions in plain English src/vanna_chat.py + docs/VANNA_SETUP.md
Get started quickly docs/START_HERE.md
Compare Vanna implementations docs/VANNA_COMPARISON.md
Run traditional analyzer python src/business_analyzer_combined.py
Build web dashboard streamlit run examples/streamlit_dashboard.py
Fix critical bugs examples/improvements_p0.py
Understand the code docs/IMPROVEMENT_ANALYSIS.md
Secure my deployment docs/SECURITY.md

⭐ Star this repo if you find it useful!

πŸš€ Ready to get started? β†’ docs/START_HERE.md


Note: This tool processes business data. Ensure compliance with data privacy regulations (GDPR, CCPA, etc.) when handling customer information.

About

Connecting with database and analyze data

Resources

Contributing

Security policy

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 5

Languages