RegLex AI - SEBI Compliance Verification System

A comprehensive AI-powered legal document compliance verification system built for SEBI (Securities and Exchange Board of India) regulations with real-time GCP integration. The system analyzes legal clauses in documents stored in Google Cloud Storage and performs live compliance verification using multiple LLM providers and advanced document processing.

🚀 Features

✅ Real-time GCP Integration - Live document storage and retrieval from Google Cloud Storage
✅ FastAPI Backend - Python FastAPI with full CORS support and auto-reload
✅ Multi-LLM Support - Claude, Gemini, OpenAI, and Mistral integration with fallback
✅ Real-time Document Analysis - Live compliance analysis using Python processing pipeline
✅ Advanced Document Processing - PDF text extraction and clause segmentation
✅ GCP Document Storage - Secure document storage with metadata management
✅ Live Dashboard Updates - Real-time statistics from GCP-stored documents
✅ Risk Assessment Engine - Automated categorization and scoring of compliance risks
✅ Modern UI - Next.js 14 with TypeScript, Tailwind CSS, and Shadcn UI
✅ Real-time Health Monitoring - Backend connectivity and performance tracking
✅ Export Functionality - Export compliance reports in JSON, CSV, and PDF formats
✅ Performance Monitoring - Real-time system performance and memory usage tracking

🏗️ System Architecture

┌─────────────────┐    HTTP/REST API    ┌──────────────────┐    ┌──────────────────┐
│   Next.js 14    │◄───────────────────►│   FastAPI        │◄──►│ Google Cloud     │
│   Frontend      │     CORS Enabled    │   Backend        │    │ Storage          │
│                 │                     │                  │    │                  │
│  - TypeScript   │                     │  - Python 3.11+  │    │  - Documents     │
│  - Tailwind CSS │                     │  - Gemini API    │    │  - Metadata      │
│  - Shadcn UI    │                     │  - PDF Processing│    │  - Analysis      │
│  - React Query  │                     │  - ML Pipeline   │    │  - Results       │
└─────────────────┘                     └──────────────────┘    └──────────────────┘
         │                                       │                        │
         │                                       │                        │
         ▼                                       ▼                        ▼
┌─────────────────┐                     ┌──────────────────┐    ┌──────────────────┐
│  Browser/Client │                     │  Processing      │    │  Live Data       │
│  - File Upload  │                     │  Pipeline        │    │  Storage         │
│  - Real-time UI │                     │  - Clause Extract│    │  - Real Metrics  │
│  - Live Updates │                     │  - Risk Analysis │    │  - Compliance    │
└─────────────────┘                     │  - LLM Verify    │    │  - Statistics    │
                                        └──────────────────┘    └──────────────────┘
                                                │
                                                ▼
                                     ┌──────────────────┐
                                     │  External APIs   │
                                     │  - Gemini AI     │
                                     │  - Claude        │
                                     │  - OpenAI        │
                                     │  - Mistral       │
                                     └──────────────────┘

🛠️ Tech Stack

Frontend

Framework: Next.js 14 with App Router
Language: TypeScript
Styling: Tailwind CSS
Components: Shadcn UI (Radix UI primitives)
State Management: React Query (TanStack Query)
Icons: Lucide React
Charts: Recharts
Animations: GSAP
Forms: React Hook Form with Zod validation

Backend

Framework: FastAPI
Language: Python 3.11+
AI Integration: Google Gemini API
Document Processing: PDF text extraction
CORS: Configured for frontend communication
API Documentation: Automatic OpenAPI/Swagger docs

Development Tools

Package Manager: npm/yarn/pnpm
Linting: ESLint
Formatting: Prettier
Type Checking: TypeScript
Testing: Jest + Playwright

📋 Prerequisites

Node.js 18+ and npm/yarn/pnpm
Python 3.11+
Gemini API Key (for document processing)
Google Cloud project with a Cloud Storage bucket and a service account key (see GCP Setup)

🚀 Quick Start

1. Clone the Repository

git clone <repository-url>
cd Sebi-Hack-Final

2. Backend Setup

# Navigate to backend directory
cd Backend

# Create virtual environment (recommended)
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Create .env file with GCP and API keys
echo "GEMINI_API_KEY=your_gemini_api_key_here" > src/.env
echo "GEMINI_API_KEY_2=your_backup_gemini_key_here" >> src/.env
echo "GCS_BUCKET_NAME=your_gcp_bucket_name" >> src/.env
echo "GOOGLE_APPLICATION_CREDENTIALS=path/to/your/gcp-credentials.json" >> src/.env

# Start the FastAPI server with auto-reload
python -m uvicorn src.pipeline.run_pipeline:app --host 127.0.0.1 --port 8000 --reload
# Server runs on http://127.0.0.1:8000

3. Frontend Setup

# Navigate to frontend directory (in new terminal)
cd Frontend

# Install dependencies
npm install

# Create environment file
cp .env.example .env.local

# Update .env.local with your settings
echo "NEXT_PUBLIC_API_URL=http://127.0.0.1:8000" >> .env.local
echo "NEXT_PUBLIC_USE_MOCK_API=false" >> .env.local

# Start the development server
npm run dev
# Frontend runs on http://localhost:3001

4. Verify Setup

Backend Health: Visit http://127.0.0.1:8000/health
API Documentation: Visit http://127.0.0.1:8000/docs
Frontend: Visit http://localhost:3001

🎯 Usage

Document Upload and Real-time Analysis

Navigate to Dashboard: Go to http://localhost:3001/dashboard
Upload Document: Drag and drop a PDF file or click to browse
Real-time Processing: Document is stored in GCP and processed immediately
Live Analysis: Real-time compliance analysis using Python pipeline
View Results: Interactive dashboard with live GCP data updates
Document Analysis: Click on any document for detailed clause-by-clause analysis
Export Reports: Download compliance reports in JSON, CSV, and PDF formats

API Integration

The system provides REST APIs for integration:

// Health check
GET /health

// Dashboard endpoints (real GCP data)
GET /api/dashboard/overview
GET /api/dashboard/documents
GET /api/dashboard/analytics
GET /api/dashboard/notifications
GET /api/dashboard/timeline
GET /api/dashboard/analysis/{document_id}

// Document upload
POST /upload-pdf/
Content-Type: multipart/form-data
- file: PDF file
- lang: Language code (default: "en")

// API information
GET /
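
As a rough illustration of calling these endpoints from a script, the sketch below uses only the Python standard library. It assumes that `file` and `lang` are sent as multipart form fields (as the listing above suggests) and that `/upload-pdf/` answers with JSON; the function names `build_multipart`, `check_health`, and `upload_pdf` are hypothetical helpers, not part of the project.

```python
import json
import urllib.request
import uuid
from pathlib import Path


def build_multipart(fields: dict[str, str], file_field: str,
                    file_name: str, file_bytes: bytes) -> tuple[bytes, str]:
    """Encode text fields plus one PDF as multipart/form-data.

    Returns the request body and the matching Content-Type header value.
    """
    boundary = uuid.uuid4().hex
    parts: list[bytes] = []
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"'
            f"\r\n\r\n{value}\r\n".encode()
        )
    parts.append(
        f'--{boundary}\r\nContent-Disposition: form-data; name="{file_field}"; '
        f'filename="{file_name}"\r\nContent-Type: application/pdf\r\n\r\n'.encode()
    )
    parts.append(file_bytes)
    parts.append(f"\r\n--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def check_health(base_url: str) -> int:
    """Return the HTTP status code of GET /health."""
    with urllib.request.urlopen(f"{base_url}/health", timeout=10) as resp:
        return resp.status


def upload_pdf(base_url: str, pdf_path: str, lang: str = "en") -> dict:
    """POST a PDF to /upload-pdf/ and return the decoded response.

    Assumption: the backend replies with a JSON body.
    """
    path = Path(pdf_path)
    body, content_type = build_multipart(
        {"lang": lang}, "file", path.name, path.read_bytes()
    )
    request = urllib.request.Request(
        f"{base_url}/upload-pdf/",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    # 300 s mirrors NEXT_PUBLIC_API_TIMEOUT=300000 (milliseconds) used by the frontend.
    with urllib.request.urlopen(request, timeout=300) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the backend running locally, `check_health("http://127.0.0.1:8000")` followed by `upload_pdf("http://127.0.0.1:8000", "sample.pdf")` would exercise both calls; `sample.pdf` is a placeholder file name.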

Frontend Components

Key components available for development:

// Document upload with progress
import { FileUpload } from '@/features/document-upload/components/FileUpload'

// Compliance dashboard
import { ComplianceChart } from '@/features/compliance-dashboard/components/ComplianceChart'

// Backend status monitoring
import { BackendStatus } from '@/components/ui/backend-status'

// FastAPI service integration
import { FastAPIService } from '@/lib/fastapi-services'

🔧 Configuration

Environment Variables

Frontend (.env.local)

# API Configuration
NEXT_PUBLIC_API_URL=http://127.0.0.1:8000
NEXT_PUBLIC_USE_MOCK_API=false
NEXT_PUBLIC_API_TIMEOUT=300000

# Feature Flags
NEXT_PUBLIC_ENABLE_ANALYTICS=true
NEXT_PUBLIC_ENABLE_NOTIFICATIONS=true

# Development
NODE_ENV=development

Backend (.env)

# GCP Configuration (Required for real data)
GCS_BUCKET_NAME=your_gcp_bucket_name
GOOGLE_APPLICATION_CREDENTIALS=path/to/your/gcp-credentials.json

# AI API Keys
GEMINI_API_KEY=your_gemini_api_key_here
GEMINI_API_KEY_2=your_backup_gemini_key_here

# Optional: Other LLM API Keys
OPENAI_API_KEY=your_openai_key_here
CLAUDE_API_KEY=your_claude_key_here
MISTRAL_API_KEY=your_mistral_key_here
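
One way to catch a misconfigured backend early is to check these variables at startup. The sketch below is an assumption about how that could look, not the project's actual loader: `Settings` and `load_settings` are hypothetical names, and treating `GEMINI_API_KEY` as required alongside the GCP variables is a guess based on the Prerequisites section.

```python
import os
from dataclasses import dataclass
from typing import Mapping

# Assumed split between required and optional variables (see lead-in).
REQUIRED = ("GCS_BUCKET_NAME", "GOOGLE_APPLICATION_CREDENTIALS", "GEMINI_API_KEY")
OPTIONAL = ("GEMINI_API_KEY_2", "OPENAI_API_KEY", "CLAUDE_API_KEY", "MISTRAL_API_KEY")


@dataclass(frozen=True)
class Settings:
    required: dict
    optional: dict


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Collect backend settings, failing fast if a required variable is unset or blank."""
    missing = [name for name in REQUIRED if not env.get(name, "").strip()]
    if missing:
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )
    required = {name: env[name].strip() for name in REQUIRED}
    # Optional keys are kept only when they carry a non-blank value.
    optional = {
        name: env[name].strip() for name in OPTIONAL if env.get(name, "").strip()
    }
    return Settings(required, optional)
```

Calling `load_settings()` before the server starts turns a missing key into one clear error message instead of a failure in the middle of a document upload.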

📁 Project Structure

Sebi-Hack-Final/
├── Backend/                    # FastAPI Backend with GCP
│   ├── src/
│   │   ├── pipeline/          # Main processing pipeline
│   │   ├── extraction/        # PDF text extraction
│   │   ├── summerizer/        # Document summarization
│   │   ├── compliance_checker/# Compliance verification
│   │   ├── llm_provider/      # LLM integrations
│   │   └── storage/           # GCP Storage integration
│   │       └── gcs_client.py  # Google Cloud Storage client
│   ├── app.py                 # Main FastAPI application
│   └── requirements.txt       # Python dependencies
│
├── Frontend/                  # Next.js Frontend
│   ├── app/                   # App Router pages
│   ├── components/            # UI components
│   ├── features/              # Feature modules
│   ├── lib/                   # Utilities and services
│   ├── hooks/                 # Custom React hooks
│   └── public/                # Static assets
│
├── API-Documentation.md       # API documentation
├── postman-guide.md          # Postman testing guide
└── README.md                 # This file

🧪 Testing

Frontend Testing

cd Frontend

# Unit tests
npm run test

# E2E tests
npm run test:e2e

# Type checking
npm run type-check

# Linting
npm run lint

Backend Testing

cd Backend

# Run with development mode for auto-reload
python app.py dev

# Check health endpoint
curl http://127.0.0.1:8000/health

# View API documentation (macOS; use xdg-open on Linux, or open the URL in a browser)
open http://127.0.0.1:8000/docs

API Testing with Postman

See postman-guide.md for comprehensive API testing procedures.

☁️ GCP Setup (Required for Real Data)

Prerequisites

Google Cloud Project - Create a GCP project
GCS Bucket - Create a Cloud Storage bucket for document storage
Service Account - Create a service account with Storage Admin permissions
Credentials - Download the service account key JSON file

GCP Configuration

# Set environment variables
export GCS_BUCKET_NAME=your-sebi-compliance-bucket
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account-key.json

# Verify GCP access
python -c "from google.cloud import storage; client = storage.Client(); print('GCP Connected:', client.project)"
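
Once access is verified, storing a document with metadata could look roughly like the sketch below. It is not the project's `gcs_client.py`; `upload_document` and the `documents/` object prefix in the usage note are hypothetical. The `client` argument is expected to be a `google.cloud.storage.Client`, passed in so the function can also be exercised with a stand-in object.

```python
from typing import Any, Mapping


def upload_document(client: Any, bucket_name: str, local_path: str,
                    object_name: str, metadata: Mapping[str, str]) -> str:
    """Upload a PDF to Cloud Storage with custom metadata and return its gs:// URI.

    `client` is expected to behave like google.cloud.storage.Client.
    """
    bucket = client.bucket(bucket_name)
    blob = bucket.blob(object_name)
    # Custom key/value metadata set before the upload is stored with the object.
    blob.metadata = dict(metadata)
    blob.upload_from_filename(local_path, content_type="application/pdf")
    return f"gs://{bucket_name}/{object_name}"
```

A call might look like `upload_document(storage.Client(), os.environ["GCS_BUCKET_NAME"], "contract.pdf", "documents/contract.pdf", {"lang": "en"})`, with `storage` imported from `google.cloud` and `contract.pdf` standing in for a real file.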

🚀 Deployment

Quick Deploy to Vercel

Step 1: Deploy Backend First

Create Backend Project:
- Go to vercel.com
- Click "New Project" → Import from GitHub
- Select adi0900/RegLex-AI repository
- Set root directory to Backend

Configure Backend:
- Framework Preset: Python
- Root Directory: Backend
- Build Command: pip install -r requirements.txt
- Install Command: pip install -r requirements.txt

Add Backend Environment Variables:

ENVIRONMENT=production
FRONTEND_URL=https://reg-lex-ai.vercel.app
GEMINI_API_KEY=your_gemini_api_key_here
GEMINI_API_KEY_2=your_secondary_gemini_api_key_here
GCS_BUCKET_NAME=your_gcp_bucket_name
GOOGLE_APPLICATION_CREDENTIALS={"type":"service_account","project_id":"..."}

Deploy Backend: Your backend will be available at https://reglex-backend.vercel.app (or similar)

Step 2: Deploy Frontend

Create Frontend Project:
- Click "New Project" → Import from GitHub
- Select adi0900/RegLex-AI repository
- Set root directory to Frontend

Configure Frontend:
- Framework Preset: Next.js
- Root Directory: Frontend

Add Frontend Environment Variables:

NEXT_PUBLIC_API_URL=https://reglex-backend.vercel.app
NEXT_PUBLIC_USE_MOCK_API=false
NEXT_PUBLIC_ENABLE_ANALYTICS=true
NEXT_PUBLIC_ENABLE_NOTIFICATIONS=true
NEXT_PUBLIC_API_TIMEOUT=300000

Deploy Frontend: Vercel automatically builds and deploys

Step 3: Verify Deployment

After deployment, verify that both services are working:

Test Backend: Visit https://reglex-backend.vercel.app/health
Test Frontend: Visit https://reg-lex-ai.vercel.app
Test API Connection: Check browser console for CORS errors

Troubleshooting Deployment Issues

If you encounter CORS or connection issues:

Update Backend Environment Variables:
- Go to your backend Vercel project settings
- Add: FRONTEND_URL=https://reg-lex-ai.vercel.app
- Redeploy backend

Check Frontend Environment Variables:
- Ensure NEXT_PUBLIC_API_URL points to your actual backend URL
- Redeploy frontend if changed

Verify API Endpoints:

# Test backend health
curl https://reglex-backend.vercel.app/health

# Test debug endpoint
curl https://reglex-backend.vercel.app/debug

# Test dashboard endpoint
curl https://reglex-backend.vercel.app/api/dashboard/overview

Common Vercel Deployment Issues

Issue: Backend shows as Offline

✅ Solution: Check Vercel function logs for import errors
✅ Debug: Visit /debug endpoint to see Python environment
✅ Fix: Ensure all dependencies are in requirements.txt

Issue: CORS Errors

✅ Solution: Set FRONTEND_URL environment variable in Vercel
✅ Format: https://your-frontend.vercel.app
✅ Redeploy: Required after environment variable changes
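
To make the role of FRONTEND_URL concrete, here is a minimal sketch of how a backend could turn it into a CORS allow-list. This is an assumption about the mechanism, not the project's actual code; `allowed_origins` and `LOCAL_ORIGINS` are hypothetical names. The commented lines show how the result could be handed to FastAPI's `CORSMiddleware`.

```python
import os
from typing import Mapping

# Local development origins, matching the frontend port used in Quick Start.
LOCAL_ORIGINS = ["http://localhost:3001", "http://127.0.0.1:3001"]


def allowed_origins(env: Mapping[str, str] = os.environ) -> list[str]:
    """Build the CORS allow-list from local defaults plus FRONTEND_URL, if set."""
    origins = list(LOCAL_ORIGINS)
    # Browsers send the Origin header without a trailing slash, so strip it here.
    frontend = env.get("FRONTEND_URL", "").strip().rstrip("/")
    if frontend and frontend not in origins:
        origins.append(frontend)
    return origins


# Possible wiring in the FastAPI app (requires fastapi to be installed):
#
#   from fastapi.middleware.cors import CORSMiddleware
#   app.add_middleware(
#       CORSMiddleware,
#       allow_origins=allowed_origins(),
#       allow_methods=["*"],
#       allow_headers=["*"],
#   )
```

Because origins are compared as exact strings, a value such as `https://your-frontend.vercel.app/` with a trailing slash would not match what the browser sends, which is why the sketch normalizes it.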

Issue: Import Errors

✅ Solution: Check pyproject.toml and requirements.txt
✅ Debug: Look at Vercel build logs for missing dependencies
✅ Fix: Add missing packages to requirements.txt

Issue: Function Timeout

✅ Solution: Increase maxDuration in vercel.json
✅ Current: 30 seconds (may need increase for document processing)
✅ Alternative: Optimize processing to complete faster

Alternative Backend Hosting

Railway: Connect GitHub, auto-detects FastAPI
Render: Deploy from GitHub with Python runtime
Google Cloud Run: gcloud run deploy --source Backend

See DEPLOYMENT.md for detailed deployment guide.

Development

# Terminal 1 - Backend with GCP
cd Backend
source venv/bin/activate
python -m uvicorn src.pipeline.run_pipeline:app --host 127.0.0.1 --port 8000 --reload

# Terminal 2 - Frontend
cd Frontend && npm run dev

Production Build

# Frontend production build
cd Frontend
npm run build
npm run start

# Backend production (with gunicorn)
cd Backend
pip install gunicorn
gunicorn -k uvicorn.workers.UvicornWorker src.pipeline.run_pipeline:app --bind 0.0.0.0:8000

📊 Features Deep Dive

Real-time GCP Document Processing Pipeline

PDF Upload: Multi-format file support with GCP storage validation
GCP Storage: Secure document storage with metadata management
Real-time Text Extraction: Advanced PDF parsing using Python pipeline
Live AI Analysis: Multi-LLM analysis for compliance verification
Dynamic Risk Assessment: Automated risk categorization and scoring
Live Dashboard Updates: Real-time statistics from GCP data
Professional Report Generation: Comprehensive compliance reports
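
The multi-LLM step in this pipeline implies some form of provider fallback. The sketch below shows one simple way to do that, trying providers in order and moving on when one fails. It is an illustration under assumptions: `verify_with_fallback`, `AllProvidersFailed`, and the provider callables are hypothetical, and the real pipeline may order or retry providers differently.

```python
from typing import Callable, Sequence

# A provider is a (name, callable) pair; the callable takes a clause and returns a verdict.
Provider = tuple[str, Callable[[str], str]]


class AllProvidersFailed(RuntimeError):
    """Raised when every configured provider raised an error."""


def verify_with_fallback(clause: str, providers: Sequence[Provider]) -> tuple[str, str]:
    """Try each provider in order and return (provider_name, answer) from the first success."""
    errors: list[str] = []
    for name, call in providers:
        try:
            return name, call(clause)
        except Exception as exc:  # any provider error triggers the next fallback
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors) or "no providers configured")
```

Returning the provider name along with the answer makes it possible to record which model produced each clause verdict, which can matter when results from different models are compared later.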

Real-time Monitoring

Backend health status monitoring
Real-time upload progress tracking
Performance metrics and memory usage
Error tracking and logging

Export Capabilities

JSON Export: Complete structured data with all analysis results
CSV Export: Spreadsheet-friendly format for clause analysis
PDF Export: Professional formatted reports for sharing

🤝 Contributing

Fork the repository
Create a feature branch (git checkout -b feature/amazing-feature)
Commit your changes (git commit -m 'Add amazing feature')
Push to the branch (git push origin feature/amazing-feature)
Open a Pull Request

Development Guidelines

Follow TypeScript best practices
Use ESLint and Prettier for code formatting
Add tests for new features
Update documentation for API changes
Ensure CORS compatibility for frontend integration

🐛 Troubleshooting

Common Issues

Backend Connection Issues:

# Check if backend is running
curl http://127.0.0.1:8000/health

# Check backend logs for errors
python app.py dev

Frontend Build Issues:

# Clear cache and reinstall
rm -rf .next node_modules package-lock.json
npm install
npm run dev

CORS Errors:

Verify NEXT_PUBLIC_API_URL in frontend .env.local
Ensure backend CORS configuration includes frontend origin

Upload Errors (422):

Check file format (PDF required)
Verify file size limits
Ensure proper form-data formatting
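
These checks can also be run locally before sending a file. The sketch below is a hypothetical pre-upload validator: `validate_pdf` is not part of the project, and the 10 MiB limit is a placeholder, since the backend's actual size limit is not stated here. The header check is strict and expects the file to begin with `%PDF-`.

```python
from pathlib import Path

# Hypothetical limit; replace with the backend's real limit once known.
MAX_BYTES = 10 * 1024 * 1024


def validate_pdf(path: str, max_bytes: int = MAX_BYTES) -> list[str]:
    """Return a list of problems found; an empty list means the file looks uploadable."""
    file = Path(path)
    if not file.is_file():
        return [f"{path}: file not found"]

    problems: list[str] = []
    if file.suffix.lower() != ".pdf":
        problems.append("file extension is not .pdf")

    size = file.stat().st_size
    if size == 0:
        problems.append("file is empty")
    elif size > max_bytes:
        problems.append(f"file is larger than {max_bytes} bytes")

    # PDF files start with the marker "%PDF-" followed by the version number.
    with file.open("rb") as handle:
        if handle.read(5) != b"%PDF-":
            problems.append("missing %PDF- header")
    return problems
```

Running this before an upload separates client-side problems (wrong file type, oversized file) from form-data formatting issues, which narrows down the cause of a 422 response.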

📜 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

SEBI for compliance standards and regulations
Google Gemini for AI-powered document analysis
Next.js and FastAPI for excellent frameworks
Shadcn UI for beautiful, accessible components

👥 Team

The RegLex AI project was developed by the following team:

Core Development Team

Aditya - Frontend Developer & Team Leader

Leading frontend development, UI/UX and project management.
Specializes in React, Next.js, TypeScript, FastAPI.
Contact: adi1423tya@gmail.com

Nilam - Lead AI Engineer & Backend Developer

Expert in Machine Learning & NLP systems
Specialized in legal-domain AI and language model fine-tuning
Backend architecture, compliance analysis, and risk modeling

Suriya - AI/ML Developer

Risk Assessment & Analysis specialist
Former SEBI officer with deep regulatory knowledge
AI pipeline implementation

Ivan Nilesh - AI/ML Developer

Machine Learning algorithms and model development

Vrithika - Presentation

Final Presentation Overview.

Project Contact

Email: adi1423tya@gmail.com
Phone: +91-9695882854
Location: Jaipur, India

Built with ❤️ by the RegLex AI Team - September 2025

📞 Support

For support and questions:

Check the API documentation (API-Documentation.md)
Review the Postman testing guide (postman-guide.md)
Check backend logs for detailed error information
Verify environment configuration

Built with ❤️ for SEBI Compliance - September 2025
