Profile Analysis System (PAS)

AI-powered candidate classification system using rule-based filtering and LLMs for intelligent talent screening.


Overview

This system implements a hybrid two-stage candidate classification pipeline:

  1. Stage 1: Rule-Based Filtering - Uses LLM-extracted rules (GPT-5) derived from the tier definitions to filter candidates matching Tier 1, 2, or 3 criteria (targets ~80% retention)
  2. Stage 2: LLM Classification - Uses GPT-5-mini to perform detailed tier evaluation using tier rubric
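The two stages above can be sketched in plain Python. This is an illustrative outline only, not the actual backend code: the real implementation runs asynchronously in Celery tasks, and the function names here (`rule_based_filter`, `classify`) are invented for the sketch.

```python
# Illustrative sketch of the hybrid two-stage pipeline (not the real module API).

def rule_based_filter(candidates, rules):
    """Stage 1: keep candidates whose profile matches any extracted rule."""
    kept, rejected = [], []
    for c in candidates:
        if any(rule(c) for rule in rules):
            kept.append(c)
        else:
            rejected.append(c)  # filtered out; auto-assigned Tier 4 downstream
    return kept, rejected

def classify(candidates, rules, llm_classify):
    """Stage 2: send only the retained candidates to the LLM for tier evaluation."""
    kept, rejected = rule_based_filter(candidates, rules)
    results = {c["id"]: llm_classify(c) for c in kept}   # detailed rubric-based call
    results.update({c["id"]: 4 for c in rejected})        # Tier 4: filtered out
    return results
```

The key design point is cost control: only the ~80% of candidates that survive the cheap rule pass incur an LLM call.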

Features

  • πŸš€ Hybrid approach combining rule-based filtering with LLM classification for optimal performance and accuracy
  • πŸ“Š Real-time progress tracking via Server-Sent Events (SSE)
  • πŸ”„ Asynchronous processing with Celery task queue
  • 🎯 Tier-based rubric for consistent candidate evaluation
  • 🎨 Modern UI with Next.js 15, React 19, and Tailwind CSS 4
  • 🐳 Docker Compose for easy local development
  • ☁️ Production-ready architecture for AWS deployment

Tech Stack

Backend

  • Python 3.13 with uv package manager
  • FastAPI for REST API
  • SQLModel for ORM
  • Celery for async task processing
  • PostgreSQL 17 for data storage
  • Redis for caching and task queue
  • OpenAI for rule extraction (GPT-5) and classification (GPT-5-mini)

Frontend

  • Next.js 15 with App Router
  • React 19 with TypeScript 5.x
  • Tailwind CSS 4.x for styling
  • shadcn/ui component library
  • React Dropzone for file uploads
  • pnpm package manager

Quick Start

Prerequisites

  • Docker and Docker Compose
  • OpenAI API key
  • 8GB+ RAM (for running all services)

Setup

  1. Clone the repository:
git clone <repository-url>
cd pas
  2. Create the environment file:
cp .env.example .env
# Edit .env and add your OPENAI_API_KEY
  3. Start all services:
docker-compose up -d
  4. Access the application:
  • Frontend: http://localhost:3000
  • Backend API: http://localhost:8000

Development

The Docker Compose setup is configured for hot reload on both backend and frontend:

Backend Development

  • Code changes in backend/app/ are automatically detected
  • Backend API restarts automatically with --reload flag
  • No rebuild needed

Frontend Development

  • Code changes in frontend/ are automatically detected
  • Next.js dev server with Turbopack hot reload
  • No rebuild needed - just edit and save!

Note: Both services use volume mounts for instant code updates. Just edit your files and the changes will be reflected immediately!

Local Development (without Docker)

If you prefer to run services locally without Docker:

Backend:

cd backend
uv sync
uv run uvicorn app.main:app --reload

Frontend:

cd frontend
pnpm install
pnpm dev

Usage

  1. Upload CSV: Select a CSV file containing candidate LinkedIn profiles
  2. Configure Tier Criteria: Define tier definitions with name, title examples, description, and key indicators for Tiers 1-3
  3. Processing: Monitor real-time progress as AI filters and classifies candidates
  4. Download Results: Get classified candidates with tier assignments, scores, and reasoning

CSV Format

The system expects CSV files with the following key fields:

  • first_name, last_name
  • headline - Current professional title
  • about - Professional summary
  • current_position, company_name
  • person_skills - Comma-separated skills
  • person_industry
  • education_experience
  • previous_position_1, previous_company_1, etc.

Example: See backend/test_data/SalesQL_contacts_export-2025-09-30-093158.csv
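For a quick sanity check of the expected layout, the snippet below builds a one-row CSV with the documented field names using Python's standard `csv` module. The row values are invented for illustration; only the column names come from the list above.

```python
import csv
import io

FIELDS = [
    "first_name", "last_name", "headline", "about",
    "current_position", "company_name", "person_skills",
    "person_industry", "education_experience",
    "previous_position_1", "previous_company_1",
]

# Build a minimal example CSV in memory (values are made up).
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=FIELDS)
writer.writeheader()
writer.writerow({
    "first_name": "Jane", "last_name": "Doe",
    "headline": "Enterprise Account Executive",
    "about": "10 years closing mid-market and enterprise deals.",
    "current_position": "Senior AE", "company_name": "Acme SaaS",
    "person_skills": "Outbound, SaaS, Negotiation",  # comma-separated skills
    "person_industry": "Software",
    "education_experience": "BSc, State University",
    "previous_position_1": "AE", "previous_company_1": "StartupCo",
})

# Read it back the way the backend would parse an upload.
buf.seek(0)
rows = list(csv.DictReader(buf))
```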

Architecture

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚   Frontend      β”‚ Next.js 15 (Port 3000)
β”‚   (React 19)    β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”˜
         β”‚
         β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚   FastAPI       β”‚ Backend API (Port 8000)
β”‚   REST API      β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”˜
         β”‚
    β”Œβ”€β”€β”€β”€β”΄β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
    β”‚         β”‚            β”‚         β”‚
    β–Ό         β–Ό            β–Ό         β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”   β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”
β”‚Postgresβ”‚ β”‚Redisβ”‚   β”‚ Celery  β”‚ β”‚OpenAIβ”‚
β”‚   17   β”‚ β”‚     β”‚   β”‚ Worker  β”‚ β”‚ API  β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”˜   β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”˜

Classification Tiers

Tier definitions are customizable based on your job requirements. Example:

  • Tier 1: Mid-Market Hunters / GTM Builders - Fast new logo hunters with 6-10+ years experience
  • Tier 2: Enterprise AE (Execution-Heavy) - Experienced with enterprise deals but less GTM-led
  • Tier 3: Too Senior, Too Junior, or Misaligned Roles - Not a fit for the motion
  • Tier 4: Completely irrelevant - Auto-assigned to candidates filtered out by rule-based filtering
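A tier definition combines the fields the UI asks for (name, title examples, description, key indicators). The structure below is an illustrative guess at that shape, not the actual API schema; check the backend models for the real field names.

```python
# Illustrative tier-definition structure (field names are assumptions
# based on the inputs listed in the Usage section; the real schema may differ).
tier_definitions = [
    {
        "tier": 1,
        "name": "Mid-Market Hunters / GTM Builders",
        "title_examples": ["Account Executive", "Founding AE"],
        "description": "Fast new logo hunters with 6-10+ years experience",
        "key_indicators": ["new logo", "outbound", "quota overachievement"],
    },
    {
        "tier": 2,
        "name": "Enterprise AE (Execution-Heavy)",
        "title_examples": ["Enterprise Account Executive"],
        "description": "Experienced with enterprise deals but less GTM-led",
        "key_indicators": ["enterprise", "long sales cycles"],
    },
]
```

Tier 4 never appears in the definitions: it is assigned automatically to candidates rejected by the rule-based filter.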

API Endpoints

  • POST /api/jobs/upload - Upload CSV and create classification job
  • GET /api/jobs/{job_id}/status - Get job status
  • GET /api/jobs/{job_id}/progress - Stream real-time progress (SSE)
  • GET /api/jobs/{job_id}/download - Download results CSV
  • GET /api/jobs - List all jobs
  • GET /health - Health check
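The progress endpoint streams Server-Sent Events, where each event arrives as a `data: <payload>` line. The helper below parses one such line; the JSON payload fields shown in the test are an assumption, so inspect the live stream for the actual shape.

```python
import json

BASE = "http://localhost:8000"  # assumed local backend address (API on port 8000)

# Documented endpoints, for reference:
#   POST {BASE}/api/jobs/upload             -> upload CSV, create job
#   GET  {BASE}/api/jobs/{job_id}/status    -> poll job status
#   GET  {BASE}/api/jobs/{job_id}/progress  -> SSE progress stream
#   GET  {BASE}/api/jobs/{job_id}/download  -> download results CSV

def parse_sse_event(line: str):
    """Parse one line of an SSE stream.

    Returns the decoded JSON payload for 'data:' lines, and None for
    comments/keep-alives (lines starting with ':') or other field lines.
    """
    if line.startswith("data:"):
        return json.loads(line[len("data:"):].strip())
    return None
```

A client would iterate over the response lines of `GET /api/jobs/{job_id}/progress` and feed each one through `parse_sse_event`, updating a progress bar on every non-None result.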

Performance

  • Rule Extraction: ~2-5 seconds per job (one-time, using GPT-5)
  • Rule-Based Filtering: ~500-1000 candidates/second
  • LLM Classification: ~5-10 candidates/second (using GPT-5-mini)
  • Overall: ~1000 candidates in 2-4 minutes (depending on OpenAI API rate limits and filter retention rate)
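These numbers can be sanity-checked with back-of-envelope arithmetic: at ~80% retention, roughly 800 of 1,000 candidates reach Stage 2, which dominates the runtime. The quoted filter and classification rates come from the list above; everything else is arithmetic.

```python
# Back-of-envelope check of the quoted throughput figures.
total = 1000
retention = 0.80                      # Stage 1 keeps ~80% of candidates
filter_rate = 500                     # candidates/second (conservative end)
llm_rate_low, llm_rate_high = 5, 10   # candidates/second for GPT-5-mini

stage1_s = total / filter_rate        # 2 s to filter everyone
stage2_n = total * retention          # 800 candidates go to the LLM

fast_s = stage1_s + stage2_n / llm_rate_high   # best case, ~82 s
slow_s = stage1_s + stage2_n / llm_rate_low    # worst case, ~162 s
```

That yields roughly 1.4-2.7 minutes of raw processing, consistent with the quoted 2-4 minutes once rule extraction (~2-5 s) and OpenAI rate-limit backoff are added on top.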

Production Deployment

AWS Architecture

  1. Frontend: Deploy to AWS Amplify or Vercel
  2. Backend API: Deploy to ECS Fargate
  3. Celery Worker: Deploy to ECS Fargate (auto-scaling)
  4. Database: RDS PostgreSQL 17
  5. Cache: ElastiCache Redis
  6. Storage: S3 for file uploads and results

Environment Variables

See .env.example for required configuration.

Development

Pre-commit Hooks

# Install pre-commit hooks (run once)
pre-commit install

# Run all checks manually
pre-commit run --all-files

Code Quality

Backend:

  • Linting: Ruff
  • Type Checking: Pyright
  • Formatting: Ruff format

Frontend:

  • Linting: ESLint with TypeScript and Next.js rules
  • Type Checking: TypeScript strict mode
  • Formatting: Prettier with Tailwind CSS plugin

Quick commands:

# Backend
cd backend && uv run pre-commit run --all-files

# Frontend
cd frontend && pnpm quality

# All checks (from project root)
pre-commit run --all-files

Troubleshooting

Common Issues

  1. OpenAI Rate Limits: Adjust CONCURRENT_REQUESTS and LLM_BATCH_SIZE in services
  2. Memory Issues: Increase Docker memory allocation or reduce batch sizes
  3. Database Connection: Ensure PostgreSQL is running and pgvector extension is enabled

Logs

# View all logs
docker-compose logs -f

# View specific service logs
docker-compose logs -f api
docker-compose logs -f worker

License

Proprietary - Profile Analysis System
