AI-powered candidate classification system using rule-based filtering and LLMs for intelligent talent screening.
This system implements a hybrid two-stage candidate classification pipeline:
- Stage 1: Rule-Based Filtering - Uses LLM-extracted rules (GPT-5) derived from the tier definitions to filter candidates matching Tier 1, 2, or 3 criteria (targets ~80% retention)
- Stage 2: LLM Classification - Uses GPT-5-mini to perform a detailed tier evaluation against the tier rubric (a minimal sketch of the full two-stage flow follows the feature list below)
- Hybrid approach combining rule-based filtering with LLM classification for optimal performance and accuracy
- Real-time progress tracking via Server-Sent Events (SSE)
- Asynchronous processing with Celery task queue
- Tier-based rubric for consistent candidate evaluation
- Modern UI with Next.js 15, React 19, and Tailwind CSS 4
- Docker Compose for easy local development
- Production-ready architecture for AWS deployment
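The sketch below (Python, using the official `openai` client) shows how the two stages fit together; the function names, prompts, and keyword filter are illustrative assumptions, not the project's actual internals, while the model IDs come from this README:

```python
# Illustrative two-stage pipeline; matches_rules stands in for the
# GPT-5-extracted rules, and prompts/field names are assumptions.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def matches_rules(candidate: dict, keywords: set[str]) -> bool:
    """Stage 1: cheap rule check standing in for the LLM-extracted rules."""
    text = " ".join(str(v) for v in candidate.values()).lower()
    return any(k in text for k in keywords)


def classify(candidate: dict, rubric: str) -> str:
    """Stage 2: detailed tier evaluation with GPT-5-mini against the rubric."""
    resp = client.chat.completions.create(
        model="gpt-5-mini",
        messages=[{
            "role": "user",
            "content": f"Rubric:\n{rubric}\n\nCandidate:\n{candidate}\n\nReply with a tier (1-3).",
        }],
    )
    return resp.choices[0].message.content


def run_pipeline(candidates: list[dict], keywords: set[str], rubric: str) -> list[dict]:
    results = []
    for c in candidates:
        if matches_rules(c, keywords):                        # Stage 1 survivor
            results.append({**c, "tier": classify(c, rubric)})
        else:
            results.append({**c, "tier": "4"})                # auto Tier 4 (see Tier Definitions)
    return results
```

Candidates filtered out in Stage 1 skip the LLM call entirely, which is where the hybrid approach saves most of its cost and latency.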
- Python 3.13 with uv package manager
- FastAPI for REST API
- SQLModel for ORM
- Celery for async task processing
- PostgreSQL 17 for data storage
- Redis for caching and task queue
- OpenAI for rule extraction (GPT-5) and classification (GPT-5-mini)
- Next.js 15 with App Router
- React 19 with TypeScript 5.x
- Tailwind CSS 4.x for styling
- shadcn/ui component library
- React Dropzone for file uploads
- pnpm package manager
- Docker and Docker Compose
- OpenAI API key
- 8GB+ RAM (for running all services)
- Clone the repository:

```bash
git clone <repository-url>
cd pas
```

- Create environment file:

```bash
cp .env.example .env
# Edit .env and add your OPENAI_API_KEY
```

- Start all services:

```bash
docker-compose up -d
```

- Access the application:
- Frontend: http://localhost:3000
- Backend API: http://localhost:8000
- API Docs: http://localhost:8000/docs
The Docker Compose setup is configured for hot reload on both backend and frontend:
- Backend: code changes in `backend/app/` are detected automatically, and the API restarts via the `--reload` flag - no rebuild needed
- Frontend: code changes in `frontend/` are detected automatically, and the Next.js dev server hot-reloads with Turbopack - no rebuild needed, just edit and save!
Note: Both services use volume mounts for instant code updates. Just edit your files and the changes will be reflected immediately!
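A hypothetical `docker-compose.yml` excerpt showing what those volume mounts might look like (actual service names and container paths may differ):

```yaml
# Hypothetical excerpt; only the mounts relevant to hot reload are shown.
services:
  api:
    volumes:
      - ./backend/app:/app/app   # backend source mounted so --reload picks up edits
  frontend:
    volumes:
      - ./frontend:/app          # frontend source mounted for Turbopack HMR
```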
If you prefer to run services locally without Docker:
Backend:

```bash
cd backend
uv sync
uv run uvicorn app.main:app --reload
```

Frontend:

```bash
cd frontend
pnpm install
pnpm dev
```

- Upload CSV: Select a CSV file containing candidate LinkedIn profiles
- Configure Tier Criteria: Define tier definitions with name, title examples, description, and key indicators for Tiers 1-3
- Processing: Monitor real-time progress as AI filters and classifies candidates
- Download Results: Get classified candidates with tier assignments, scores, and reasoning
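The same workflow can be driven programmatically over the REST API (endpoints listed under API Endpoints below). A sketch using `requests`; the response field names (`id`, `status`) and the `"completed"` value are assumptions:

```python
# Sketch: upload a CSV, wait for the job, download results.
import time

import requests

BASE = "http://localhost:8000"

# 1. Upload the CSV and create a classification job
with open("candidates.csv", "rb") as f:
    job = requests.post(f"{BASE}/api/jobs/upload", files={"file": f}).json()
job_id = job["id"]  # assumed response field

# 2. Poll job status (the SSE endpoint offers real-time progress instead)
while requests.get(f"{BASE}/api/jobs/{job_id}/status").json().get("status") != "completed":
    time.sleep(5)

# 3. Download the classified results CSV
with open("results.csv", "wb") as out:
    out.write(requests.get(f"{BASE}/api/jobs/{job_id}/download").content)
```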
The system expects CSV files with the following key fields:
- `first_name`, `last_name`
- `headline` - Current professional title
- `about` - Professional summary
- `current_position`, `company_name`
- `person_skills` - Comma-separated skills
- `person_industry`
- `education_experience`
- `previous_position_1`, `previous_company_1`, etc.
Example: See backend/test_data/SalesQL_contacts_export-2025-09-30-093158.csv
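A minimal sketch of reading those fields with the standard-library `csv` module; the derived `candidate` shape is an assumption, not the project's actual model:

```python
# Load the key CSV fields into simple candidate dicts.
import csv

with open("backend/test_data/SalesQL_contacts_export-2025-09-30-093158.csv", newline="") as f:
    for row in csv.DictReader(f):
        candidate = {
            "name": f"{row.get('first_name', '')} {row.get('last_name', '')}".strip(),
            "headline": row.get("headline", ""),
            "about": row.get("about", ""),
            "skills": [s.strip() for s in row.get("person_skills", "").split(",") if s.strip()],
        }
        print(candidate["name"], "-", candidate["headline"])
```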
```
┌─────────────────┐
│    Frontend     │  Next.js 15 (Port 3000)
│   (React 19)    │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│     FastAPI     │  Backend API (Port 8000)
│    REST API     │
└────────┬────────┘
         │
    ┌────┴────┬──────────┬───────────┐
    │         │          │           │
    ▼         ▼          ▼           ▼
┌────────┐ ┌─────┐ ┌───────────┐ ┌────────┐
│Postgres│ │Redis│ │  Celery   │ │ OpenAI │
│   17   │ │     │ │  Worker   │ │  API   │
└────────┘ └─────┘ └───────────┘ └────────┘
```
Tier definitions are customizable based on your job requirements. Example:
- Tier 1: Mid-Market Hunters / GTM Builders - Fast new logo hunters with 6-10+ years experience
- Tier 2: Enterprise AE (Execution-Heavy) - Experienced with enterprise deals but less GTM-led
- Tier 3: Too Senior, Too Junior, or Misaligned Roles - Not a fit for the motion
- Tier 4: Completely irrelevant - Auto-assigned to candidates filtered out by rule-based filtering
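A hypothetical tier-definition payload mirroring the fields named in the Usage section (name, title examples, description, key indicators); the exact schema the API expects is an assumption:

```python
# Illustrative tier definitions; field names follow the Usage section,
# values are examples only.
tier_definitions = [
    {
        "name": "Tier 1: Mid-Market Hunters / GTM Builders",
        "title_examples": ["Account Executive", "Founding AE", "Head of Sales"],
        "description": "Fast new-logo hunters with 6-10+ years of experience.",
        "key_indicators": ["new logo acquisition", "GTM building", "mid-market quota"],
    },
    # Tier 2 and Tier 3 follow the same shape; Tier 4 is assigned automatically.
]
```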
- `POST /api/jobs/upload` - Upload CSV and create classification job
- `GET /api/jobs/{job_id}/status` - Get job status
- `GET /api/jobs/{job_id}/progress` - Stream real-time progress (SSE)
- `GET /api/jobs/{job_id}/download` - Download results CSV
- `GET /api/jobs` - List all jobs
- `GET /health` - Health check
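A sketch of consuming the SSE progress stream with `httpx` (not necessarily a project dependency); the event payload shape is an assumption:

```python
# Stream server-sent progress events for a job and print each data line.
import httpx

job_id = "..."  # returned by POST /api/jobs/upload

with httpx.stream("GET", f"http://localhost:8000/api/jobs/{job_id}/progress", timeout=None) as r:
    for line in r.iter_lines():
        if line.startswith("data:"):
            print(line.removeprefix("data:").strip())  # e.g. {"processed": 120, "total": 1000}
```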
- Rule Extraction: ~2-5 seconds per job (one-time, using GPT-5)
- Rule-Based Filtering: ~500-1000 candidates/second
- LLM Classification: ~5-10 candidates/second (using GPT-5-mini)
- Overall: ~1000 candidates in 2-4 minutes (depending on OpenAI API rate limits and filter retention rate)
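As a back-of-the-envelope check, the stage rates above roughly reproduce the overall figure:

```python
# Rough estimate for 1,000 candidates using the quoted rates.
candidates = 1000
rule_extraction_s = 5                  # one-time GPT-5 call, ~2-5 s
filtering_s = candidates / 500         # ~500-1000 candidates/s -> <= 2 s
retained = candidates * 0.8            # ~80% retention after Stage 1
classification_s = retained / 5        # worst case ~5 candidates/s -> 160 s
total_min = (rule_extraction_s + filtering_s + classification_s) / 60
print(f"~{total_min:.1f} minutes")     # ~2.8 minutes, within the quoted 2-4 min range
```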
- Frontend: Deploy to AWS Amplify or Vercel
- Backend API: Deploy to ECS Fargate
- Celery Worker: Deploy to ECS Fargate (auto-scaling)
- Database: RDS PostgreSQL 17
- Cache: ElastiCache Redis
- Storage: S3 for file uploads and results
See .env.example for required configuration.
```bash
# Install pre-commit hooks (run once)
pre-commit install

# Run all checks manually
pre-commit run --all-files
```

Backend:
- Linting: Ruff
- Type Checking: Pyright
- Formatting: Ruff format
Frontend:
- Linting: ESLint with TypeScript and Next.js rules
- Type Checking: TypeScript strict mode
- Formatting: Prettier with Tailwind CSS plugin
Quick commands:
```bash
# Backend
cd backend && uv run pre-commit run --all-files

# Frontend
cd frontend && pnpm quality

# All checks (from project root)
pre-commit run --all-files
```

- OpenAI Rate Limits: Adjust `CONCURRENT_REQUESTS` and `LLM_BATCH_SIZE` in the services
- Memory Issues: Increase Docker memory allocation or reduce batch sizes
- Database Connection: Ensure PostgreSQL is running and the pgvector extension is enabled
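Illustrative `.env` values for the rate-limit knobs mentioned above; the variable names come from this section, while the numbers are placeholders rather than project defaults:

```bash
# Placeholder tuning values - lower them if you hit OpenAI rate limits.
CONCURRENT_REQUESTS=5
LLM_BATCH_SIZE=20
```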
```bash
# View all logs
docker-compose logs -f

# View specific service logs
docker-compose logs -f api
docker-compose logs -f worker
```

Proprietary - Profile Analysis System
- Tam Nguyen (npt.dc@outlook.com)