AI-powered candidate classification system using rule-based filtering and LLMs for intelligent talent screening.
This system implements a hybrid two-stage candidate classification pipeline:
- Stage 1: Rule-Based Filtering - Uses LLM-extracted rules (GPT-5) derived from the tier definitions to filter candidates matching Tier 1, 2, or 3 criteria (targets ~80% retention)
- Stage 2: LLM Classification - Uses GPT-5-mini to perform a detailed tier evaluation against the tier rubric (a minimal sketch of the full two-stage flow follows the feature list below)
- Hybrid approach combining rule-based filtering with LLM classification for optimal performance and accuracy
- Real-time progress tracking via Server-Sent Events (SSE)
- Asynchronous processing with Celery task queue
- Tier-based rubric for consistent candidate evaluation
- Modern UI with Next.js 15, React 19, and Tailwind CSS 4
- Docker Compose for easy local development
- Production-ready architecture for AWS deployment
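The sketch below (Python, using the official `openai` client) shows how the two stages fit together; the function names, prompts, and keyword filter are illustrative assumptions, not the project's actual internals, while the model IDs come from this README:

```python
# Illustrative two-stage pipeline; matches_rules stands in for the
# GPT-5-extracted rules, and prompts/field names are assumptions.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def matches_rules(candidate: dict, keywords: set[str]) -> bool:
    """Stage 1: cheap rule check standing in for the LLM-extracted rules."""
    text = " ".join(str(v) for v in candidate.values()).lower()
    return any(k in text for k in keywords)


def classify(candidate: dict, rubric: str) -> str:
    """Stage 2: detailed tier evaluation with GPT-5-mini against the rubric."""
    resp = client.chat.completions.create(
        model="gpt-5-mini",
        messages=[{
            "role": "user",
            "content": f"Rubric:\n{rubric}\n\nCandidate:\n{candidate}\n\nReply with a tier (1-3).",
        }],
    )
    return resp.choices[0].message.content


def run_pipeline(candidates: list[dict], keywords: set[str], rubric: str) -> list[dict]:
    results = []
    for c in candidates:
        if matches_rules(c, keywords):                        # Stage 1 survivor
            results.append({**c, "tier": classify(c, rubric)})
        else:
            results.append({**c, "tier": "4"})                # auto Tier 4 (see Tier Definitions)
    return results
```

Candidates filtered out in Stage 1 skip the LLM call entirely, which is where the hybrid approach saves most of its cost and latency.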
- Python 3.13 with uv package manager
- FastAPI for REST API
- SQLModel for ORM
- Celery for async task processing
- PostgreSQL 17 for data storage
- Redis for caching and task queue
- OpenAI for rule extraction (GPT-5) and classification (GPT-5-mini)
- Next.js 15 with App Router
- React 19 with TypeScript 5.x
- Tailwind CSS 4.x for styling
- shadcn/ui component library
- React Dropzone for file uploads
- pnpm package manager
- Docker and Docker Compose
- OpenAI API key
- 8GB+ RAM (for running all services)
- Clone the repository:

```bash
git clone <repository-url>
cd pas
```

- Create environment file:

```bash
cp .env.example .env
# Edit .env and add your OPENAI_API_KEY
```

- Start all services:

```bash
docker-compose up -d
```

- Access the application:
- Frontend: http://localhost:3000
- Backend API: http://localhost:8000
- API Docs: http://localhost:8000/docs
The Docker Compose setup is configured for hot reload on both backend and frontend:
- Backend: code changes in `backend/app/` are detected automatically, and the API restarts via the `--reload` flag - no rebuild needed
- Frontend: code changes in `frontend/` are detected automatically, and the Next.js dev server hot-reloads with Turbopack - no rebuild needed, just edit and save!
Note: Both services use volume mounts for instant code updates. Just edit your files and the changes will be reflected immediately!
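A hypothetical `docker-compose.yml` excerpt showing what those volume mounts might look like (actual service names and container paths may differ):

```yaml
# Hypothetical excerpt; only the mounts relevant to hot reload are shown.
services:
  api:
    volumes:
      - ./backend/app:/app/app   # backend source mounted so --reload picks up edits
  frontend:
    volumes:
      - ./frontend:/app          # frontend source mounted for Turbopack HMR
```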
If you prefer to run services locally without Docker:
Backend:

```bash
cd backend
uv sync
uv run uvicorn app.main:app --reload
```

Frontend:

```bash
cd frontend
pnpm install
pnpm dev
```

- Upload CSV: Select a CSV file containing candidate LinkedIn profiles
- Configure Tier Criteria: Define tier definitions with name, title examples, description, and key indicators for Tiers 1-3
- Processing: Monitor real-time progress as AI filters and classifies candidates
- Download Results: Get classified candidates with tier assignments, scores, and reasoning
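The same workflow can be driven programmatically over the REST API (endpoints listed under API Endpoints below). A sketch using `requests`; the response field names (`id`, `status`) and the `"completed"` value are assumptions:

```python
# Sketch: upload a CSV, wait for the job, download results.
import time

import requests

BASE = "http://localhost:8000"

# 1. Upload the CSV and create a classification job
with open("candidates.csv", "rb") as f:
    job = requests.post(f"{BASE}/api/jobs/upload", files={"file": f}).json()
job_id = job["id"]  # assumed response field

# 2. Poll job status (the SSE endpoint offers real-time progress instead)
while requests.get(f"{BASE}/api/jobs/{job_id}/status").json().get("status") != "completed":
    time.sleep(5)

# 3. Download the classified results CSV
with open("results.csv", "wb") as out:
    out.write(requests.get(f"{BASE}/api/jobs/{job_id}/download").content)
```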
The system expects CSV files with the following key fields:
- `first_name`, `last_name`
- `headline` - Current professional title
- `about` - Professional summary
- `current_position`, `company_name`
- `person_skills` - Comma-separated skills
- `person_industry`
- `education_experience`
- `previous_position_1`, `previous_company_1`, etc.
Example: See backend/test_data/SalesQL_contacts_export-2025-09-30-093158.csv
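A minimal sketch of reading those fields with the standard-library `csv` module; the derived `candidate` shape is an assumption, not the project's actual model:

```python
# Load the key CSV fields into simple candidate dicts.
import csv

with open("backend/test_data/SalesQL_contacts_export-2025-09-30-093158.csv", newline="") as f:
    for row in csv.DictReader(f):
        candidate = {
            "name": f"{row.get('first_name', '')} {row.get('last_name', '')}".strip(),
            "headline": row.get("headline", ""),
            "about": row.get("about", ""),
            "skills": [s.strip() for s in row.get("person_skills", "").split(",") if s.strip()],
        }
        print(candidate["name"], "-", candidate["headline"])
```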
```
┌─────────────────┐
│    Frontend     │  Next.js 15 (Port 3000)
│   (React 19)    │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│     FastAPI     │  Backend API (Port 8000)
│    REST API     │
└────────┬────────┘
         │
    ┌────┴────┬──────────┬───────────┐
    │         │          │           │
    ▼         ▼          ▼           ▼
┌────────┐ ┌─────┐ ┌───────────┐ ┌────────┐
│Postgres│ │Redis│ │  Celery   │ │ OpenAI │
│   17   │ │     │ │  Worker   │ │  API   │
└────────┘ └─────┘ └───────────┘ └────────┘
```
Tier definitions are customizable based on your job requirements. Example:
- Tier 1: Mid-Market Hunters / GTM Builders - Fast new logo hunters with 6-10+ years experience
- Tier 2: Enterprise AE (Execution-Heavy) - Experienced with enterprise deals but less GTM-led
- Tier 3: Too Senior, Too Junior, or Misaligned Roles - Not a fit for the motion
- Tier 4: Completely irrelevant - Auto-assigned to candidates filtered out by rule-based filtering
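A hypothetical tier-definition payload mirroring the fields named in the Usage section (name, title examples, description, key indicators); the exact schema the API expects is an assumption:

```python
# Illustrative tier definitions; field names follow the Usage section,
# values are examples only.
tier_definitions = [
    {
        "name": "Tier 1: Mid-Market Hunters / GTM Builders",
        "title_examples": ["Account Executive", "Founding AE", "Head of Sales"],
        "description": "Fast new-logo hunters with 6-10+ years of experience.",
        "key_indicators": ["new logo acquisition", "GTM building", "mid-market quota"],
    },
    # Tier 2 and Tier 3 follow the same shape; Tier 4 is assigned automatically.
]
```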
- `POST /api/jobs/upload` - Upload CSV and create classification job
- `GET /api/jobs/{job_id}/status` - Get job status
- `GET /api/jobs/{job_id}/progress` - Stream real-time progress (SSE)
- `GET /api/jobs/{job_id}/download` - Download results CSV
- `GET /api/jobs` - List all jobs
- `GET /health` - Health check
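A sketch of consuming the SSE progress stream with `httpx` (not necessarily a project dependency); the event payload shape is an assumption:

```python
# Stream server-sent progress events for a job and print each data line.
import httpx

job_id = "..."  # returned by POST /api/jobs/upload

with httpx.stream("GET", f"http://localhost:8000/api/jobs/{job_id}/progress", timeout=None) as r:
    for line in r.iter_lines():
        if line.startswith("data:"):
            print(line.removeprefix("data:").strip())  # e.g. {"processed": 120, "total": 1000}
```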
- Rule Extraction: ~2-5 seconds per job (one-time, using GPT-5)
- Rule-Based Filtering: ~500-1000 candidates/second
- LLM Classification: ~5-10 candidates/second (using GPT-5-mini)
- Overall: ~1000 candidates in 2-4 minutes (depending on OpenAI API rate limits and filter retention rate)
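As a back-of-the-envelope check, the stage rates above roughly reproduce the overall figure:

```python
# Rough estimate for 1,000 candidates using the quoted rates.
candidates = 1000
rule_extraction_s = 5                  # one-time GPT-5 call, ~2-5 s
filtering_s = candidates / 500         # ~500-1000 candidates/s -> <= 2 s
retained = candidates * 0.8            # ~80% retention after Stage 1
classification_s = retained / 5        # worst case ~5 candidates/s -> 160 s
total_min = (rule_extraction_s + filtering_s + classification_s) / 60
print(f"~{total_min:.1f} minutes")     # ~2.8 minutes, within the quoted 2-4 min range
```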
- Frontend: Deploy to AWS Amplify or Vercel
- Backend API: Deploy to ECS Fargate
- Celery Worker: Deploy to ECS Fargate (auto-scaling)
- Database: RDS PostgreSQL 17
- Cache: ElastiCache Redis
- Storage: S3 for file uploads and results
See .env.example for required configuration.
```bash
# Install pre-commit hooks (run once)
pre-commit install

# Run all checks manually
pre-commit run --all-files
```

Backend:
- Linting: Ruff
- Type Checking: Pyright
- Formatting: Ruff format
Frontend:
- Linting: ESLint with TypeScript and Next.js rules
- Type Checking: TypeScript strict mode
- Formatting: Prettier with Tailwind CSS plugin
Quick commands:
```bash
# Backend
cd backend && uv run pre-commit run --all-files

# Frontend
cd frontend && pnpm quality

# All checks (from project root)
pre-commit run --all-files
```

- OpenAI Rate Limits: Adjust `CONCURRENT_REQUESTS` and `LLM_BATCH_SIZE` in the services
- Memory Issues: Increase Docker memory allocation or reduce batch sizes
- Database Connection: Ensure PostgreSQL is running and the pgvector extension is enabled
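Illustrative `.env` values for the rate-limit knobs mentioned above; the variable names come from this section, while the numbers are placeholders rather than project defaults:

```bash
# Placeholder tuning values - lower them if you hit OpenAI rate limits.
CONCURRENT_REQUESTS=5
LLM_BATCH_SIZE=20
```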
```bash
# View all logs
docker-compose logs -f

# View specific service logs
docker-compose logs -f api
docker-compose logs -f worker
```

Proprietary - Profile Analysis System
- Tam Nguyen (npt.dc@outlook.com)