Powered by RetroCast
SynthArena is an open-source web platform for evaluating and comparing AI-driven retrosynthesis models. It provides interactive visualization, side-by-side route comparison, and a living leaderboard for transparent benchmarking in computer-aided synthesis planning (CASP).
The platform ingests standardized predictions from RetroCast, the unified evaluation framework introduced in our paper: "Procrustean Bed for AI-Driven Retrosynthesis: A Unified Framework for Reproducible Evaluation".
Live Demo: syntharena.ischemist.com
Evaluating retrosynthesis models is fragmented and unreliable:
- The Babel of Formats: AiZynthFinder outputs bipartite graphs; Retro* outputs precursor maps; DirectMultiStep outputs recursive dictionaries. Comparing them requires bespoke parsers for every model.
- Inconsistent Stocks: Starting material definitions vary by over 1000×, making reported solvability scores incomparable across publications.
- Solvability ≠ Validity: Routes marked as "solved" are validated only by endpoint availability, with no guarantee that intermediate transformations are chemically feasible.
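To make the last two points concrete, here is a minimal sketch of an endpoint-only "solved" check. The `RouteNode` type, the function names, and the example molecules are assumptions chosen for illustration; they are not the RetroCast schema or API. The same route flips between solved and unsolved depending on the stock, and no reaction step is ever inspected.

```typescript
// Hypothetical route shape for illustration only; not the RetroCast schema.
type RouteNode = {
  smiles: string;
  children: RouteNode[]; // precursors; empty for a leaf (starting material)
};

// Collect the SMILES of every leaf (starting material) in a route tree.
function leafSmiles(node: RouteNode, acc: string[] = []): string[] {
  if (node.children.length === 0) {
    acc.push(node.smiles);
  } else {
    for (const child of node.children) leafSmiles(child, acc);
  }
  return acc;
}

// Endpoint-only check: the route is "solved" when every leaf is in the stock.
// The transformations between nodes are never examined.
function isSolved(route: RouteNode, stock: Set<string>): boolean {
  return leafSmiles(route).every((s) => stock.has(s));
}

// Example: acetanilide from aniline and acetyl chloride.
const route: RouteNode = {
  smiles: "CC(=O)Nc1ccccc1",
  children: [
    { smiles: "Nc1ccccc1", children: [] },
    { smiles: "CC(=O)Cl", children: [] },
  ],
};

const smallStock = new Set<string>(["Nc1ccccc1"]);
const largeStock = new Set<string>(["Nc1ccccc1", "CC(=O)Cl"]);

console.log("small stock:", isSolved(route, smallStock));
console.log("large stock:", isSolved(route, largeStock));
```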
RetroCast + SynthArena provides the missing infrastructure:
- RetroCast: A universal translation layer with adapters for 10+ models, casting all outputs into a canonical schema with cryptographic manifests for reproducibility.
- Curated Benchmarks: Stratified evaluation sets fixing PaRoutes' distribution skew. The mkt-series uses commercial stocks for practical utility; the ref-series uses standardized stocks for fair algorithmic comparison.
- SynthArena: This platform provides side-by-side route comparison with diff overlays, bootstrapped confidence intervals, and a living leaderboard.
- Interactive Route Visualization: Explore predicted synthetic routes with molecule structures rendered using SMILES
- Side-by-Side Comparison: Compare predictions from any two models or inspect predicted vs. ground-truth routes with diff overlays
- Living Leaderboard: Browse stratified performance metrics (Stock-Termination Rate, Top-K Accuracy) with bootstrapped 95% confidence intervals
- Commercial Availability Tracking: See which leaf nodes are in the ASKCOS Buyables stock (300k commercially available compounds)
- Fully Reproducible: All data standardized via RetroCast with cryptographic manifests
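For readers unfamiliar with bootstrapped confidence intervals, the following is a minimal sketch of a percentile bootstrap over per-target hits. It is an illustration under assumptions, not the RetroCast implementation: the function names, the resample count, the fixed seed, and the example hit counts are all hypothetical, and the source does not state which bootstrap variant is used.

```typescript
// Small seeded PRNG (mulberry32) so the resampling is repeatable.
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Percentile bootstrap: resample targets with replacement, recompute the
// hit rate each time, and read the interval off the sorted resampled rates.
function bootstrapCI(
  hits: boolean[], // one entry per target: did the model score a hit?
  nResamples: number = 2000,
  alpha: number = 0.05,
  seed: number = 42,
): { mean: number; lower: number; upper: number } {
  const n = hits.length;
  if (n === 0) throw new Error("need at least one target");
  const rand = mulberry32(seed);

  let total = 0;
  for (const h of hits) total += h ? 1 : 0;
  const mean = total / n;

  const rates: number[] = [];
  for (let r = 0; r < nResamples; r++) {
    let sum = 0;
    for (let i = 0; i < n; i++) {
      sum += hits[Math.floor(rand() * n)] ? 1 : 0;
    }
    rates.push(sum / n);
  }
  rates.sort((x, y) => x - y);

  const lowerIdx = Math.floor((alpha / 2) * nResamples);
  const upperIdx = Math.min(nResamples - 1, Math.floor((1 - alpha / 2) * nResamples));
  return { mean, lower: rates[lowerIdx], upper: rates[upperIdx] };
}

// Hypothetical example: 120 hits out of 160 targets.
const hits: boolean[] = [];
for (let i = 0; i < 160; i++) hits.push(i < 120);

const ci = bootstrapCI(hits);
console.log(`rate=${ci.mean.toFixed(3)} 95% CI=[${ci.lower.toFixed(3)}, ${ci.upper.toFixed(3)}]`);
```

Resampling at the level of targets, rather than individual routes, treats each target as one independent observation; this is a design assumption of the sketch.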
Get the latest database dump and launch the platform:
# Download the latest database
curl -fsSL https://files.ischemist.com/syntharena/get-db.sh | bash -s
# Launch with Docker Compose
docker compose up --build -d
The platform will be available at http://localhost:1000
To use a different port, edit the ports section in docker-compose.yml (e.g., change 1000:3000 to 3000:3000).
Requirements:
- Node.js 22+
- pnpm
# Install dependencies
pnpm install
# Set up environment variables
cp .env.example .env
# Edit .env to configure your database path
# Get the database dump (see Docker option above)
# Place it in production_data/syntharena.db
# OR generate your own using the scripts below
# Run database migrations
pnpm prisma migrate deploy
# Start the development server
pnpm dev
The platform will be available at http://localhost:3000
The .env file requires only one variable:
DATABASE_URL="file:./production_data/syntharena.db"
For local development, you can use a different path:
DATABASE_URL="file:./prisma/dev.db"
If you want to evaluate your own models or create a custom benchmark, you can generate a database from RetroCast outputs.
- Install RetroCast:
uv tool install retrocast
- Run the RetroCast pipeline to generate predictions (see RetroCast docs)
- Ensure you have the following outputs:
  - Benchmark definitions: data/1-benchmarks/definitions/*.json.gz
  - Stock definitions: data/1-benchmarks/stocks/*.txt
  - Processed routes: data/3-processed/<benchmark>/<model>/routes.json.gz
  - Evaluations: data/4-scored/<benchmark>/<model>/<stock>/evaluation.json.gz
  - Statistics: data/5-results/<benchmark>/<model>/<stock>/statistics.json.gz
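Before loading, it can help to confirm that the per-run outputs listed above are present. The sketch below builds the three per-run paths from the patterns in the list and reports which are missing. The helper names are hypothetical, and using `buyables-stock` as the `<stock>` directory name is an assumption based on the `--stock-path` value in the loading commands. The existence check is passed in as a function so the sketch has no dependencies.

```typescript
type RunKey = { benchmark: string; model: string; stock: string };

type RunArtifacts = { routes: string; evaluation: string; statistics: string };

// Build the per-run paths following the directory patterns listed above.
function expectedArtifacts(dataDir: string, run: RunKey): RunArtifacts {
  const root = dataDir.replace(/\/+$/, ""); // drop any trailing slash
  const { benchmark, model, stock } = run;
  return {
    routes: `${root}/3-processed/${benchmark}/${model}/routes.json.gz`,
    evaluation: `${root}/4-scored/${benchmark}/${model}/${stock}/evaluation.json.gz`,
    statistics: `${root}/5-results/${benchmark}/${model}/${stock}/statistics.json.gz`,
  };
}

// Return the names of artifacts for which `exists` reports false.
function missingArtifacts(
  artifacts: RunArtifacts,
  exists: (path: string) => boolean,
): string[] {
  const missing: string[] = [];
  if (!exists(artifacts.routes)) missing.push("routes");
  if (!exists(artifacts.evaluation)) missing.push("evaluation");
  if (!exists(artifacts.statistics)) missing.push("statistics");
  return missing;
}

const artifacts = expectedArtifacts("/path/to/retrocast/data/", {
  benchmark: "mkt-cnv-160",
  model: "dms-explorer-xl",
  stock: "buyables-stock", // assumed directory name
});

// Stand-in for a real filesystem check: pretend only routes were produced.
const present = new Set<string>([artifacts.routes]);
const missing = missingArtifacts(artifacts, (p) => present.has(p));
console.log("missing:", missing.join(", "));
```

In a Node.js script, `existsSync` from `node:fs` can be passed as the `exists` argument.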
The loading process follows this sequence:
pnpm tsx scripts/load-stock.ts \
/path/to/retrocast/data/1-benchmarks/stocks/buyables-stock.txt \
"ASKCOS Buyables Stock" \
"Compounds available from eMolecules, Sigma-Aldrich, LabNetwork, Mcule, and ChemBridge"
pnpm tsx scripts/load-benchmark.ts \
/path/to/retrocast/data/1-benchmarks/definitions/mkt-cnv-160.json.gz \
"mkt-cnv-160" \
"160 targets with convergent routes, all leaves in buyables" \
--stock "ASKCOS Buyables Stock"
# Load routes only
pnpm tsx scripts/load-predictions.ts \
mkt-cnv-160 \
dms-explorer-xl \
--algorithm "DirectMultiStep" \
--routes-only
# Load routes + evaluations + statistics
pnpm tsx scripts/load-predictions.ts \
mkt-cnv-160 \
dms-explorer-xl \
--algorithm "DirectMultiStep" \
--stock-path "buyables-stock" \
--stock-db "ASKCOS Buyables Stock" \
--data-dir /path/to/retrocast/data
For batch loading multiple models, see scripts/batch-load-predictions.sh.
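As an illustration of what batch loading amounts to, the sketch below generates one `load-predictions.ts` command per model, using only the flags shown in the full-load example above. It is not the content of scripts/batch-load-predictions.sh; the helper names and the second model entry are hypothetical placeholders.

```typescript
type ModelRun = { model: string; algorithm: string };

type LoadOptions = { stockPath: string; stockDb: string; dataDir: string };

// Quote an argument for a POSIX shell only when it contains unsafe characters.
function shellQuote(arg: string): string {
  if (/^[A-Za-z0-9_\/.:=-]+$/.test(arg)) return arg;
  return "'" + arg.replace(/'/g, "'\\''") + "'";
}

// Build the full "routes + evaluations + statistics" command for one model.
function loadCommand(benchmark: string, run: ModelRun, opts: LoadOptions): string {
  const args: string[] = [
    "pnpm", "tsx", "scripts/load-predictions.ts",
    benchmark,
    run.model,
    "--algorithm", run.algorithm,
    "--stock-path", opts.stockPath,
    "--stock-db", opts.stockDb,
    "--data-dir", opts.dataDir,
  ];
  return args.map(shellQuote).join(" ");
}

const runs: ModelRun[] = [
  { model: "dms-explorer-xl", algorithm: "DirectMultiStep" },
  // Hypothetical second entry, shown only to illustrate batching.
  { model: "hypothetical-model", algorithm: "HypotheticalAlgorithm" },
];

const options: LoadOptions = {
  stockPath: "buyables-stock",
  stockDb: "ASKCOS Buyables Stock",
  dataDir: "/path/to/retrocast/data",
};

const commands = runs.map((run) => loadCommand("mkt-cnv-160", run, options));
for (const command of commands) console.log(command);
```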
- Framework: Next.js 16 (App Router, React Server Components)
- Database: SQLite (via Prisma ORM)
- UI: Tailwind CSS, shadcn/ui, Radix UI
- Visualization: @xyflow/react (route graphs), Recharts (performance charts), smiles-drawer (molecular structures)
- Type Safety: TypeScript with strict mode, Zod schemas
src/
├── app/ # Next.js App Router pages
│ ├── benchmarks/ # Benchmark listing and details
│ ├── leaderboard/ # Performance comparison
│ ├── runs/ # Model prediction browser
│ └── stocks/ # Stock catalog
├── components/
│ ├── route-visualization/ # Route graph rendering
│ ├── metrics/ # Performance metrics displays
│ └── ui/ # Reusable UI components
├── lib/
│ ├── services/ # Business logic (framework-agnostic)
│ ├── validation/ # Zod schemas
│ └── route-visualization/ # Graph layout algorithms
└── types/ # TypeScript type definitions
prisma/
├── schema.prisma # Database schema
└── migrations/ # Database migrations
scripts/
├── load-benchmark.ts # Load benchmark definitions
├── load-predictions.ts # Load model predictions
├── load-stock.ts # Load stock catalogs
└── batch-load-predictions.sh # Batch loading utility
SynthArena displays data processed through the RetroCast pipeline:
- Raw Predictions: Model outputs in native formats (JSON, YAML, etc.)
- RetroCast Standardization: retrocast adapt translates to canonical schema
- Evaluation: retrocast score calculates metrics against benchmarks
- Database Load: Standardized routes and scores are loaded into SQLite via scripts
- SynthArena: Interactive visualization and exploration
For details on generating predictions and scores, see the RetroCast documentation.
# Development server
pnpm dev
# Type checking
pnpm check-types
# Linting
pnpm lint
# Build for production
pnpm build
# Start production server
pnpm start
# Database operations
pnpm prisma generate # Generate Prisma client
pnpm prisma migrate dev # Create new migration
pnpm prisma migrate deploy # Apply migrations
pnpm prisma studio # Open database GUI
If you use SynthArena in your research, please cite our paper:
@misc{retrocast,
title = {Procrustean Bed for AI-Driven Retrosynthesis: A Unified Framework for Reproducible Evaluation},
author = {Anton Morgunov and Victor S. Batista},
year = {2025},
eprint = {2512.07079},
archiveprefix = {arXiv},
primaryclass = {cs.LG},
url = {https://arxiv.org/abs/2512.07079}
}
We welcome contributions! This project follows the isChemist Protocol for reproducible computational chemistry research.
MIT License. See LICENSE for details.
- RetroCast Framework: github.com/ischemist/project-procrustes
- Paper: arxiv.org/abs/2512.07079
- Publication Data: files.ischemist.com/retrocast/publication-data
- Live Platform: syntharena.ischemist.com
For issues or feature requests, please open an issue on GitHub.
For general questions about RetroCast or SynthArena, contact: anton@ischemist.com