
SynthLabs Reasoning Generator


Create high-quality synthetic reasoning datasets for training AI models



Features

Generator Mode

Create synthetic datasets from scratch using AI-powered generation. Define topics, customize prompts, and generate high-quality reasoning traces in the SYNTH format.

Core idea: "SYNTH: The New Data Frontier" by PleIAs


Converter Mode

Transform existing datasets into reasoning-enhanced formats. Full HuggingFace integration lets you search, preview, and convert public datasets with automatic reasoning trace generation.


DEEP Mode

Multiple AI agents working together in sophisticated pipelines:

  • Meta Agent: Analyzes and plans approach
  • Retrieval Agent: Gathers relevant information
  • Derivation Agent: Builds logical chains
  • Writer Agent: Composes the response
  • Rewriter Agent: Polishes and refines
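Conceptually, the five agents form a sequential pipeline in which each stage consumes the accumulated context and appends its own output. The sketch below is illustrative only: the `Agent` type and `runDeep` helper are our names for exposition, not the app's actual API, and each stage stands in for what is really an LLM call.

```typescript
// Illustrative sketch of a sequential multi-agent pipeline.
// Each "agent" is modeled as a pure function over the accumulated
// context; in the real app every stage calls an LLM provider.

type Agent = (query: string, context: string[]) => string;

const pipeline: Record<string, Agent> = {
  meta:       (q) => `plan for: ${q}`,                                        // analyze & plan
  retrieval:  (q) => `facts relevant to: ${q}`,                               // gather info
  derivation: (_, ctx) => `chain built from: ${ctx[ctx.length - 1]}`,         // logical chains
  writer:     (_, ctx) => `draft from: ${ctx[ctx.length - 1]}`,               // compose
  rewriter:   (_, ctx) => `polished: ${ctx[ctx.length - 1]}`,                 // refine
};

function runDeep(query: string): string {
  const context: string[] = [];
  for (const agent of Object.values(pipeline)) {
    context.push(agent(query, context)); // each stage sees all prior outputs
  }
  return context[context.length - 1];    // the rewriter's output is final
}
```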

Multi-turn Support

Go beyond single Q&A pairs:

  • Generate multi-turn conversations
  • Let the model ask follow-up questions
  • Choose responders using SYNTH-style thinking
  • Perfect for dialogue and instruction-following datasets

Data Preview

Have data but unsure what's inside? Explore it directly with our HuggingFace-style table viewer:

  • Column type detection (string, number, array, object)
  • Search and filter capabilities
  • Fullscreen expansion with pagination
  • Click any row to see full details

Verifier View

Quality control your generated data:

  • Review and evaluate entries
  • Remove duplicates automatically
  • Assign ratings (1-5 stars)
  • Export only verified, high-quality data

Cloud Integration

Seamless Firebase/Firestore support:

  • Development Mode: Download data directly as JSONL files
  • Production Mode: Upload to your Firestore database with one click
  • Session management and persistence
  • Real-time sync across devices

Additional Features

  • Multiple Providers - Support for Gemini, OpenAI, Anthropic, and custom endpoints
  • Concurrent Workers - Parallel processing for faster generation
  • Smart Retry - Automatic retry with exponential backoff
  • Session Management - Save, load, and manage multiple generation sessions
  • Export Formats - JSONL, JSON, and Parquet support
  • HuggingFace Upload - Push directly to HuggingFace Hub
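The "Smart Retry" behavior follows the usual exponential-backoff pattern; a minimal sketch (the retry count and delays here are illustrative defaults, not the app's actual configuration):

```typescript
// Retry an async operation, doubling the wait after each failure.
async function withRetry<T>(
  fn: () => Promise<T>,
  retries = 3,         // illustrative default
  baseDelayMs = 500,   // illustrative default
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt >= retries) throw err;        // out of attempts: give up
      const delay = baseDelayMs * 2 ** attempt; // 500ms, 1s, 2s, ...
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
}
```

This keeps transient provider errors (rate limits, timeouts) from killing a long generation run while avoiding a tight retry loop.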

Quick Start

Prerequisites

  • Node.js 18+ OR Bun 1.0+
  • API keys for your preferred provider(s)

Installation

  1. Clone and install dependencies:

    git clone <repository-url>
    cd synthlabs-reasoning-generator
    
    # Using npm
    npm install
    
    # OR using Bun (faster)
    bun install
  2. Configure API keys:

    Copy .env.example to .env.local and add your keys:

    cp .env.example .env.local

    Edit .env.local with your API keys:

    VITE_GEMINI_API_KEY=your-gemini-key
    VITE_OPENAI_API_KEY=your-openai-key
    VITE_ANTHROPIC_API_KEY=your-anthropic-key
    # Add other provider keys as needed
  3. Run the app:

    # Using npm
    npm run dev
    
    # Frontend only (custom port)
    npm run dev:client -- --port 3000
    
    # OR using Bun (standalone)
    bun run bun:dev
  4. Open in browser: Navigate to http://localhost:3000

Step-by-Step Walkthrough

Follow this guide to get started with SynthLabs Reasoning Generator.

1. Dashboard Overview

The main dashboard gives you quick access to all your generation sessions and configuration settings.

📸 Screenshot: Dashboard

2. Configuring Providers

Navigate to Settings > API Keys to configure your AI providers. For local inference or custom endpoints (e.g. vLLM, Aphrodite):

  1. Select Other (Custom).
  2. Set your Base URL (e.g., http://localhost:8001/v1).
  3. Enter your Model ID (e.g., Qwen/Qwen3-14B).
📸 Screenshot: Settings

3. Creating a Session

In the Generator (or Engine) tab, you can define your dataset parameters, customize system prompts, and configure output fields.

📸 Screenshot: Session Creation

4. Production Mode

Navigate to Settings > DB Provider to configure your database providers. Switch to Prod (Cloud) mode to access and manage your cloud-persisted sessions. This allows you to collaborate and sync data across devices.

📸 Screenshot: Production Session

5. Reviewing Generated Data

Use the Verifier (or Review) interface to inspect generated samples, check reasoning traces within <think> tags, and validate data quality.

📸 Screenshot: Data View
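Programmatically, checking a reasoning trace amounts to pulling the content out of the `<think>` block in each record's `reasoning` field. A minimal sketch (the helper name is ours, not part of the app's API):

```typescript
// Extract the reasoning trace from between <think>...</think> tags.
// Returns null when no think block is present.
function extractThink(reasoning: string): string | null {
  const match = reasoning.match(/<think>([\s\S]*?)<\/think>/);
  return match ? match[1].trim() : null;
}
```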

6. AI Data Assistant

Interact with your data using the integrated AI assistant to analyze patterns, summarize findings, or ask questions about your dataset.

📸 Screenshot: AI Interface

Backend (optional)

This repo includes a minimal Node backend to handle Firebase Admin operations.

  1. Set backend env vars (example):

    VITE_BACKEND_URL=http://localhost:8787
    VITE_SESSION_LIST_PAGE_SIZE=50
    VITE_SESSION_LIST_TTL_MS=60000
    VITE_SESSION_MAX_TEXT_LEN=10000
    SESSION_LIST_TTL_MS=60000
    BACKEND_JSON_LIMIT_MB=10
    FIREBASE_PROJECT_ID=your-project-id
    FIREBASE_CLIENT_EMAIL=your-service-account-email
    FIREBASE_PRIVATE_KEY="-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----\n"
  2. Run (Vite + backend):

    npm run dev

The frontend will use the backend when VITE_BACKEND_URL is set.

Port conflicts & multiple instances

If 8787 is busy (e.g., multiple desktop windows), the backend will auto-increment to the next available port (default range: 8787-8797). The frontend will probe /health and attach to the first healthy backend in that range.

Optional envs:

# Backend port behavior
PORT=8787
PORT_RANGE=10

# Frontend discovery (falls back to VITE_BACKEND_URL if healthy)
VITE_BACKEND_PORT_START=8787
VITE_BACKEND_PORT_RANGE=10

You can also set these in .env.example and copy to .env.local.
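The discovery behavior described above can be sketched as a simple probe loop; this is an illustrative reconstruction, not the app's actual implementation (it assumes `/health` answers HTTP 200, and the 500 ms timeout is our choice):

```typescript
// Probe candidate backend ports until one answers /health with 200.
async function discoverBackend(
  start = 8787,
  range = 10,
): Promise<string | null> {
  for (let port = start; port < start + range; port++) {
    const url = `http://localhost:${port}`;
    try {
      const res = await fetch(`${url}/health`, {
        signal: AbortSignal.timeout(500), // don't hang on dead ports
      });
      if (res.ok) return url;             // first healthy backend wins
    } catch {
      // connection refused or timeout: try the next port
    }
  }
  return null; // no backend found in range
}
```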

Bun Commands

  • bun install - Install dependencies
  • bun run bun:dev - Start dev server with Bun runtime
  • bun run bun:build - Build for production
  • bun run bun:preview - Preview production build

Electron Desktop App

Build standalone desktop applications for Windows, macOS, and Linux using Electron.

Electron Commands

  • npm run electron:dev - Run in development mode (with hot reload)
  • npm run electron:build - Build for all platforms
  • npm run electron:build:win - Build Windows installer (NSIS + portable)
  • npm run electron:build:mac - Build macOS app (DMG + ZIP)

Building for Windows

On Windows or cross-platform:

npm run electron:build:win

Output files will be in the release/ directory:

  • SynthLabs Reasoning Generator Setup X.X.X.exe - NSIS installer
  • SynthLabs Reasoning Generator X.X.X.exe - Portable executable

Building for macOS

On macOS:

npm run electron:build:mac

Output files will be in the release/ directory:

  • SynthLabs Reasoning Generator-X.X.X.dmg - Disk image
  • SynthLabs Reasoning Generator-X.X.X-mac.zip - ZIP archive

Both builds support:

  • x64 (Intel) architecture
  • arm64 (Apple Silicon) architecture
  • Code signing with hardened runtime
  • Network permissions for API calls

Building for Linux

npm run electron:build

Output files:

  • SynthLabs Reasoning Generator-X.X.X.AppImage - Universal Linux app
  • synthlabs-reasoning-generator_X.X.X_amd64.deb - Debian/Ubuntu package

Requirements for Building

Windows:

  • Windows 10 or later
  • Node.js 18+
  • No additional dependencies required

macOS:

  • macOS 10.15 (Catalina) or later
  • Xcode Command Line Tools: xcode-select --install
  • Node.js 18+
  • For code signing: Apple Developer account (optional, for distribution)

Linux:

  • Any modern Linux distribution
  • Node.js 18+
  • Build tools: sudo apt-get install build-essential (Debian/Ubuntu)

Development Workflow

  1. Start development server:

    npm run electron:dev

    This runs Vite dev server and Electron concurrently with hot reload.

  2. Build for production:

    npm run electron:build
  3. Test the built app:

    • Run the installer/exe/dmg from release/ directory
    • All features work the same as the web version

Configuration

Electron settings are in electron/main.js:

  • Window size, icon, and appearance
  • Menu configuration
  • Security settings (context isolation enabled)
  • Platform-specific behavior

electron-builder configuration is in package.json under the build section:

  • Output directories
  • Platform-specific targets
  • Code signing and entitlements
  • Installer options

Custom Prompts

The generator supports dynamic prompt sets. You can create your own "persona" or logical framework by adding files to the prompts/ directory.

Create a New Prompt Set

  1. Create a new folder in prompts/ (e.g., prompts/my-set/).
  2. Inside your set folder, create subdirectories for each category:
    • generator/
    • converter/
    • verifier/
  3. Add .txt files for each role. The app will automatically discover your set and show it in the Settings > Prompts tab.

Directory Structure & Roles

prompts/
  └── <set_name>/
      ├── generator/
      │   ├── system.txt      (Main generator persona)
      │   ├── meta.txt        (Task decomposition)
      │   ├── retrieval.txt   (Constraint identification)
      │   ├── derivation.txt  (Logical reasoning chains)
      │   ├── responder.txt   (Final answer formulation)
      │   └── user_agent.txt  (Multi-turn interaction agent)
      ├── converter/
      │   ├── system.txt      (Main converter persona)
      │   ├── writer.txt      (Writing the final reasoning trace)
      │   └── rewriter.txt    (Polishing converted output)
      └── verifier/
          ├── query_rewrite.txt
          ├── reasoning_rewrite.txt
          ├── answer_rewrite.txt
          └── message_rewrite.txt

Tip

If a specific role file is missing in your custom set, the system will automatically fall back to the version in the default set.
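The fallback rule above can be pictured as a two-step path lookup. This is a sketch under our own assumptions (the app's actual loader may differ, and we assume the built-in set lives in a directory named `default`):

```typescript
import { existsSync } from "node:fs";
import { join } from "node:path";

// Resolve a prompt file for a given set/category/role, falling back to
// the default set when the custom set does not provide that role.
function resolvePromptPath(
  set: string,
  category: string,
  role: string,
): string {
  const custom = join("prompts", set, category, `${role}.txt`);
  if (existsSync(custom)) return custom;
  // Fall back to the built-in set (assumed here to be "prompts/default/").
  return join("prompts", "default", category, `${role}.txt`);
}
```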


Firebase Setup (Optional)

For cloud persistence and production mode, set up Firestore:

  1. Create a Firebase project at console.firebase.google.com

  2. Enable Firestore Database

  3. Add these Security Rules (Firestore Database → Rules):

rules_version = '2';
service cloud.firestore {
  match /databases/{database}/documents {
    match /synth_logs/{document=**} {
      allow read, write: if true; // too open for production; tighten as needed
    }
    match /synth_sessions/{document=**} {
      allow read, write: if true; // too open for production; tighten as needed
    }
  }
}

  4. Configure your Firebase credentials in the app's settings panel

Output Format

Generated data follows the SYNTH format:

{
  "query": "What is the capital of France?",
  "reasoning": "<think>The user is asking about geography...</think>",
  "answer": "The capital of France is Paris.",
  "messages": [...],
  "isMultiTurn": false,
  "metadata": {
    "provider": "gemini",
    "model": "gemini-2.0-flash",
    "timestamp": 1704067200000
  }
}
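For downstream tooling, the record above maps naturally onto a TypeScript interface. The field names come from the example; the precise types (notably the `messages` element shape, elided above) are our assumptions:

```typescript
// Shape of one SYNTH record, inferred from the example above.
interface SynthRecord {
  query: string;
  reasoning: string;    // reasoning trace wrapped in <think>...</think>
  answer: string;
  messages: unknown[];  // chat transcript; element shape not shown above
  isMultiTurn: boolean;
  metadata: {
    provider: string;   // e.g. "gemini"
    model: string;
    timestamp: number;  // Unix epoch milliseconds
  };
}

// JSONL export: one JSON object per line.
function toJsonl(records: SynthRecord[]): string {
  return records.map((r) => JSON.stringify(r)).join("\n");
}
```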

Contributing

Contributions are welcome! Please feel free to submit issues and pull requests.


License

This project is licensed under the Apache 2.0 License.


Citation

If you find this tool useful, please cite it as:

@misc{synthlabs,
    author = {Kurman, Mariusz},
    title = {SYNTHLabs Reasoning Generator},
    howpublished = {\url{https://github.com/mkurman/synthlabs}},
    year = {2026}
}

Thank you!


Built with ❤️ for the AI research community
