
Inspect Eval Convertor

Convert your custom LLM evaluation format into the canonical Inspect AI eval format.

Quick Start

  1. Create a new example directory:

    mkdir examples/my_format
  2. Add your input data to examples/my_format/input.* (e.g., .json, .csv, .xml, .yaml).
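
     For illustration only (this schema is a hypothetical example; your converter defines what the real input looks like), a JSON input might be:

    [
      {"prompt": "What is 2 + 2?", "response": "4", "score": 1.0},
      {"prompt": "Name a prime number.", "response": "7", "score": 1.0}
    ]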

  3. Use an AI coding assistant to create the converter. Here's a prompt you can use (swap in the actual path to your input file); a minimal sketch of the resulting task.py follows these Quick Start steps:

I need you to create a converter for my LLM evaluation data format to Inspect AI's eval format.

**Repository**: This is the inspect-eval-convertor repository

**Entry Point**: Read `docs/INDEX.md` first. It explains the task, repository structure, and points to all important documentation.

**Your Task**: Create a task script at `examples/my_format/task.py` that:
- Reads my input file at `examples/my_format/input.json`
- Creates an Inspect AI Task that recreates the eval log
- Uses the `task_main()` helper to run the task and write output to `examples/my_format/input.eval`

**Important Documentation**:
- `docs/INDEX.md` - Overview and entry point
- `docs/CONVERSION_GUIDE.md` - Step-by-step guide for creating converters
- `docs/INVESTIGATING_EVAL_FILES.md` - How to investigate and validate output
- `docs/TROUBLESHOOTING.md` - Common issues and solutions

**Reference Examples**: Study these existing converters:
- `examples/simple_chat/` - Basic prompt/response conversion
- `examples/multi_turn/` - Multi-turn conversations
- `examples/tool_calling/` - Tool interactions
- `examples/csv_format/` - CSV parsing example
- `examples/xml_format/` - XML parsing example
- `examples/yaml_format/` - YAML parsing example
- `examples/forking_trajectories/` - Branching conversation paths

**Key Points**:
- Use Inspect AI's `@task` decorator to create a Task function
- Use `task_main()` from `inspect_convertor.utils` to run the task
- Define a solver (often `replay_solve()` to replay pre-recorded messages)
- Define a scorer (e.g., `score_scorer()` to read scores from metadata)
- Every sample needs at least one message and one score
- Use `ModelEvent` objects in sample metadata to populate tools for branching
- Validate the output using `inspect-convert-validate examples/my_format/input.eval`
- Investigate the output using `inspect-convert-investigate examples/my_format/input.eval`

Start by reading `docs/INDEX.md` to understand the structure, then create the converter following the patterns in the examples.
  4. Run the task:

    inspect-convert examples/my_format/input.json

    Or manually:

    uv run python examples/my_format/task.py examples/my_format/input.json
    # Creates: examples/my_format/input.eval
  5. Validate the output:

    inspect-convert-validate examples/my_format/input.eval
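
For reference, here is a minimal sketch of the kind of task.py the assistant should produce. It assumes the input is a JSON list of prompt/response records like the hypothetical example in step 2; the inspect_convertor import paths and the exact signatures of task_main(), replay_solve(), and score_scorer() are assumptions based on the names used above, so treat docs/CONVERSION_GUIDE.md and the converters in examples/ as authoritative.

# examples/my_format/task.py -- illustrative sketch only
import json
from pathlib import Path

from inspect_ai import Task, task
from inspect_ai.dataset import Sample

# Import path below is an assumption: task_main(), replay_solve(), and
# score_scorer() are named in this README, but check docs/CONVERSION_GUIDE.md
# for their real locations and signatures.
from inspect_convertor.utils import task_main, replay_solve, score_scorer


@task
def my_format() -> Task:
    # Read the custom input format (assumed: a JSON list of records).
    records = json.loads(Path("examples/my_format/input.json").read_text())
    samples = [
        Sample(
            input=record["prompt"],                   # assumed field names
            target=record.get("response", ""),
            metadata={"score": record.get("score")},  # score_scorer() reads metadata
        )
        for record in records
    ]
    # Every sample needs at least one message and one score.
    return Task(dataset=samples, solver=replay_solve(), scorer=score_scorer())


if __name__ == "__main__":
    # Assumed calling convention: runs the task and writes input.eval
    # next to the input file.
    task_main(my_format)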

What's Included

  • Utilities: Safe conversion helpers, validation tools, CLI commands
  • Examples: 8 working converters with different input formats
  • Documentation: Complete guides in docs/ directory
  • Validation: Automatic validation and investigation tools

Test Vibe-Coding a Convertor

examples/playground/input.txt ships with a dedicated output-validation script so you can test the convertor-creation workflow end to end.

# 1. Create a converter for the playground example

# 2. Run the conversion
inspect-convert examples/playground/input.txt

# 3. Validate with specialized test
uv run python examples/playground/validate.py

# If successful, you'll see:
# ✓ Playground converter validation PASSED!

CLI Commands

# Convert input to eval format (finds task.py automatically)
inspect-convert examples/my_format/input.json
# Creates: examples/my_format/input.eval

# Validate an eval file
inspect-convert-validate path/to/file.eval

# Investigate an eval file structure
inspect-convert-investigate path/to/file.eval

# Test all examples
inspect-convert-test-all

Documentation

See docs/INDEX.md for the complete documentation index.

LLM Coding Assistant Support

This repository includes configuration files for various LLM coding assistants:

  • Cursor: .cursor/rules/*.mdc files with detailed patterns
  • Claude Code: CLAUDE.md file for Claude-specific instructions
  • GitHub Copilot: .github/copilot-instructions.md for repository-wide instructions
  • General: AGENTS.md for universal agent instructions

These files ensure consistent behavior across different AI coding assistants and help guide LLMs to follow the correct task.py approach instead of deprecated patterns.

Installation

This project uses uv for package management. Install it first: https://github.com/astral-sh/uv

# Install the package
uv pip install -e .

# For development with additional tooling
uv pip install -e ".[dev]"

License

MIT
