Releases: MadeAgents/ColorBench
ColorBench code release
ColorBench
ColorBench is a graph-structured benchmark designed to evaluate mobile GUI agents on complex, long-horizon tasks composed of multiple atomic operations. This project provides:
- A graph-based benchmark construction methodology to expand or reconstruct environments.
- A plug-and-play evaluation framework for safe, reproducible testing.