DSEAL: The annotation process employed for the benchmark creation #8

@Fardeen786-eng

Description

Hi, I appreciate the effort to develop a benchmark for evaluating ML agent systems in as many states as possible, but I am most curious about the annotation process used to create these benchmarks. I believe a tutorial on Developing New Benchmarks was one of the TODOs. I went through the paper (https://arxiv.org/pdf/2402.17168), and, correct me if I am wrong, the 31 Kaggle datasets and their available notebooks are used to produce problem sketches, which are then converted into individual problems (query, validator, etc.) that together form one problem set in the benchmark. Could I get more insight into this process, specifically how LLMs were used to generate these problems and how they were refined through human annotation?
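
To check whether I have the structure right, here is a rough sketch of how I currently picture one problem and one problem set. The names (`Problem`, `ProblemSet`, `query`, `validator`, `source_dataset`) are my own guesses for illustration, not the actual schema used in the benchmark:

```python
# Sketch of my current understanding only -- field names are assumptions,
# not the benchmark's real data model.
from dataclasses import dataclass, field
from typing import Any, Callable, List


@dataclass
class Problem:
    """One annotated problem derived from a Kaggle dataset/notebook sketch."""
    query: str                        # natural-language task given to the agent
    validator: Callable[[Any], bool]  # programmatic check of the agent's output
    source_dataset: str               # which of the 31 Kaggle datasets it came from


@dataclass
class ProblemSet:
    """A group of related problems forming one problem set in the benchmark."""
    sketch: str                       # the high-level problem sketch it was derived from
    problems: List[Problem] = field(default_factory=list)


# Toy example: a problem whose validator checks a reported accuracy threshold
toy = Problem(
    query="Train a classifier on this dataset and report test accuracy.",
    validator=lambda result: isinstance(result, float) and result >= 0.75,
    source_dataset="example-kaggle-dataset",
)
```

Is this roughly the shape of the pipeline (dataset + notebook → sketch → individual problems with validators), or am I missing intermediate annotation steps?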
