Missing reproduction data and evaluation results #30

@jhsansom

Hi all, thank you very much for your work on StableToolBench. Do you provide the evaluation JSON files anywhere? I am referring to the JSON files produced by running run_pass_rate.sh, which contain labels for which trajectories were determined to "pass" (you have an example in data_example/pass_rate_results/virtual_chatgpt_dfs/G1_instruction_virtual_chatgpt_dfs.json). I found the reproduction data zip file you provide, but I would also like to see the results of evaluating that data. Thank you!
