Hi all, thank you very much for your work on StableToolBench. Do you provide the evaluation JSON files anywhere? I am referring to the JSON files produced by running run_pass_rate.sh, which record which trajectories were determined to "pass" (you have an example in data_example/pass_rate_results/virtual_chatgpt_dfs/G1_instruction_virtual_chatgpt_dfs.json). I found the reproduction data zip file you provide, but I would also like to see the results of evaluating this data. Thank you!