Explorer API collects feedback from agent applications #295

pan-x-c · 2025-09-22T12:24:02Z

Description

This PR implements a prototype of Trinity-RFT online mode.

flowchart LR
    A[User] <--> B[Agent Runner]
    B <--> |chat| C[Explorer]
    B --> |feedback| C
    B --> |commit| C
    C --> |experience| D[Trainer]
    D --> |checkpoint| C
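
The edges in the flowchart above can be read as a small loop on the Agent Runner side. The following is a minimal in-memory sketch, not the Trinity-RFT implementation: `FakeExplorer`, `run_episode`, and their fields are hypothetical names, and the meaning of `commit` (handing rewarded experiences on toward the Trainer) is an assumption, since the description does not define it.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class FakeExplorer:
    """Hypothetical in-memory stand-in for the Explorer service."""

    # Responses that have been generated but not yet rewarded.
    pending: List[str] = field(default_factory=list)
    # (response, reward) pairs waiting for a commit.
    rewarded: List[Tuple[str, float]] = field(default_factory=list)
    # (response, reward) pairs handed over to the trainer side (assumed meaning of commit).
    sent_to_trainer: List[Tuple[str, float]] = field(default_factory=list)

    def chat(self, prompt: str) -> str:
        # A real explorer would call the model; this one just echoes.
        response = f"echo: {prompt}"
        self.pending.append(response)
        return response

    def feedback(self, reward: float) -> None:
        # Attach the same reward to every response produced since the last feedback.
        self.rewarded.extend((r, reward) for r in self.pending)
        self.pending.clear()

    def commit(self) -> int:
        # Move rewarded experiences to the trainer-facing list and report how many moved.
        moved = len(self.rewarded)
        self.sent_to_trainer.extend(self.rewarded)
        self.rewarded.clear()
        return moved


def run_episode(explorer: FakeExplorer, prompts: List[str], reward: float) -> int:
    """Chat for each prompt, then send one feedback and one commit."""
    for prompt in prompts:
        explorer.chat(prompt)
    explorer.feedback(reward)
    return explorer.commit()
```

The trainer-to-explorer checkpoint edge is left out of the sketch because it does not involve the Agent Runner.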

Limitations to be addressed in the future:

Only algorithms without group execution are supported (e.g., ppo, reinforceplusplus)
Low explorer service throughput
Streaming chat is not supported
Synchronous database interface

Checklist

Please check the following items before code is ready to be reviewed.

Code has passed all tests
Docstrings have been added/updated in Google Style
Documentation has been updated
Code is ready for review

gemini-code-assist · 2025-09-22T12:24:19Z

Summary of Changes

Hello @pan-x-c, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a mechanism for collecting feedback from agent applications interacting with the Explorer API. It establishes a session-based system where experiences generated during a session can be linked and later attributed with a reward, enabling more sophisticated reinforcement learning feedback loops. The changes span the API endpoints, the core service logic for managing experiences, and the client-side integration for agents.

Highlights

Feedback Collection API Endpoints: Introduced new API endpoints in trinity/explorer/api/api.py: /allocate for generating a unique session ID and /feedback for receiving reward feedback associated with a specific session ID. The existing chat_completions endpoint was updated to accept an optional session_id.
Session-based Experience Management: The ExplorerService in trinity/explorer/api/service.py was enhanced to manage experiences on a per-session basis. It now includes a session_level_experience_queue to temporarily store experiences linked to a session until feedback is received. A new allocate_session method generates session IDs, and record_feedback applies rewards to session-specific experiences before moving them to the general experience queue.
API Client Updates: The ExplorerClient in trinity/explorer/explorer_client.py was refactored to better handle API URLs and integrate with the new feedback mechanism. It now initializes a session upon creation and provides methods (feedback, feedback_async) to send reward feedback to the Explorer API, associating it with the client's session ID.
Concurrency and Type Hinting Improvements: Added an asyncio.Lock (queue_lock) to ExplorerService to serialize coroutine access to the experience queues. Note that an asyncio.Lock coordinates tasks on a single event loop and is not thread-safe. Several asynchronous methods in ExplorerService were updated with explicit -> None return type hints for clarity.
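
The session-based flow summarized in the highlights above can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the code in trinity/explorer/api/service.py: the names `session_level_experience_queue`, `allocate_session`, `record_feedback`, and `queue_lock` come from the summary, while `Experience`, `SessionFeedbackStore`, `record_experience`, the return values, and the use of plain lists as queues are hypothetical.

```python
import asyncio
import uuid
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Experience:
    """Hypothetical stand-in for one recorded chat turn."""

    response: str
    reward: Optional[float] = None


class SessionFeedbackStore:
    """Illustrative only; not the actual ExplorerService."""

    def __init__(self) -> None:
        # Experiences waiting for feedback, keyed by session ID.
        self.session_level_experience_queue: Dict[str, List[Experience]] = defaultdict(list)
        # Experiences that already carry a reward.
        self.experience_queue: List[Experience] = []
        # Serializes coroutine access on one event loop (not thread-safe).
        self.queue_lock = asyncio.Lock()

    async def allocate_session(self) -> str:
        # Any unique string works; a UUID is an assumption.
        return uuid.uuid4().hex

    async def record_experience(self, session_id: str, exp: Experience) -> None:
        async with self.queue_lock:
            self.session_level_experience_queue[session_id].append(exp)

    async def record_feedback(self, session_id: str, reward: float) -> int:
        # Apply the reward to every pending experience of the session,
        # then move them to the general queue. Returns how many were moved.
        async with self.queue_lock:
            pending = self.session_level_experience_queue.pop(session_id, [])
            for exp in pending:
                exp.reward = reward
            self.experience_queue.extend(pending)
        return len(pending)


async def demo() -> List[Experience]:
    store = SessionFeedbackStore()
    sid = await store.allocate_session()
    await store.record_experience(sid, Experience("first reply"))
    await store.record_experience(sid, Experience("second reply"))
    await store.record_feedback(sid, reward=1.0)
    return store.experience_queue
```

Running `asyncio.run(demo())` returns the two experiences, both carrying the reward. Feedback for an unknown session moves nothing in this sketch; whether the real service rejects such a request is not stated in the summary.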


gemini-code-assist

Code Review

This pull request introduces a mechanism for collecting feedback from agent applications through new API endpoints. It adds session management to associate experiences with feedback. The overall implementation is sound, but I've identified a critical bug in an API call and a high-severity issue related to the lack of input validation, which could lead to server errors. My review includes suggestions to fix these issues.
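
The review does not say which inputs were left unvalidated, so the following is only a guess at the kind of check it has in mind for a feedback request. The function name `validate_feedback` and the payload keys `session_id` and `reward` are assumptions chosen for illustration; the sketch uses only the standard library and is independent of whatever web framework the API uses.

```python
import math
from typing import Any, Dict, Tuple


def validate_feedback(payload: Dict[str, Any]) -> Tuple[str, float]:
    """Return (session_id, reward) or raise ValueError for malformed input."""
    session_id = payload.get("session_id")
    if not isinstance(session_id, str) or not session_id:
        raise ValueError("session_id must be a non-empty string")

    reward = payload.get("reward")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(reward, bool) or not isinstance(reward, (int, float)):
        raise ValueError("reward must be a number")
    if not math.isfinite(reward):
        raise ValueError("reward must be finite")

    return session_id, float(reward)
```

An endpoint could catch the `ValueError` and answer with a client error instead of letting a missing key surface as a server error.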

trinity/explorer/api/api.py

pan-x-c · 2025-09-22T12:33:56Z

/unittest-all

github-actions · 2025-09-22T13:22:32Z

Summary

Tests 📝	Passed ✅	Failed ❌	Skipped ⏭️	Other ❓	Flaky 🍂	Duration ⏱️
142	141	0	1	0	0	2.8s

Skipped

Tests	Status
tests/trainer/trainer_test.py::TestTrainerMultiModal::test_trainer	skipped ⏭️

Tests

Test Name	Status	Duration
tests/algorithm/advantage_fn_test.py::TestGroupedAdvantageFn::test_duplicate_grpo	✅	1ms
tests/algorithm/advantage_fn_test.py::TestGroupedAdvantageFn::test_grpo_advantage	✅	1ms
tests/algorithm/advantage_fn_test.py::TestGroupedAdvantageFn::test_grpo_correct_bias	✅	1ms
tests/algorithm/advantage_fn_test.py::TestGroupedAdvantageFn::test_grpo_reward_std	✅	1ms
tests/algorithm/advantage_fn_test.py::TestGroupedAdvantageFn::test_step_wise_grpo_advantage	✅	1ms
tests/algorithm/policy_loss_test.py::VerlPolicyLossTest::test_dpo_policy_loss	✅	1ms
tests/algorithm/policy_loss_test.py::VerlPolicyLossTest::test_gspo_policy_loss	✅	1ms
tests/algorithm/policy_loss_test.py::VerlPolicyLossTest::test_mix_policy_loss	✅	1ms
tests/algorithm/policy_loss_test.py::VerlPolicyLossTest::test_opmd_policy_loss	✅	1ms
tests/algorithm/policy_loss_test.py::VerlPolicyLossTest::test_ppo_policy_loss	✅	1ms
tests/algorithm/policy_loss_test.py::VerlPolicyLossTest::test_sft_policy_loss	✅	1ms
tests/buffer/experience_pipeline_test.py::TestExperiencePipeline::test_experience_pipeline	✅	11ms
tests/buffer/experience_storage_test.py::ExperienceStorageTest::test_sql_experience_buffer	✅	3ms
tests/buffer/experience_storage_test.py::ExperienceStorageTest::test_sql_storage_0_sft	✅	5ms
tests/buffer/experience_storage_test.py::ExperienceStorageTest::test_sql_storage_1_dpo	✅	5ms
tests/buffer/file_test.py::TestFileBuffer::test_file_reader	✅	1ms
tests/buffer/file_test.py::TestFileBuffer::test_file_writer	✅	2ms
tests/buffer/formatter_test.py::TestFormatter::test_dpo_messages_formatter	✅	1ms
tests/buffer/formatter_test.py::TestFormatter::test_dpo_plaintext_formatter	✅	1ms
tests/buffer/formatter_test.py::TestFormatter::test_sft_messages_formatter	✅	1ms
tests/buffer/formatter_test.py::TestFormatter::test_sft_plaintext_formatter	✅	1ms
tests/buffer/formatter_test.py::TestFormatter::test_task_formatter	✅	1ms
tests/buffer/queue_test.py::TestQueueBuffer::test_priority_queue_buffer_reuse	✅	7ms
tests/buffer/queue_test.py::TestQueueBuffer::test_priority_queue_capacity	✅	3ms
tests/buffer/queue_test.py::TestQueueBuffer::test_queue_buffer_0_queue	✅	4ms
tests/buffer/queue_test.py::TestQueueBuffer::test_queue_buffer_1_priority_queue	✅	4ms
tests/buffer/queue_test.py::TestQueueBuffer::test_queue_buffer_capacity	✅	4ms
tests/buffer/reward_shaping_mapper_test.py::TestRewardShapingMapper::test_basic_usage	✅	1ms
tests/buffer/sql_test.py::TestSQLBuffer::test_sql_buffer_read_write	✅	3ms
tests/buffer/task_storage_test.py::TaskStorageTest::test_read_task_0	✅	1ms
tests/buffer/task_storage_test.py::TaskStorageTest::test_read_task_1	✅	2ms
tests/buffer/task_storage_test.py::TaskStorageTest::test_read_task_2	✅	1ms
tests/buffer/task_storage_test.py::TaskStorageTest::test_read_task_3	✅	2ms
tests/buffer/task_storage_test.py::TaskStorageTest::test_read_task_4	✅	1ms
tests/buffer/task_storage_test.py::TaskStorageTest::test_read_task_5	✅	3ms
tests/cli/launcher_test.py::TestLauncherMain::test_main_run_command	✅	7ms
tests/cli/launcher_test.py::TestLauncherMain::test_main_run_in_dlc	✅	1ms
tests/cli/launcher_test.py::TestLauncherMain::test_main_studio_command	✅	1ms
tests/cli/launcher_test.py::TestLauncherMain::test_multi_stage_run	✅	1ms
tests/common/config_test.py::TestConfig::test_all_examples_are_valid	✅	12ms
tests/common/config_test.py::TestConfig::test_config_flatten	✅	1ms
tests/common/config_test.py::TestConfig::test_continue_from_checkpoint_is_valid	✅	1ms
tests/common/config_test.py::TestConfig::test_load_default_config	✅	4ms
tests/common/experience_test.py::TestEID::test_eid_properties	✅	1ms
tests/common/experience_test.py::TestExperience::test_action_mask_and_logprobs_type	✅	1ms
tests/common/experience_test.py::TestExperience::test_assertions	✅	1ms
tests/common/experience_test.py::TestExperience::test_dpo_experience	✅	1ms
tests/common/experience_test.py::TestExperience::test_gather	✅	1ms
tests/common/experience_test.py::TestExperience::test_hf_datasets_conversion	✅	1ms
tests/common/experience_test.py::TestExperience::test_multi_turn_experience	✅	1ms
tests/common/experience_test.py::TestExperience::test_serialize_deserialize	✅	1ms
tests/common/experience_test.py::TestExperience::test_single_turn_experience	✅	1ms
tests/common/experience_test.py::TestExperience::test_to_dict	✅	1ms
tests/common/experience_test.py::TestExperienceConversion::test_batch_conversion	✅	1ms
tests/common/experience_test.py::TestExperienceConversion::test_dpo_experience_batch_conversion	✅	1ms
tests/common/experience_test.py::TestExperienceConversion::test_experience_model_experience_conversion	✅	1ms
tests/common/experience_test.py::TestExperienceConversion::test_gather_experiences_with_custom_fields	✅	1ms
tests/common/experience_test.py::TestExperienceConversion::test_multiturn_experience_batch_converstion	✅	1ms
tests/common/vllm_test.py::ModelWrapperTest_0::test_generate	✅	55ms
tests/common/vllm_test.py::ModelWrapperTest_1::test_generate	✅	35ms
tests/common/vllm_test.py::ModelWrapperTest_2::test_generate	✅	45ms
tests/common/vllm_test.py::TestModelLen_0::test_model_len	✅	21ms
tests/common/vllm_test.py::TestModelLen_1::test_model_len	✅	20ms
tests/common/vllm_test.py::TestAPIServer::test_api	✅	24ms
tests/common/vllm_test.py::TestAsyncAPIServer::test_api_async	✅	24ms
tests/common/vllm_test.py::TestTokenizer::test_action_mask	✅	1ms
tests/common/vllm_test.py::TestTokenizer::test_action_mask_with_tools	✅	1ms
tests/common/vllm_test.py::TestAPIServerToolCall_0_deepseek_r1::test_api_tool_calls	✅	21ms
tests/common/vllm_test.py::TestAPIServerToolCall_1::test_api_tool_calls	✅	20ms
tests/explorer/explorer_test.py::TestExplorerCountdownEval::test_explorer	✅	59ms
tests/explorer/explorer_test.py::TestExplorerCountdownNoEval::test_explorer	✅	47ms
tests/explorer/explorer_test.py::TestExplorerGSM8k::test_explorer	✅	204ms
tests/explorer/explorer_test.py::ServeTest::test_serve	✅	66ms
tests/explorer/scheduler_test.py::SchedulerTest::test_async_workflow	✅	4ms
tests/explorer/scheduler_test.py::SchedulerTest::test_concurrent_operations	✅	3ms
tests/explorer/scheduler_test.py::SchedulerTest::test_get_results	✅	22ms
tests/explorer/scheduler_test.py::SchedulerTest::test_multi_step_execution	✅	4ms
tests/explorer/scheduler_test.py::SchedulerTest::test_non_repeatable_workflow	✅	4ms
tests/explorer/scheduler_test.py::SchedulerTest::test_scheduler_all_methods	✅	14ms
tests/explorer/scheduler_test.py::SchedulerTest::test_scheduler_restart_after_stop	✅	7ms
tests/explorer/scheduler_test.py::SchedulerTest::test_split_tasks	✅	7ms
tests/explorer/scheduler_test.py::SchedulerTest::test_stepwise_experience_eid	✅	5ms
tests/explorer/scheduler_test.py::SchedulerTest::test_wait_all	✅	7ms
tests/explorer/scheduler_test.py::SchedulerTest::test_wait_all_timeout_with_multi_batch	✅	13ms
tests/explorer/step_wise_workflow_test.py::WorkflowTest::test_reward_propagation_workflow	✅	1ms
tests/explorer/step_wise_workflow_test.py::WorkflowTest::test_step_wise_reward_workflow	✅	1ms
tests/explorer/step_wise_workflow_test.py::WorkflowTest::test_workflows_raise_error	✅	1ms
tests/explorer/step_wise_workflow_test.py::WorkflowTest::test_workflows_stop_at_max_env_steps	✅	1ms
tests/explorer/workflow_test.py::WorkflowTest::test_gsm8k_workflow	✅	1ms
tests/explorer/workflow_test.py::WorkflowTest::test_math_boxed_workflow	✅	1ms
tests/explorer/workflow_test.py::WorkflowTest::test_math_complex_workflow	✅	1ms
tests/explorer/workflow_test.py::WorkflowTest::test_math_eval_workflow	✅	1ms
tests/explorer/workflow_test.py::WorkflowTest::test_math_fraction_workflow	✅	1ms
tests/explorer/workflow_test.py::WorkflowTest::test_math_workflow	✅	1ms
tests/explorer/workflow_test.py::WorkflowTest::test_rm_gallery_workflow	✅	1ms
tests/explorer/workflow_test.py::WorkflowTest::test_workflow_repeatable	✅	1ms
tests/explorer/workflow_test.py::WorkflowTest::test_workflow_resettable	✅	1ms
tests/manager/synchronizer_test.py::TestSynchronizerExit::test_synchronizer	✅	29ms
tests/manager/synchronizer_test.py::TestStateDictBasedSynchronizer_0::test_synchronizer	✅	75ms
tests/manager/synchronizer_test.py::TestStateDictBasedSynchronizer_1::test_synchronizer	✅	72ms
tests/manager/synchronizer_test.py::TestStateDictBasedSynchronizer_2::test_synchronizer	✅	116ms
tests/manager/synchronizer_test.py::TestStateDictBasedSynchronizer_3::test_synchronizer	✅	132ms
tests/manager/synchronizer_test.py::TestNCCLBasedSynchronizer_0::test_synchronizer	✅	70ms
tests/manager/synchronizer_test.py::TestNCCLBasedSynchronizer_1::test_synchronizer	✅	70ms
tests/service/data_juicer_test.py::TestDataJuicer::test_config	✅	1ms
tests/service/data_juicer_test.py::TestDataJuicer::test_server_start	✅	22ms
tests/service/data_juicer_test.py::TestDataJuicerExperiencePipeline::test_data_juicer_operators	✅	21ms
tests/service/data_juicer_test.py::TestDataJuicerTaskPipeline::test_data_juicer_task_pipeline	✅	14ms
tests/trainer/trainer_test.py::TestTrainerCountdown_0_fsdp::test_trainer	✅	153ms
tests/trainer/trainer_test.py::TestTrainerCountdown_1_megatron::test_trainer	✅	317ms
tests/trainer/trainer_test.py::TestStepAheadAsyncRL::test_trainer	✅	58ms
tests/trainer/trainer_test.py::TestTrainerGSM8K_0_fsdp::test_trainer	✅	52ms
tests/trainer/trainer_test.py::TestTrainerGSM8K_1_fsdp2::test_trainer	✅	55ms
tests/trainer/trainer_test.py::TestTrainerGSM8K_2_fsdp::test_trainer	✅	57ms
tests/trainer/trainer_test.py::TestTrainerGSM8K_3_fsdp2::test_trainer	✅	61ms
tests/trainer/trainer_test.py::TestTrainerSFTWarmupGSM8K::test_trainer	✅	102ms
tests/trainer/trainer_test.py::TestTrainerDPO::test_trainer	✅	39ms
tests/trainer/trainer_test.py::TestTrainerSFT::test_trainer	✅	36ms
tests/trainer/trainer_test.py::TestTrainerToolsSFT::test_trainer_tools	✅	36ms
tests/trainer/trainer_test.py::TestFullyAsyncMode_0_fsdp::test_fully_async_mode	✅	81ms
tests/trainer/trainer_test.py::TestFullyAsyncMode_1_fsdp::test_fully_async_mode	✅	77ms
tests/trainer/trainer_test.py::TestFullyAsyncMode_2_megatron::test_fully_async_mode	✅	171ms
tests/trainer/trainer_test.py::TestTrainerMIX::test_trainer	✅	56ms
tests/trainer/trainer_test.py::TestTrainerMultiModal::test_trainer	⏭️	1ms
tests/utils/eval_utils_test.py::TestComputeScore::test_both_boxed_and_equivalent	✅	1ms
tests/utils/eval_utils_test.py::TestComputeScore::test_both_boxed_and_not_equivalent	✅	1ms
tests/utils/eval_utils_test.py::TestComputeScore::test_empty_ground_truth	✅	1ms
tests/utils/eval_utils_test.py::TestComputeScore::test_empty_solution_string	✅	1ms
tests/utils/eval_utils_test.py::TestComputeScore::test_multiple_boxed_answers_in_solution	✅	1ms
tests/utils/eval_utils_test.py::TestComputeScore::test_solution_boxed_truth_raw_and_equivalent	✅	1ms
tests/utils/eval_utils_test.py::TestComputeScore::test_solution_boxed_truth_raw_and_not_equivalent	✅	1ms
tests/utils/eval_utils_test.py::TestComputeScore::test_solution_not_boxed	✅	1ms
tests/utils/eval_utils_test.py::TestComputeScore::test_solution_raw_and_ground_truth_boxed_equivalent	✅	1ms
tests/utils/eval_utils_test.py::TestMathEvalUtils::test_extract_answer	✅	1ms
tests/utils/eval_utils_test.py::TestMathEvalUtils::test_verify_math_answer	✅	1ms
tests/utils/eval_utils_test.py::TestEvalUtils::test_is_equiv	✅	1ms
tests/utils/log_test.py::LogTest::test_actor_log	✅	2ms
tests/utils/log_test.py::LogTest::test_group_by_node	✅	2ms
tests/utils/log_test.py::LogTest::test_no_actor_log	✅	1ms
tests/utils/plugin_test.py::TestPluginLoader::test_load_plugins_local	✅	1ms
tests/utils/plugin_test.py::TestPluginLoader::test_load_plugins_remote	✅	6ms
tests/utils/plugin_test.py::TestPluginLoader::test_passing_custom_class	✅	3ms

Github Test Reporter by CTRF 💚

pan-x-c · 2025-10-15T10:05:10Z

/unittest-all

pan-x-c · 2025-10-21T07:05:02Z

/unittest-all

pan-x-c · 2025-12-11T06:20:56Z

/unittest-all

pan-x-c · 2025-12-16T08:32:51Z

/unittest-module-trainer

github-actions · 2025-12-16T09:18:08Z

Summary

Tests 📝	Passed ✅	Failed ❌	Skipped ⏭️	Other ❓	Flaky 🍂	Duration ⏱️
23	20	1	2	0	0	43m 6s

Failed Tests

Failed Tests ❌	Fail Message
❌ tests/trainer/trainer_test.py::TestServeWithTrainer::test_serve_with_trainer	The test failed in the call phase

Skipped

Tests	Status
tests/trainer/trainer_test.py::TestMultiModalGRPO::test_trainer	skipped ⏭️
tests/trainer/trainer_test.py::TestMultiModalSFT::test_trainer	skipped ⏭️

Tests

Test Name	Status	Duration
tests/trainer/trainer_test.py::TestTrainerCountdown_0_fsdp::test_trainer	✅	3m 29s
tests/trainer/trainer_test.py::TestTrainerCountdown_1_megatron::test_trainer	✅	4m 49s
tests/trainer/trainer_test.py::TestStepAheadAsyncRL::test_trainer	✅	1m 27s
tests/trainer/trainer_test.py::TestTrainerGSM8K_0_fsdp::test_trainer	✅	1m 20s
tests/trainer/trainer_test.py::TestTrainerGSM8K_1_fsdp2::test_trainer	✅	1m 25s
tests/trainer/trainer_test.py::TestTrainerGSM8K_2_fsdp::test_trainer	✅	1m 23s
tests/trainer/trainer_test.py::TestTrainerGSM8K_3_fsdp2::test_trainer	✅	1m 34s
tests/trainer/trainer_test.py::TestTrainerSFTWarmupGSM8K::test_trainer	✅	2m 30s
tests/trainer/trainer_test.py::TestTrainerDPO::test_trainer	✅	1m 2s
tests/trainer/trainer_test.py::TestTrainerSFT::test_trainer	✅	58.5s
tests/trainer/trainer_test.py::TestTrainerToolsSFT::test_trainer_tools	✅	57.8s
tests/trainer/trainer_test.py::TestFullyAsyncMode_0_fsdp::test_fully_async_mode	✅	1m 54s
tests/trainer/trainer_test.py::TestFullyAsyncMode_1_fsdp::test_fully_async_mode	✅	1m 55s
tests/trainer/trainer_test.py::TestFullyAsyncMode_2_megatron::test_fully_async_mode	✅	2m 39s
tests/trainer/trainer_test.py::TestTrainerCheckpointSave_0_fsdp::test_trainer	✅	2m 19s
tests/trainer/trainer_test.py::TestTrainerCheckpointSave_1_megatron::test_trainer	✅	4m 27s
tests/trainer/trainer_test.py::TestTrainerMIX::test_trainer	✅	2m 33s
tests/trainer/trainer_test.py::TestServeWithTrainer::test_serve_with_trainer	❌	47ms
tests/trainer/trainer_test.py::TestMultiModalGRPO::test_trainer	⏭️	809ms
tests/trainer/trainer_test.py::TestMultiModalSFT::test_trainer	⏭️	807ms
tests/trainer/trainer_test.py::TestTrainerLoRA::test_trainer	✅	3m 25s
tests/trainer/trainer_test.py::TestOverRollout::test_trainer	✅	1m 19s
tests/trainer/trainer_test.py::TestTrainerPromptTruncation::test_trainer	✅	1m 11s


pan-x-c · 2025-12-19T02:22:36Z

/unittest-module-trainer

github-actions · 2025-12-19T03:03:29Z

Summary

Tests 📝	Passed ✅	Failed ❌	Skipped ⏭️	Other ❓	Flaky 🍂	Duration ⏱️
23	21	0	2	0	0	38m 39s

Skipped

Tests	Status
tests/trainer/trainer_test.py::TestMultiModalGRPO::test_trainer	skipped ⏭️
tests/trainer/trainer_test.py::TestMultiModalSFT::test_trainer	skipped ⏭️

Tests

Test Name	Status	Duration
tests/trainer/trainer_test.py::TestTrainerCountdown_0_fsdp::test_trainer	✅	2m 50s
tests/trainer/trainer_test.py::TestTrainerCountdown_1_megatron::test_trainer	✅	4m 13s
tests/trainer/trainer_test.py::TestStepAheadAsyncRL::test_trainer	✅	1m 21s
tests/trainer/trainer_test.py::TestTrainerGSM8K_0_fsdp::test_trainer	✅	1m 2s
tests/trainer/trainer_test.py::TestTrainerGSM8K_1_fsdp2::test_trainer	✅	1m 11s
tests/trainer/trainer_test.py::TestTrainerGSM8K_2_fsdp::test_trainer	✅	1m 7s
tests/trainer/trainer_test.py::TestTrainerGSM8K_3_fsdp2::test_trainer	✅	1m 17s
tests/trainer/trainer_test.py::TestTrainerSFTWarmupGSM8K::test_trainer	✅	2m 8s
tests/trainer/trainer_test.py::TestTrainerDPO::test_trainer	✅	49.3s
tests/trainer/trainer_test.py::TestTrainerSFT::test_trainer	✅	45.3s
tests/trainer/trainer_test.py::TestTrainerToolsSFT::test_trainer_tools	✅	46.4s
tests/trainer/trainer_test.py::TestFullyAsyncMode_0_fsdp::test_fully_async_mode	✅	1m 34s
tests/trainer/trainer_test.py::TestFullyAsyncMode_1_fsdp::test_fully_async_mode	✅	1m 34s
tests/trainer/trainer_test.py::TestFullyAsyncMode_2_megatron::test_fully_async_mode	✅	2m 18s
tests/trainer/trainer_test.py::TestTrainerCheckpointSave_0_fsdp::test_trainer	✅	2m 6s
tests/trainer/trainer_test.py::TestTrainerCheckpointSave_1_megatron::test_trainer	✅	4m 9s
tests/trainer/trainer_test.py::TestTrainerMIX::test_trainer	✅	2m 5s
tests/trainer/trainer_test.py::TestServeWithTrainer::test_serve_with_trainer	✅	1m 42s
tests/trainer/trainer_test.py::TestMultiModalGRPO::test_trainer	⏭️	810ms
tests/trainer/trainer_test.py::TestMultiModalSFT::test_trainer	⏭️	809ms
tests/trainer/trainer_test.py::TestTrainerLoRA::test_trainer	✅	3m 19s
tests/trainer/trainer_test.py::TestOverRollout::test_trainer	✅	1m 5s
tests/trainer/trainer_test.py::TestTrainerPromptTruncation::test_trainer	✅	53.6s


pan-x-c · 2025-12-19T03:28:07Z

/unittest-module-explorer

github-actions · 2025-12-19T03:43:16Z

Summary

Tests 📝	Passed ✅	Failed ❌	Skipped ⏭️	Other ❓	Flaky 🍂	Duration ⏱️
47	47	0	0	0	0	13m 2s

Tests

Test Name	Status	Duration
tests/explorer/explorer_test.py::TestExplorerCountdownEval::test_explorer	✅	1m 23s
tests/explorer/explorer_test.py::TestExplorerGSM8KRULERNoEval::test_explorer	✅	1m 39s
tests/explorer/explorer_test.py::TestExplorerGSM8k::test_explorer	✅	3m 30s
tests/explorer/explorer_test.py::ServeTest::test_serve	✅	1m 6s
tests/explorer/proxy_test.py::RecorderTest::test_recorder	✅	58ms
tests/explorer/scheduler_test.py::SchedulerTest::test_async_workflow	✅	8.9s
tests/explorer/scheduler_test.py::SchedulerTest::test_concurrent_operations	✅	8.7s
tests/explorer/scheduler_test.py::SchedulerTest::test_dynamic_timeout	✅	16.8s
tests/explorer/scheduler_test.py::SchedulerTest::test_get_results	✅	24.0s
tests/explorer/scheduler_test.py::SchedulerTest::test_metric_calculation_with_non_repeatable_workflow_0	✅	8.8s
tests/explorer/scheduler_test.py::SchedulerTest::test_metric_calculation_with_non_repeatable_workflow_1	✅	9.0s
tests/explorer/scheduler_test.py::SchedulerTest::test_metric_calculation_with_repeatable_workflow_0	✅	8.8s
tests/explorer/scheduler_test.py::SchedulerTest::test_metric_calculation_with_repeatable_workflow_1	✅	8.8s
tests/explorer/scheduler_test.py::SchedulerTest::test_multi_step_execution	✅	9.0s
tests/explorer/scheduler_test.py::SchedulerTest::test_non_repeatable_workflow	✅	8.9s
tests/explorer/scheduler_test.py::SchedulerTest::test_over_rollout_min_wait	✅	12.7s
tests/explorer/scheduler_test.py::SchedulerTest::test_scheduler_all_methods	✅	18.6s
tests/explorer/scheduler_test.py::SchedulerTest::test_scheduler_restart_after_stop	✅	16.4s
tests/explorer/scheduler_test.py::SchedulerTest::test_split_tasks	✅	12.1s
tests/explorer/scheduler_test.py::SchedulerTest::test_stepwise_experience_eid	✅	28.9s
tests/explorer/scheduler_test.py::SchedulerTest::test_wait_all	✅	11.8s
tests/explorer/scheduler_test.py::SchedulerTest::test_wait_all_timeout_with_multi_batch	✅	17.5s
tests/explorer/scheduler_test.py::TestRunnerStateCollection::test_runner_state_collection	✅	13.9s
tests/explorer/step_wise_workflow_test.py::WorkflowTest::test_reward_propagation_workflow_0	✅	1ms
tests/explorer/step_wise_workflow_test.py::WorkflowTest::test_reward_propagation_workflow_1	✅	602ms
tests/explorer/step_wise_workflow_test.py::WorkflowTest::test_step_wise_reward_workflow_0	✅	1ms
tests/explorer/step_wise_workflow_test.py::WorkflowTest::test_step_wise_reward_workflow_1	✅	1.0s
tests/explorer/step_wise_workflow_test.py::WorkflowTest::test_workflows_raise_error	✅	1ms
tests/explorer/step_wise_workflow_test.py::WorkflowTest::test_workflows_stop_at_max_env_steps	✅	1.0s
tests/explorer/workflow_test.py::WorkflowTest::test_gsm8k_workflow	✅	36ms
tests/explorer/workflow_test.py::WorkflowTest::test_math_boxed_workflow	✅	24ms
tests/explorer/workflow_test.py::WorkflowTest::test_math_complex_workflow	✅	483ms
tests/explorer/workflow_test.py::WorkflowTest::test_math_eval_workflow	✅	4ms
tests/explorer/workflow_test.py::WorkflowTest::test_math_fraction_workflow	✅	13ms
tests/explorer/workflow_test.py::WorkflowTest::test_math_workflow	✅	8ms
tests/explorer/workflow_test.py::WorkflowTest::test_rm_gallery_workflow	✅	129ms
tests/explorer/workflow_test.py::WorkflowTest::test_workflow_repeatable_0	✅	1ms
tests/explorer/workflow_test.py::WorkflowTest::test_workflow_repeatable_1	✅	101ms
tests/explorer/workflow_test.py::WorkflowTest::test_workflow_resettable_0	✅	1ms
tests/explorer/workflow_test.py::WorkflowTest::test_workflow_resettable_1	✅	201ms
tests/explorer/workflow_test.py::MultiTurnWorkflowTest_0::test_multi_turn_workflow	✅	14.3s
tests/explorer/workflow_test.py::MultiTurnWorkflowTest_1::test_multi_turn_workflow	✅	14.4s
tests/explorer/workflow_test.py::TestWorkflowStateRecording::test_workflow_state_recording	✅	4.0s
tests/explorer/workflow_test.py::TestAgentScopeWorkflowAdapter::test_adapter	✅	425ms
tests/explorer/workflow_test.py::TestWorkflowRunner::test_workflow_runner	✅	294ms
tests/explorer/workflow_test.py::TestWorkflowRunner::test_workflow_runner_get_state	✅	8.1s
tests/explorer/workflow_test.py::TestWorkflowRunner::test_workflow_with_openai	✅	17.4s


pan-x-c · 2025-12-19T03:44:42Z

/unittest-module-buffer

github-actions · 2025-12-19T03:49:03Z

Summary

Tests 📝	Passed ✅	Failed ❌	Skipped ⏭️	Other ❓	Flaky 🍂	Duration ⏱️
48	45	3	0	0	0	2m 7s

Failed Tests

Failed Tests ❌	Fail Message
❌ tests/buffer/experience_storage_test.py::ExperienceStorageTest::test_sql_experience_buffer	The test failed in the call phase due to an exception
❌ tests/buffer/experience_storage_test.py::ExperienceStorageTest::test_sql_storage_0_sft	The test failed in the call phase due to an assertion error
❌ tests/buffer/experience_storage_test.py::ExperienceStorageTest::test_sql_storage_1_dpo	The test failed in the call phase due to an assertion error

Tests

Test Name	Status	Duration
tests/buffer/experience_pipeline_test.py::TestExperiencePipeline::test_experience_pipeline	✅	16.0s
tests/buffer/experience_pipeline_test.py::TestExperiencePipeline::test_pass_rate_calculation	✅	7.3s
tests/buffer/experience_storage_test.py::ExperienceStorageTest::test_sql_experience_buffer	❌	6.7s
tests/buffer/experience_storage_test.py::ExperienceStorageTest::test_sql_storage_0_sft	❌	4ms
tests/buffer/experience_storage_test.py::ExperienceStorageTest::test_sql_storage_1_dpo	❌	561ms
tests/buffer/file_test.py::TestFileBuffer::test_file_reader	✅	757ms
tests/buffer/file_test.py::TestFileBuffer::test_file_writer	✅	1.9s
tests/buffer/formatter_test.py::TestFormatter::test_dpo_messages_formatter	✅	533ms
tests/buffer/formatter_test.py::TestFormatter::test_dpo_plaintext_formatter	✅	462ms
tests/buffer/formatter_test.py::TestFormatter::test_multi_modal_sft_formatter	✅	1.3s
tests/buffer/formatter_test.py::TestFormatter::test_sft_messages_formatter	✅	1.0s
tests/buffer/formatter_test.py::TestFormatter::test_sft_plaintext_formatter	✅	735ms
tests/buffer/formatter_test.py::TestFormatter::test_task_formatter	✅	231ms
tests/buffer/queue_test.py::TestQueueBuffer::test_priority_queue_buffer_reuse	✅	6.6s
tests/buffer/queue_test.py::TestQueueBuffer::test_priority_queue_capacity	✅	2.6s
tests/buffer/queue_test.py::TestQueueBuffer::test_priority_queue_reuse_count_control	✅	4.4s
tests/buffer/queue_test.py::TestQueueBuffer::test_queue_buffer_0_queue	✅	3.4s
tests/buffer/queue_test.py::TestQueueBuffer::test_queue_buffer_1_priority_queue	✅	3.6s
tests/buffer/queue_test.py::TestQueueBuffer::test_queue_buffer_capacity	✅	3.9s
tests/buffer/reader_test.py::TestBufferReader::test_buffer_reader_registration	✅	613ms
tests/buffer/reward_shaping_mapper_test.py::TestRewardShapingMapper::test_basic_usage	✅	7ms
tests/buffer/sample_strategy_test.py::ExperienceStorageTest_0::test_default_queue_default_sample_strategy	✅	2.1s
tests/buffer/sample_strategy_test.py::ExperienceStorageTest_0::test_default_queue_staleness_control_sample_strategy	✅	2.1s
tests/buffer/sample_strategy_test.py::ExperienceStorageTest_0::test_priority_queue_default_sample_strategy	✅	2.1s
tests/buffer/sample_strategy_test.py::ExperienceStorageTest_0::test_priority_queue_staleness_control_sample_strategy	✅	2.1s
tests/buffer/sample_strategy_test.py::ExperienceStorageTest_0::test_sql_staleness_control_sample_strategy	✅	4.9s
tests/buffer/sample_strategy_test.py::ExperienceStorageTest_1::test_default_queue_default_sample_strategy	✅	1.9s
tests/buffer/sample_strategy_test.py::ExperienceStorageTest_1::test_default_queue_staleness_control_sample_strategy	✅	1.9s
tests/buffer/sample_strategy_test.py::ExperienceStorageTest_1::test_priority_queue_default_sample_strategy	✅	2.4s
tests/buffer/sample_strategy_test.py::ExperienceStorageTest_1::test_priority_queue_staleness_control_sample_strategy	✅	1.9s
tests/buffer/sample_strategy_test.py::ExperienceStorageTest_1::test_sql_staleness_control_sample_strategy	✅	4.2s
tests/buffer/sql_test.py::TestSQLBuffer::test_sql_exp_buffer_read_write_0	✅	6.0s
tests/buffer/sql_test.py::TestSQLBuffer::test_sql_exp_buffer_read_write_1	✅	3.1s
tests/buffer/sql_test.py::TestSQLBuffer::test_sql_task_buffer_read_write	✅	3.3s
tests/buffer/task_scheduler_test.py::TestTaskScheduler::test_task_scheduler_0	✅	91ms
tests/buffer/task_scheduler_test.py::TestTaskScheduler::test_task_scheduler_1	✅	72ms
tests/buffer/task_scheduler_test.py::TestTaskScheduler::test_task_scheduler_2	✅	111ms
tests/buffer/task_scheduler_test.py::TestTaskScheduler::test_task_scheduler_3	✅	113ms
tests/buffer/task_scheduler_test.py::TestTaskScheduler::test_task_scheduler_4	✅	113ms
tests/buffer/task_scheduler_test.py::TestTaskScheduler::test_task_scheduler_5	✅	118ms
tests/buffer/task_scheduler_test.py::TestTaskScheduler::test_task_scheduler_6	✅	135ms
tests/buffer/task_scheduler_test.py::TestTaskScheduler::test_task_scheduler_simple	✅	61ms
tests/buffer/task_storage_test.py::TaskStorageTest::test_read_task_0_file	✅	76ms
tests/buffer/task_storage_test.py::TaskStorageTest::test_read_task_1_sql	✅	3.3s
tests/buffer/task_storage_test.py::TaskStorageTest::test_read_task_2_file	✅	53ms
tests/buffer/task_storage_test.py::TaskStorageTest::test_read_task_3_sql	✅	2.9s
tests/buffer/task_storage_test.py::TaskStorageTest::test_read_task_4_file	✅	53ms
tests/buffer/task_storage_test.py::TaskStorageTest::test_read_task_5_sql	✅	3.6s


pan-x-c · 2025-12-19T03:57:19Z

/unittest-module-buffer

github-actions · 2025-12-19T04:01:44Z

Summary

Tests 📝	Passed ✅	Failed ❌	Skipped ⏭️	Other ❓	Flaky 🍂	Duration ⏱️
48	48	0	0	0	0	2m 14s

Tests

Test Name	Status	Duration
tests/buffer/experience_pipeline_test.py::TestExperiencePipeline::test_experience_pipeline	✅	15.9s
tests/buffer/experience_pipeline_test.py::TestExperiencePipeline::test_pass_rate_calculation	✅	7.2s
tests/buffer/experience_storage_test.py::ExperienceStorageTest::test_sql_experience_buffer	✅	3.6s
tests/buffer/experience_storage_test.py::ExperienceStorageTest::test_sql_storage_0_sft	✅	4.9s
tests/buffer/experience_storage_test.py::ExperienceStorageTest::test_sql_storage_1_dpo	✅	5.4s
tests/buffer/file_test.py::TestFileBuffer::test_file_reader	✅	761ms
tests/buffer/file_test.py::TestFileBuffer::test_file_writer	✅	1.9s
tests/buffer/formatter_test.py::TestFormatter::test_dpo_messages_formatter	✅	532ms
tests/buffer/formatter_test.py::TestFormatter::test_dpo_plaintext_formatter	✅	458ms
tests/buffer/formatter_test.py::TestFormatter::test_multi_modal_sft_formatter	✅	1.3s
tests/buffer/formatter_test.py::TestFormatter::test_sft_messages_formatter	✅	1.0s
tests/buffer/formatter_test.py::TestFormatter::test_sft_plaintext_formatter	✅	727ms
tests/buffer/formatter_test.py::TestFormatter::test_task_formatter	✅	229ms
tests/buffer/queue_test.py::TestQueueBuffer::test_priority_queue_buffer_reuse	✅	6.5s
tests/buffer/queue_test.py::TestQueueBuffer::test_priority_queue_capacity	✅	2.4s
tests/buffer/queue_test.py::TestQueueBuffer::test_priority_queue_reuse_count_control	✅	4.6s
tests/buffer/queue_test.py::TestQueueBuffer::test_queue_buffer_0_queue	✅	3.3s
tests/buffer/queue_test.py::TestQueueBuffer::test_queue_buffer_1_priority_queue	✅	3.6s
tests/buffer/queue_test.py::TestQueueBuffer::test_queue_buffer_capacity	✅	4.0s
tests/buffer/reader_test.py::TestBufferReader::test_buffer_reader_registration	✅	814ms
tests/buffer/reward_shaping_mapper_test.py::TestRewardShapingMapper::test_basic_usage	✅	7ms
tests/buffer/sample_strategy_test.py::ExperienceStorageTest_0::test_default_queue_default_sample_strategy	✅	1.9s
tests/buffer/sample_strategy_test.py::ExperienceStorageTest_0::test_default_queue_staleness_control_sample_strategy	✅	2.1s
tests/buffer/sample_strategy_test.py::ExperienceStorageTest_0::test_priority_queue_default_sample_strategy	✅	1.9s
tests/buffer/sample_strategy_test.py::ExperienceStorageTest_0::test_priority_queue_staleness_control_sample_strategy	✅	2.1s
tests/buffer/sample_strategy_test.py::ExperienceStorageTest_0::test_sql_staleness_control_sample_strategy	✅	4.9s
tests/buffer/sample_strategy_test.py::ExperienceStorageTest_1::test_default_queue_default_sample_strategy	✅	1.9s
tests/buffer/sample_strategy_test.py::ExperienceStorageTest_1::test_default_queue_staleness_control_sample_strategy	✅	1.9s
tests/buffer/sample_strategy_test.py::ExperienceStorageTest_1::test_priority_queue_default_sample_strategy	✅	2.1s
tests/buffer/sample_strategy_test.py::ExperienceStorageTest_1::test_priority_queue_staleness_control_sample_strategy	✅	2.1s
tests/buffer/sample_strategy_test.py::ExperienceStorageTest_1::test_sql_staleness_control_sample_strategy	✅	4.2s
tests/buffer/sql_test.py::TestSQLBuffer::test_sql_exp_buffer_read_write_0	✅	6.1s
tests/buffer/sql_test.py::TestSQLBuffer::test_sql_exp_buffer_read_write_1	✅	3.0s
tests/buffer/sql_test.py::TestSQLBuffer::test_sql_task_buffer_read_write	✅	3.5s
tests/buffer/task_scheduler_test.py::TestTaskScheduler::test_task_scheduler_0	✅	91ms
tests/buffer/task_scheduler_test.py::TestTaskScheduler::test_task_scheduler_1	✅	72ms
tests/buffer/task_scheduler_test.py::TestTaskScheduler::test_task_scheduler_2	✅	111ms
tests/buffer/task_scheduler_test.py::TestTaskScheduler::test_task_scheduler_3	✅	112ms
tests/buffer/task_scheduler_test.py::TestTaskScheduler::test_task_scheduler_4	✅	112ms
tests/buffer/task_scheduler_test.py::TestTaskScheduler::test_task_scheduler_5	✅	116ms
tests/buffer/task_scheduler_test.py::TestTaskScheduler::test_task_scheduler_6	✅	134ms
tests/buffer/task_scheduler_test.py::TestTaskScheduler::test_task_scheduler_simple	✅	58ms
tests/buffer/task_storage_test.py::TaskStorageTest::test_read_task_0_file	✅	73ms
tests/buffer/task_storage_test.py::TaskStorageTest::test_read_task_1_sql	✅	3.1s
tests/buffer/task_storage_test.py::TaskStorageTest::test_read_task_2_file	✅	53ms
tests/buffer/task_storage_test.py::TaskStorageTest::test_read_task_3_sql	✅	3.3s
tests/buffer/task_storage_test.py::TaskStorageTest::test_read_task_4_file	✅	52ms
tests/buffer/task_storage_test.py::TaskStorageTest::test_read_task_5_sql	✅	3.8s

Github Test Reporter by CTRF 💚

pan-x-c · 2025-12-19T05:48:14Z

/unittest-all

pan-x-c · 2025-12-22T03:18:02Z

/unittest-all

pan-x-c · 2025-12-22T05:58:43Z

/unittest-all

github-actions · 2025-12-22T07:15:17Z

Summary

Tests 📝	Passed ✅	Failed ❌	Skipped ⏭️	Other ❓	Flaky 🍂	Duration ⏱️
228	225	1	2	0	0	1h 14m

Failed Tests

Failed Tests ❌	Fail Message
❌ tests/trainer/trainer_test.py::TestServeWithTrainer::test_serve_with_trainer	The test failed in the call phase due to an exception

Skipped

Tests	Status
tests/trainer/trainer_test.py::TestMultiModalGRPO::test_trainer	skipped ⏭️
tests/trainer/trainer_test.py::TestMultiModalSFT::test_trainer	skipped ⏭️

Tests

Test Name	Status	Duration
tests/algorithm/advantage_fn_test.py::TestGroupedAdvantageFn::test_batch_level_std_grpo	✅	41ms
tests/algorithm/advantage_fn_test.py::TestGroupedAdvantageFn::test_batch_level_step_wise_grpo_advantage	✅	3ms
tests/algorithm/advantage_fn_test.py::TestGroupedAdvantageFn::test_duplicate_grpo	✅	5ms
tests/algorithm/advantage_fn_test.py::TestGroupedAdvantageFn::test_grpo_advantage	✅	3ms
tests/algorithm/advantage_fn_test.py::TestGroupedAdvantageFn::test_grpo_correct_bias	✅	2ms
tests/algorithm/advantage_fn_test.py::TestGroupedAdvantageFn::test_grpo_reward_std	✅	1ms
tests/algorithm/advantage_fn_test.py::TestGroupedAdvantageFn::test_step_wise_grpo_advantage	✅	2ms
tests/algorithm/advantage_fn_test.py::TestGroupedAdvantageFn::test_step_wise_grpo_with_std_threshold	✅	2ms
tests/algorithm/kl_fn_test.py::KLFnTest::test_abs_kl_fn	✅	1ms
tests/algorithm/kl_fn_test.py::KLFnTest::test_corrected_k3_fallback	✅	1ms
tests/algorithm/kl_fn_test.py::KLFnTest::test_corrected_k3_loss	✅	1ms
tests/algorithm/kl_fn_test.py::KLFnTest::test_corrected_k3_same_policy	✅	1ms
tests/algorithm/kl_fn_test.py::KLFnTest::test_corrected_k3_with_old_logprob	✅	1ms
tests/algorithm/kl_fn_test.py::KLFnTest::test_dummy_kl_fn	✅	1ms
tests/algorithm/kl_fn_test.py::KLFnTest::test_k1_kl_fn	✅	1ms
tests/algorithm/kl_fn_test.py::KLFnTest::test_k2_kl_fn	✅	1ms
tests/algorithm/kl_fn_test.py::KLFnTest::test_k3_kl_fn	✅	1ms
tests/algorithm/kl_fn_test.py::KLFnTest::test_kl_loss_aggregation_modes	✅	1ms
tests/algorithm/kl_fn_test.py::KLFnTest::test_low_var_kl_fn	✅	1ms
tests/algorithm/policy_loss_test.py::VerlPolicyLossTest::test_dpo_policy_loss	✅	2ms
tests/algorithm/policy_loss_test.py::VerlPolicyLossTest::test_gspo_policy_loss	✅	1ms
tests/algorithm/policy_loss_test.py::VerlPolicyLossTest::test_mix_policy_loss	✅	3ms
tests/algorithm/policy_loss_test.py::VerlPolicyLossTest::test_opmd_policy_loss	✅	1ms
tests/algorithm/policy_loss_test.py::VerlPolicyLossTest::test_ppo_policy_loss	✅	1ms
tests/algorithm/policy_loss_test.py::VerlPolicyLossTest::test_ppo_policy_loss_with_sequence_masking	✅	1ms
tests/algorithm/policy_loss_test.py::VerlPolicyLossTest::test_sapo_policy_loss	✅	1ms
tests/algorithm/policy_loss_test.py::VerlPolicyLossTest::test_sft_policy_loss	✅	1ms
tests/buffer/experience_pipeline_test.py::TestExperiencePipeline::test_experience_pipeline	✅	16.6s
tests/buffer/experience_pipeline_test.py::TestExperiencePipeline::test_pass_rate_calculation	✅	7.7s
tests/buffer/experience_storage_test.py::ExperienceStorageTest::test_sql_experience_buffer	✅	3.7s
tests/buffer/experience_storage_test.py::ExperienceStorageTest::test_sql_storage_0_sft	✅	5.1s
tests/buffer/experience_storage_test.py::ExperienceStorageTest::test_sql_storage_1_dpo	✅	5.9s
tests/buffer/file_test.py::TestFileBuffer::test_file_reader	✅	262ms
tests/buffer/file_test.py::TestFileBuffer::test_file_writer	✅	2.0s
tests/buffer/formatter_test.py::TestFormatter::test_dpo_messages_formatter	✅	559ms
tests/buffer/formatter_test.py::TestFormatter::test_dpo_plaintext_formatter	✅	506ms
tests/buffer/formatter_test.py::TestFormatter::test_multi_modal_sft_formatter	✅	1.4s
tests/buffer/formatter_test.py::TestFormatter::test_sft_messages_formatter	✅	1.1s
tests/buffer/formatter_test.py::TestFormatter::test_sft_plaintext_formatter	✅	724ms
tests/buffer/formatter_test.py::TestFormatter::test_task_formatter	✅	282ms
tests/buffer/queue_test.py::TestQueueBuffer::test_priority_queue_buffer_reuse	✅	6.7s
tests/buffer/queue_test.py::TestQueueBuffer::test_priority_queue_capacity	✅	2.7s
tests/buffer/queue_test.py::TestQueueBuffer::test_priority_queue_reuse_count_control	✅	4.7s
tests/buffer/queue_test.py::TestQueueBuffer::test_queue_buffer_0_queue	✅	3.6s
tests/buffer/queue_test.py::TestQueueBuffer::test_queue_buffer_1_priority_queue	✅	3.7s
tests/buffer/queue_test.py::TestQueueBuffer::test_queue_buffer_capacity	✅	4.3s
tests/buffer/reader_test.py::TestBufferReader::test_buffer_reader_registration	✅	615ms
tests/buffer/reward_shaping_mapper_test.py::TestRewardShapingMapper::test_basic_usage	✅	7ms
tests/buffer/sample_strategy_test.py::ExperienceStorageTest_0::test_default_queue_default_sample_strategy	✅	2.2s
tests/buffer/sample_strategy_test.py::ExperienceStorageTest_0::test_default_queue_staleness_control_sample_strategy	✅	2.0s
tests/buffer/sample_strategy_test.py::ExperienceStorageTest_0::test_priority_queue_default_sample_strategy	✅	2.1s
tests/buffer/sample_strategy_test.py::ExperienceStorageTest_0::test_priority_queue_staleness_control_sample_strategy	✅	2.2s
tests/buffer/sample_strategy_test.py::ExperienceStorageTest_0::test_sql_staleness_control_sample_strategy	✅	4.9s
tests/buffer/sample_strategy_test.py::ExperienceStorageTest_1::test_default_queue_default_sample_strategy	✅	2.2s
tests/buffer/sample_strategy_test.py::ExperienceStorageTest_1::test_default_queue_staleness_control_sample_strategy	✅	2.1s
tests/buffer/sample_strategy_test.py::ExperienceStorageTest_1::test_priority_queue_default_sample_strategy	✅	2.2s
tests/buffer/sample_strategy_test.py::ExperienceStorageTest_1::test_priority_queue_staleness_control_sample_strategy	✅	2.2s
tests/buffer/sample_strategy_test.py::ExperienceStorageTest_1::test_sql_staleness_control_sample_strategy	✅	4.6s
tests/buffer/sql_test.py::TestSQLBuffer::test_sql_exp_buffer_read_write_0	✅	6.4s
tests/buffer/sql_test.py::TestSQLBuffer::test_sql_exp_buffer_read_write_1	✅	3.2s
tests/buffer/sql_test.py::TestSQLBuffer::test_sql_task_buffer_read_write	✅	4.1s
tests/buffer/task_scheduler_test.py::TestTaskScheduler::test_task_scheduler_0	✅	91ms
tests/buffer/task_scheduler_test.py::TestTaskScheduler::test_task_scheduler_1	✅	73ms
tests/buffer/task_scheduler_test.py::TestTaskScheduler::test_task_scheduler_2	✅	112ms
tests/buffer/task_scheduler_test.py::TestTaskScheduler::test_task_scheduler_3	✅	115ms
tests/buffer/task_scheduler_test.py::TestTaskScheduler::test_task_scheduler_4	✅	113ms
tests/buffer/task_scheduler_test.py::TestTaskScheduler::test_task_scheduler_5	✅	119ms
tests/buffer/task_scheduler_test.py::TestTaskScheduler::test_task_scheduler_6	✅	135ms
tests/buffer/task_scheduler_test.py::TestTaskScheduler::test_task_scheduler_simple	✅	59ms
tests/buffer/task_storage_test.py::TaskStorageTest::test_read_task_0_file	✅	75ms
tests/buffer/task_storage_test.py::TaskStorageTest::test_read_task_1_sql	✅	3.4s
tests/buffer/task_storage_test.py::TaskStorageTest::test_read_task_2_file	✅	54ms
tests/buffer/task_storage_test.py::TaskStorageTest::test_read_task_3_sql	✅	3.4s
tests/buffer/task_storage_test.py::TaskStorageTest::test_read_task_4_file	✅	58ms
tests/buffer/task_storage_test.py::TaskStorageTest::test_read_task_5_sql	✅	4.1s
tests/cli/launcher_test.py::TestLauncherMain::test_debug_mode	✅	48.0s
tests/cli/launcher_test.py::TestLauncherMain::test_main_run_command	✅	7.5s
tests/cli/launcher_test.py::TestLauncherMain::test_main_run_in_dlc	✅	1.5s
tests/cli/launcher_test.py::TestLauncherMain::test_main_studio_command	✅	319ms
tests/cli/launcher_test.py::TestLauncherMain::test_multi_stage_run	✅	1.8s
tests/common/config_test.py::TestConfig::test_all_examples_are_valid	✅	35.1s
tests/common/config_test.py::TestConfig::test_chat_template_path	✅	95ms
tests/common/config_test.py::TestConfig::test_config_flatten	✅	42ms
tests/common/config_test.py::TestConfig::test_continue_from_checkpoint_is_valid	✅	192ms
tests/common/config_test.py::TestConfig::test_default_workflow	✅	93ms
tests/common/config_test.py::TestConfig::test_load_default_config	✅	3.3s
tests/common/config_test.py::TestConfig::test_max_token_len_per_gpu_set_correctly	✅	95ms
tests/common/config_test.py::TestConfig::test_optimizer_config_propagation	✅	94ms
tests/common/config_test.py::TestConfig::test_update_config_from_ray_cluster	✅	357ms
tests/common/experience_test.py::TestEID::test_eid_properties	✅	1ms
tests/common/experience_test.py::TestExperience::test_action_mask_and_logprobs_type	✅	1ms
tests/common/experience_test.py::TestExperience::test_assertions	✅	1ms
tests/common/experience_test.py::TestExperience::test_dpo_experience	✅	1ms
tests/common/experience_test.py::TestExperience::test_gather	✅	1ms
tests/common/experience_test.py::TestExperience::test_gather_with_token_level_reward	✅	1ms
tests/common/experience_test.py::TestExperience::test_hf_datasets_conversion	✅	15ms
tests/common/experience_test.py::TestExperience::test_multi_turn_experience	✅	1ms
tests/common/experience_test.py::TestExperience::test_serialize_deserialize	✅	1ms
tests/common/experience_test.py::TestExperience::test_single_turn_experience	✅	1ms
tests/common/experience_test.py::TestExperience::test_to_dict	✅	1ms
tests/common/experience_test.py::TestExperienceConversion::test_batch_conversion	✅	1ms
tests/common/experience_test.py::TestExperienceConversion::test_dpo_experience_batch_conversion	✅	1ms
tests/common/experience_test.py::TestExperienceConversion::test_experience_model_experience_conversion	✅	1ms
tests/common/experience_test.py::TestExperienceConversion::test_gather_experiences_with_custom_fields	✅	1ms
tests/common/experience_test.py::TestExperienceConversion::test_multiturn_experience_batch_converstion	✅	1ms
tests/common/vllm_test.py::ModelWrapperTest_0::test_generate	✅	57.6s
tests/common/vllm_test.py::ModelWrapperTest_1::test_generate	✅	32.5s
tests/common/vllm_test.py::ModelWrapperTest_2::test_generate	✅	45.2s
tests/common/vllm_test.py::TestModelLen_0::test_model_len	✅	15.9s
tests/common/vllm_test.py::TestModelLen_1::test_model_len	✅	16.1s
tests/common/vllm_test.py::TestModelLen_2::test_model_len	✅	15.8s
tests/common/vllm_test.py::TestModelLenWithoutPromptTruncation::test_model_len	✅	16.0s
tests/common/vllm_test.py::TestAPIServer::test_api	✅	21.2s
tests/common/vllm_test.py::TestLogprobs::test_logprobs_api	✅	16.2s
tests/common/vllm_test.py::TestAsyncAPIServer::test_api_async	✅	22.3s
tests/common/vllm_test.py::TestTokenizer::test_action_mask	✅	262ms
tests/common/vllm_test.py::TestTokenizer::test_action_mask_with_tools	✅	240ms
tests/common/vllm_test.py::TestAPIServerToolCall_0_deepseek_r1::test_api_tool_calls	✅	18.2s
tests/common/vllm_test.py::TestAPIServerToolCall_1::test_api_tool_calls	✅	17.1s
tests/common/vllm_test.py::TestSuperLongGeneration::test_generate	✅	1m 47s
tests/explorer/explorer_test.py::TestExplorerCountdownEval::test_explorer	✅	1m 4s
tests/explorer/explorer_test.py::TestExplorerGSM8KRULERNoEval::test_explorer	✅	1m 39s
tests/explorer/explorer_test.py::TestExplorerGSM8k::test_explorer	✅	3m 34s
tests/explorer/explorer_test.py::ServeTest::test_serve	✅	1m 8s
tests/explorer/proxy_test.py::RecorderTest::test_recorder	✅	76ms
tests/explorer/scheduler_test.py::SchedulerTest::test_async_workflow	✅	9.2s
tests/explorer/scheduler_test.py::SchedulerTest::test_concurrent_operations	✅	8.9s
tests/explorer/scheduler_test.py::SchedulerTest::test_dynamic_timeout	✅	17.1s
tests/explorer/scheduler_test.py::SchedulerTest::test_get_results	✅	24.3s
tests/explorer/scheduler_test.py::SchedulerTest::test_metric_calculation_with_non_repeatable_workflow_0	✅	8.9s
tests/explorer/scheduler_test.py::SchedulerTest::test_metric_calculation_with_non_repeatable_workflow_1	✅	9.2s
tests/explorer/scheduler_test.py::SchedulerTest::test_metric_calculation_with_repeatable_workflow_0	✅	9.0s
tests/explorer/scheduler_test.py::SchedulerTest::test_metric_calculation_with_repeatable_workflow_1	✅	9.4s
tests/explorer/scheduler_test.py::SchedulerTest::test_multi_step_execution	✅	9.4s
tests/explorer/scheduler_test.py::SchedulerTest::test_non_repeatable_workflow	✅	9.2s
tests/explorer/scheduler_test.py::SchedulerTest::test_over_rollout_min_wait	✅	13.0s
tests/explorer/scheduler_test.py::SchedulerTest::test_scheduler_all_methods	✅	18.6s
tests/explorer/scheduler_test.py::SchedulerTest::test_scheduler_restart_after_stop	✅	16.8s
tests/explorer/scheduler_test.py::SchedulerTest::test_split_tasks	✅	12.2s
tests/explorer/scheduler_test.py::SchedulerTest::test_stepwise_experience_eid	✅	29.8s
tests/explorer/scheduler_test.py::SchedulerTest::test_wait_all	✅	12.2s
tests/explorer/scheduler_test.py::SchedulerTest::test_wait_all_timeout_with_multi_batch	✅	18.4s
tests/explorer/scheduler_test.py::TestRunnerStateCollection::test_runner_state_collection	✅	13.8s
tests/explorer/step_wise_workflow_test.py::WorkflowTest::test_reward_propagation_workflow_0	✅	2ms
tests/explorer/step_wise_workflow_test.py::WorkflowTest::test_reward_propagation_workflow_1	✅	604ms
tests/explorer/step_wise_workflow_test.py::WorkflowTest::test_step_wise_reward_workflow_0	✅	2ms
tests/explorer/step_wise_workflow_test.py::WorkflowTest::test_step_wise_reward_workflow_1	✅	1.0s
tests/explorer/step_wise_workflow_test.py::WorkflowTest::test_workflows_raise_error	✅	1ms
tests/explorer/step_wise_workflow_test.py::WorkflowTest::test_workflows_stop_at_max_env_steps	✅	1.0s
tests/explorer/workflow_test.py::WorkflowTest::test_gsm8k_workflow	✅	16ms
tests/explorer/workflow_test.py::WorkflowTest::test_math_boxed_workflow	✅	24ms
tests/explorer/workflow_test.py::WorkflowTest::test_math_complex_workflow	✅	268ms
tests/explorer/workflow_test.py::WorkflowTest::test_math_eval_workflow	✅	4ms
tests/explorer/workflow_test.py::WorkflowTest::test_math_fraction_workflow	✅	16ms
tests/explorer/workflow_test.py::WorkflowTest::test_math_workflow	✅	10ms
tests/explorer/workflow_test.py::WorkflowTest::test_rm_gallery_workflow	✅	119ms
tests/explorer/workflow_test.py::WorkflowTest::test_workflow_repeatable_0	✅	1ms
tests/explorer/workflow_test.py::WorkflowTest::test_workflow_repeatable_1	✅	101ms
tests/explorer/workflow_test.py::WorkflowTest::test_workflow_resettable_0	✅	1ms
tests/explorer/workflow_test.py::WorkflowTest::test_workflow_resettable_1	✅	201ms
tests/explorer/workflow_test.py::MultiTurnWorkflowTest_0::test_multi_turn_workflow	✅	14.5s
tests/explorer/workflow_test.py::MultiTurnWorkflowTest_1::test_multi_turn_workflow	✅	15.6s
tests/explorer/workflow_test.py::TestWorkflowStateRecording::test_workflow_state_recording	✅	4.0s
tests/explorer/workflow_test.py::TestAgentScopeWorkflowAdapter::test_adapter	✅	493ms
tests/explorer/workflow_test.py::TestWorkflowRunner::test_workflow_runner	✅	311ms
tests/explorer/workflow_test.py::TestWorkflowRunner::test_workflow_runner_get_state	✅	8.1s
tests/explorer/workflow_test.py::TestWorkflowRunner::test_workflow_with_openai	✅	18.1s
tests/manager/synchronizer_test.py::TestSynchronizerExit::test_synchronizer	✅	44.5s
tests/manager/synchronizer_test.py::TestStateDictBasedSynchronizer_0::test_synchronizer	✅	1m 26s
tests/manager/synchronizer_test.py::TestStateDictBasedSynchronizer_1::test_synchronizer	✅	1m 32s
tests/manager/synchronizer_test.py::TestStateDictBasedSynchronizer_2::test_synchronizer	✅	2m 13s
tests/manager/synchronizer_test.py::TestStateDictBasedSynchronizer_3::test_synchronizer	✅	2m 16s
tests/manager/synchronizer_test.py::TestNCCLBasedSynchronizer_0::test_synchronizer	✅	1m 25s
tests/manager/synchronizer_test.py::TestNCCLBasedSynchronizer_1::test_synchronizer	✅	1m 24s
tests/service/data_juicer_test.py::TestDataJuicer::test_config	✅	2.3s
tests/service/data_juicer_test.py::TestDataJuicer::test_server_start	✅	22.0s
tests/service/data_juicer_test.py::TestDataJuicerExperiencePipeline::test_data_juicer_operators	✅	27.4s
tests/service/data_juicer_test.py::TestDataJuicerTaskPipeline::test_data_juicer_task_pipeline	✅	14.9s
tests/trainer/trainer_test.py::TestTrainerCountdown_0_fsdp::test_trainer	✅	2m 53s
tests/trainer/trainer_test.py::TestTrainerCountdown_1_megatron::test_trainer	✅	4m 53s
tests/trainer/trainer_test.py::TestStepAheadAsyncRL::test_trainer	✅	1m 14s
tests/trainer/trainer_test.py::TestTrainerGSM8K_0_fsdp::test_trainer	✅	1m 7s
tests/trainer/trainer_test.py::TestTrainerGSM8K_1_fsdp2::test_trainer	✅	1m 4s
tests/trainer/trainer_test.py::TestTrainerGSM8K_2_fsdp::test_trainer	✅	1m 8s
tests/trainer/trainer_test.py::TestTrainerGSM8K_3_fsdp2::test_trainer	✅	1m 20s
tests/trainer/trainer_test.py::TestTrainerSFTWarmupGSM8K::test_trainer	✅	2m 6s
tests/trainer/trainer_test.py::TestTrainerDPO::test_trainer	✅	50.2s
tests/trainer/trainer_test.py::TestTrainerSFT::test_trainer	✅	47.2s
tests/trainer/trainer_test.py::TestTrainerToolsSFT::test_trainer_tools	✅	46.1s
tests/trainer/trainer_test.py::TestFullyAsyncMode_0_fsdp::test_fully_async_mode	✅	1m 36s
tests/trainer/trainer_test.py::TestFullyAsyncMode_1_fsdp::test_fully_async_mode	✅	1m 34s
tests/trainer/trainer_test.py::TestFullyAsyncMode_2_megatron::test_fully_async_mode	✅	2m 24s
tests/trainer/trainer_test.py::TestTrainerCheckpointSave_0_fsdp::test_trainer	✅	2m 7s
tests/trainer/trainer_test.py::TestTrainerCheckpointSave_1_megatron::test_trainer	✅	4m 1s
tests/trainer/trainer_test.py::TestTrainerMIX::test_trainer	✅	2m 8s
tests/trainer/trainer_test.py::TestServeWithTrainer::test_serve_with_trainer	❌	1m
tests/trainer/trainer_test.py::TestMultiModalGRPO::test_trainer	⏭️	811ms
tests/trainer/trainer_test.py::TestMultiModalSFT::test_trainer	⏭️	810ms
tests/trainer/trainer_test.py::TestTrainerLoRA::test_trainer	✅	3m 23s
tests/trainer/trainer_test.py::TestOverRollout::test_trainer	✅	1m 5s
tests/trainer/trainer_test.py::TestTrainerPromptTruncation::test_trainer	✅	55.4s
tests/utils/eval_utils_test.py::TestComputeScore::test_both_boxed_and_equivalent	✅	15ms
tests/utils/eval_utils_test.py::TestComputeScore::test_both_boxed_and_not_equivalent	✅	1ms
tests/utils/eval_utils_test.py::TestComputeScore::test_empty_ground_truth	✅	2ms
tests/utils/eval_utils_test.py::TestComputeScore::test_empty_solution_string	✅	1ms
tests/utils/eval_utils_test.py::TestComputeScore::test_multiple_boxed_answers_in_solution	✅	2ms
tests/utils/eval_utils_test.py::TestComputeScore::test_solution_boxed_truth_raw_and_equivalent	✅	1ms
tests/utils/eval_utils_test.py::TestComputeScore::test_solution_boxed_truth_raw_and_not_equivalent	✅	1ms
tests/utils/eval_utils_test.py::TestComputeScore::test_solution_not_boxed	✅	1ms
tests/utils/eval_utils_test.py::TestComputeScore::test_solution_raw_and_ground_truth_boxed_equivalent	✅	1ms
tests/utils/eval_utils_test.py::TestMathEvalUtils::test_extract_answer	✅	4ms
tests/utils/eval_utils_test.py::TestMathEvalUtils::test_verify_math_answer	✅	72ms
tests/utils/eval_utils_test.py::TestEvalUtils::test_is_equiv	✅	6ms
tests/utils/log_test.py::LogTest::test_actor_log	✅	2.2s
tests/utils/log_test.py::LogTest::test_group_by_node	✅	2.1s
tests/utils/log_test.py::LogTest::test_no_actor_log	✅	906ms
tests/utils/plugin_test.py::TestPluginLoader::test_load_plugins_local_0__workspace_tests_utils_plugins	✅	99ms
tests/utils/plugin_test.py::TestPluginLoader::test_load_plugins_local_1_tests_utils_plugins	✅	95ms
tests/utils/plugin_test.py::TestPluginLoader::test_load_plugins_remote_0__workspace_tests_utils_plugins	✅	10.6s
tests/utils/plugin_test.py::TestPluginLoader::test_load_plugins_remote_1_tests_utils_plugins	✅	10.7s
tests/utils/plugin_test.py::TestPluginLoader::test_passing_custom_class_0__workspace_tests_utils_plugins	✅	6.2s
tests/utils/plugin_test.py::TestPluginLoader::test_passing_custom_class_1_tests_utils_plugins	✅	6.0s
tests/utils/registry_test.py::TestRegistryWithRay::test_dynamic_import	✅	5.6s
tests/utils/registry_test.py::TestRegistry::test_algorithm_registry_mapping	✅	3ms
tests/utils/registry_test.py::TestRegistry::test_buffer_module_registry_mapping	✅	1ms
tests/utils/registry_test.py::TestRegistry::test_common_module_registry_mapping	✅	45ms
tests/utils/registry_test.py::TestRegistry::test_register_module	✅	1ms
tests/utils/registry_test.py::TestRegistry::test_utils_module_registry_mapping	✅	1ms

Github Test Reporter by CTRF 💚

pan-x-c · 2025-12-22T08:26:41Z

/unittest-module-trainer

github-actions · 2025-12-22T09:07:05Z

Summary

Tests 📝	Passed ✅	Failed ❌	Skipped ⏭️	Other ❓	Flaky 🍂	Duration ⏱️
23	21	0	2	0	0	38m 33s

Skipped

Tests	Status
tests/trainer/trainer_test.py::TestMultiModalGRPO::test_trainer	skipped ⏭️
tests/trainer/trainer_test.py::TestMultiModalSFT::test_trainer	skipped ⏭️

Tests

Test Name	Status	Duration
tests/trainer/trainer_test.py::TestTrainerCountdown_0_fsdp::test_trainer	✅	2m 58s
tests/trainer/trainer_test.py::TestTrainerCountdown_1_megatron::test_trainer	✅	4m 11s
tests/trainer/trainer_test.py::TestStepAheadAsyncRL::test_trainer	✅	1m 12s
tests/trainer/trainer_test.py::TestTrainerGSM8K_0_fsdp::test_trainer	✅	1m 3s
tests/trainer/trainer_test.py::TestTrainerGSM8K_1_fsdp2::test_trainer	✅	1m 7s
tests/trainer/trainer_test.py::TestTrainerGSM8K_2_fsdp::test_trainer	✅	1m 11s
tests/trainer/trainer_test.py::TestTrainerGSM8K_3_fsdp2::test_trainer	✅	1m 21s
tests/trainer/trainer_test.py::TestTrainerSFTWarmupGSM8K::test_trainer	✅	2m 5s
tests/trainer/trainer_test.py::TestTrainerDPO::test_trainer	✅	49.1s
tests/trainer/trainer_test.py::TestTrainerSFT::test_trainer	✅	46.6s
tests/trainer/trainer_test.py::TestTrainerToolsSFT::test_trainer_tools	✅	46.0s
tests/trainer/trainer_test.py::TestFullyAsyncMode_0_fsdp::test_fully_async_mode	✅	1m 34s
tests/trainer/trainer_test.py::TestFullyAsyncMode_1_fsdp::test_fully_async_mode	✅	1m 37s
tests/trainer/trainer_test.py::TestFullyAsyncMode_2_megatron::test_fully_async_mode	✅	2m 18s
tests/trainer/trainer_test.py::TestTrainerCheckpointSave_0_fsdp::test_trainer	✅	2m 1s
tests/trainer/trainer_test.py::TestTrainerCheckpointSave_1_megatron::test_trainer	✅	4m 2s
tests/trainer/trainer_test.py::TestTrainerMIX::test_trainer	✅	2m 12s
tests/trainer/trainer_test.py::TestServeWithTrainer::test_serve_with_trainer	✅	1m 44s
tests/trainer/trainer_test.py::TestMultiModalGRPO::test_trainer	⏭️	811ms
tests/trainer/trainer_test.py::TestMultiModalSFT::test_trainer	⏭️	810ms
tests/trainer/trainer_test.py::TestTrainerLoRA::test_trainer	✅	3m 8s
tests/trainer/trainer_test.py::TestOverRollout::test_trainer	✅	1m 11s
tests/trainer/trainer_test.py::TestTrainerPromptTruncation::test_trainer	✅	54.0s

Github Test Reporter by CTRF 💚

add allocate_session and feedback API

31c5693

gemini-code-assist bot reviewed Sep 22, 2025

trinity/explorer/api/api.py (outdated, resolved)

trinity/explorer/api/api.py (outdated, resolved)

fix comments

ece144e

pan-x-c added 5 commits September 23, 2025 10:58

add api test

7ed876c

merge main

bc3d72e

add tests

20857a7

process group for serve model

5295688

fix tests

fa6604a

pan-x-c changed the title from "[WIP] Explorer API collects feedback from agent applications" to "Explorer API collects feedback from agent applications" on Oct 15, 2025

Merge branch 'main' into feature/user_feedback

1154605

pan-x-c added 2 commits October 15, 2025 19:43

clean code

ef7278e

Merge branch 'main' into feature/user_feedback

f401491

merge main

dc45cc0

fix serve tests

be94f13

pan-x-c added 8 commits December 16, 2025 17:20

fix tests

ce01983

Merge branch 'main' into feature/user_feedback

1692a23

add recorder

0552b8d

Merge branch 'main' into feature/user_feedback

fabd757

fix proxy

27eee07

fix server

7fa496c

fix serve trainer test

02ce2dc

add tests

5507e5f

pan-x-c added 5 commits December 19, 2025 09:58

fix replay

4094490

fix synchronizer

eb63d9a

fix tests

3489b04

fix pre-commit

e8b0dd5

fix comments

c96593f

fix serve mode

f813128

pan-x-c added 2 commits December 19, 2025 11:56

fix buffer test

8101740

fix pre-commit

fe33799

fix megatron training

3df76a8

pan-x-c added 3 commits December 19, 2025 13:59

fix vllm prefix caching

63fa3be

fix benchmark

e0973a3

add client side timeout

ed8edef

fix comments

290e36f

update default test setting

c0b7b67

chenyushuo approved these changes Dec 23, 2025

pan-x-c merged commit 38ba481 into modelscope:main Dec 23, 2025
1 check passed
