@Sarajir Sarajir commented Dec 20, 2025

  • Implement Nonogram puzzle task for VMEvalKit
  • Support 7 pattern types: cross, square, circle, checkerboard, letter_t, diagonal, random
  • Difficulty scaling: easy (5x5 to 6x6), medium (7x7 to 10x10), hard (12x12 to 15x15)
  • Ensure uniqueness and no empty rows/columns
  • Register task in TASK_CATALOG.py
  • Complete documentation in NONOGRAM.md
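The uniqueness and non-empty row/column requirements listed above can be illustrated with a minimal sketch. This is not the code from this PR; the names `line_hint`, `grid_hints`, `has_empty_line`, and `count_solutions` are hypothetical, and the brute-force solver is only practical for very small grids.

```python
from itertools import product
from typing import List, Sequence, Tuple

Grid = List[List[int]]  # 1 = filled cell, 0 = empty cell


def line_hint(line: Sequence[int]) -> List[int]:
    """Return the lengths of consecutive runs of filled cells in one line."""
    runs, count = [], 0
    for cell in line:
        if cell:
            count += 1
        elif count:
            runs.append(count)
            count = 0
    if count:
        runs.append(count)
    return runs


def grid_hints(grid: Grid) -> Tuple[List[List[int]], List[List[int]]]:
    """Derive (row_hints, col_hints) from a solved grid."""
    rows = [line_hint(row) for row in grid]
    cols = [line_hint(col) for col in zip(*grid)]
    return rows, cols


def has_empty_line(grid: Grid) -> bool:
    """True if any row or column contains no filled cell at all."""
    rows, cols = grid_hints(grid)
    return any(not hint for hint in rows + cols)


def count_solutions(row_hints, col_hints, limit: int = 2) -> int:
    """Count grids matching the hints by brute force, stopping at `limit`.

    Assumption: grids are tiny; a real generator would need a proper solver.
    """
    width = len(col_hints)
    # Every row pattern that satisfies that row's hint on its own.
    candidates = [
        [bits for bits in product((0, 1), repeat=width)
         if line_hint(bits) == list(hint)]
        for hint in row_hints
    ]
    found = 0
    for combo in product(*candidates):
        if all(line_hint([row[c] for row in combo]) == list(col_hints[c])
               for c in range(width)):
            found += 1
            if found >= limit:
                break
    return found
```

A puzzle would be accepted under this sketch only when `has_empty_line(grid)` is false and `count_solutions(*grid_hints(grid))` equals 1. Stopping at `limit=2` keeps the check cheap, since a second solution is already enough to reject the puzzle.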


@LukeLIN-web LukeLIN-web left a comment


LGTM. It contains a lot of cases, so please test it comprehensively and check the different cases.

"Keep the camera view fixed in the top-down perspective and maintain the grid structure unchanged. "
"Stop the video when all cells are correctly filled and the complete pattern is revealed."
)


Write an evaluator prompt.

Sara(Ran) Ji added 2 commits December 19, 2025 22:47
- Add nonogram_task evaluation prompt to eval_prompt.py
- Fix missing comma after subway_pathfinding_task entry
- Add comprehensive test script for nonogram_task (test_nonogram.py)
- Resolve merge conflict in TASK_CATALOG.py (keep nonogram task)
- Keep both symmetry_completion_task and nonogram_task evaluation prompts
- Resolve merge conflict by including both entries
 "tower_of_hanoi_task": "Check if exactly one disk moved between frames. Verify the move is legal (top disk moved to empty peg or larger disk). Compare final disk positions to expected.",
-"symmetry_completion_task": "Verify that the right half of the grid in the final frame is correctly mirrored from the left half, creating a symmetric pattern. Check that all missing cells have been filled correctly to complete the vertical symmetry."
+"symmetry_completion_task": "Verify that the right half of the grid in the final frame is correctly mirrored from the left half, creating a symmetric pattern. Check that all missing cells have been filled correctly to complete the vertical symmetry.",
+"nonogram_task": "Verify that all cells in the final frame are correctly filled according to the row and column hints. Check that the filled cells match the expected pattern and that all hints are satisfied. Each row and column must have the correct sequence of filled blocks as indicated by the hints."
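The check described by the `nonogram_task` prompt (every row and column must show the run sequence given by its hint) can be written down programmatically. The following is a sketch under the assumption that the final frame has already been converted into a 0/1 grid; `runs` and `satisfies_hints` are hypothetical names, not functions from this repository.

```python
from typing import List, Sequence


def runs(line: Sequence[int]) -> List[int]:
    """Lengths of consecutive filled-cell blocks in a row or column."""
    out, n = [], 0
    for cell in line:
        if cell:
            n += 1
        elif n:
            out.append(n)
            n = 0
    if n:
        out.append(n)
    return out


def satisfies_hints(grid, row_hints, col_hints) -> bool:
    """True if every row and column of `grid` matches its hint exactly."""
    # Reject grids whose shape does not match the hint lists.
    if len(grid) != len(row_hints):
        return False
    if any(len(row) != len(col_hints) for row in grid):
        return False
    rows_ok = all(runs(r) == list(h) for r, h in zip(grid, row_hints))
    cols_ok = all(runs(c) == list(h) for c, h in zip(zip(*grid), col_hints))
    return rows_ok and cols_ok
```

When the puzzle has a unique solution, satisfying all hints is equivalent to matching the expected pattern, so a single check of this kind covers both conditions in the prompt.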

Can the evaluator actually know these hints?
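One possible way to address this question, offered only as an assumption and not as what the PR does, is to append each sample's hints to the generic evaluation prompt so the evaluator receives them as text. `format_hints` and `build_eval_prompt` are hypothetical helpers for illustration.

```python
from typing import List, Sequence


def format_hints(hints: Sequence[Sequence[int]]) -> str:
    """Render hints as text, e.g. [[3], [1, 1]] -> '3; 1 1' (empty line -> '0')."""
    return "; ".join(
        " ".join(str(n) for n in hint) if hint else "0" for hint in hints
    )


def build_eval_prompt(base_prompt: str,
                      row_hints: List[List[int]],
                      col_hints: List[List[int]]) -> str:
    """Attach per-sample hints to a generic evaluation prompt (hypothetical)."""
    return (
        f"{base_prompt} "
        f"Row hints (top to bottom): {format_hints(row_hints)}. "
        f"Column hints (left to right): {format_hints(col_hints)}."
    )
```

If the hints are also rendered in the first frame of the video, the evaluator could read them from the image instead; which route is appropriate depends on how the evaluation pipeline passes per-sample data.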

@@ -0,0 +1,357 @@
"""

Remove this file. I meant that you need to generate the dataset, use a video model to generate videos, and check whether the prompts and first images are reasonable, not create a test file.


These tests are useless; they are effectively the same as running dataset generation.

@hokindeng hokindeng merged commit 31a2e4c into dev Dec 30, 2025
@hokindeng hokindeng deleted the add_nonogram_task branch December 30, 2025 19:07
