@klei22 klei22 commented Feb 8, 2026

This pull request improves how whitespace is handled when writing output files in collect_matched_annotations.py, particularly to avoid stripping meaningful spaces from the param_nesting field. It also introduces a guardrail to detect accidental whitespace modifications, ensuring the integrity of the output.

Whitespace handling and output writing improvements:

  • Updated write_concat to stop calling .rstrip() with no arguments, which strips all trailing whitespace (spaces included), and to trim only trailing newlines instead, preserving meaningful spaces in param_nesting.
  • Refactored write_mc_inputs to use a new emit helper function that strips only trailing newlines, and added a guardrail that raises an error if the output lengths for different fields do not match, which helps catch accidental whitespace loss.
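
As a rough illustration of the difference described above (not the code from this PR), the sketch below contrasts `.rstrip()` with newline-only trimming. The helper name `trim_trailing_newlines` and the sample `param_nesting` value are hypothetical and chosen only for this example.

```python
def trim_trailing_newlines(text: str) -> str:
    """Remove trailing newline characters only; trailing spaces are kept.

    Hypothetical helper for illustration; the PR's own code may differ.
    """
    return text.rstrip("\n")


# Made-up sample value: the two trailing spaces are meaningful.
param_nesting = "  1 2  \n"

stripped_all = param_nesting.rstrip()                      # drops spaces too
stripped_newline = trim_trailing_newlines(param_nesting)   # keeps spaces

print(repr(stripped_all))      # '  1 2'
print(repr(stripped_newline))  # '  1 2  '
```

Note that `rstrip("\n")` leaves a trailing `\r` in place, so inputs with Windows line endings would need separate handling under this approach.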

Other:

  • Minor formatting adjustment at the end of the file.

Copilot AI left a comment

Pull request overview

This PR adjusts output serialization in collect_matched_annotations.py to preserve meaningful trailing spaces (notably in param_nesting) while still normalizing trailing newlines, and adds a guardrail to detect unexpected whitespace/length divergence across the three aligned annotation streams.

Changes:

  • Updated write_concat to avoid .rstrip() and only remove trailing \n before writing.
  • Refactored write_mc_inputs to centralize newline-only trimming via an emit() helper.
  • Added a length-consistency guardrail across emitted mc_out, mc_pna, and mc_ga inputs.
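
A minimal sketch of how a newline-only `emit()` helper and a length-consistency guardrail could fit together is shown below. It is an assumption-laden illustration, not the merged implementation: `clean`, `write_aligned`, the record layout, and the choice to compare per-record character lengths are all hypothetical; only the names `emit`, `mc_out`, `mc_pna`, and `mc_ga` come from the summary above.

```python
import io
from typing import Dict, TextIO


def clean(text: str) -> str:
    """Trim trailing newlines only, so trailing spaces survive (hypothetical)."""
    return text.rstrip("\n")


def emit(handle: TextIO, text: str) -> None:
    """Write one already-cleaned record followed by a single newline."""
    handle.write(text + "\n")


def write_aligned(record: Dict[str, str], out_f: TextIO,
                  pna_f: TextIO, ga_f: TextIO) -> None:
    """Write three aligned fields, refusing to write if their lengths differ.

    Assumes the three streams are meant to be character-aligned per record.
    """
    cleaned = {key: clean(record[key]) for key in ("mc_out", "mc_pna", "mc_ga")}
    lengths = {key: len(value) for key, value in cleaned.items()}
    # Guardrail: a mismatch suggests whitespace was lost somewhere upstream.
    if len(set(lengths.values())) != 1:
        raise ValueError(f"length mismatch across aligned streams: {lengths}")
    emit(out_f, cleaned["mc_out"])
    emit(pna_f, cleaned["mc_pna"])
    emit(ga_f, cleaned["mc_ga"])


# Usage with in-memory files; the record contents are made up.
out_f, pna_f, ga_f = io.StringIO(), io.StringIO(), io.StringIO()
write_aligned({"mc_out": "ab  \n", "mc_pna": "01  \n", "mc_ga": "xy  \n"},
              out_f, pna_f, ga_f)
```

Checking the lengths before any write means a failing record leaves none of the three outputs partially updated, which keeps the streams aligned even when the guardrail fires.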

@gkielian gkielian merged commit 0603da0 into ReaLLMASIC:master Feb 9, 2026
16 checks passed