Upgrade uax29 for Unicode 16 support #15
Pull request overview
This PR upgrades the uax29 dependency from v2.3.0 to v2.4.0 to add Unicode 16 support for grapheme cluster segmentation. The PR description indicates behavior should remain unchanged except for text containing Indic_Conjunct_Break characters, which will be handled more correctly.
Changes:
- Upgraded github.com/clipperhouse/uax29/v2 from v2.3.0 to v2.4.0 in the main module
- Updated the same dependency in the comparison module for compatibility testing
Reviewed changes
Copilot reviewed 2 out of 4 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| go.mod | Updated uax29 dependency version to v2.4.0 in main module |
| go.sum | Updated checksums for uax29 v2.4.0 in main module |
| comparison/go.mod | Updated uax29 dependency version to v2.4.0 in comparison module |
| comparison/go.sum | Updated checksums for uax29 v2.4.0 in comparison module |
From Claude Opus 4.5, asked “Is it correct that they should be width 1? What do text renderers and terminals do? Research.”:
Based on my research, yes, Indic scripts should be width 1. Here's the evidence:

Further from Claude Opus 4.5, asked “What does Neutral mean, in terms of guidance for width?”:
Here's the key guidance from Unicode UAX #11 on Neutral characters:
PR Review (Claude Opus 4.5)
Verdict: Approve ✓
This is a clean dependency upgrade with excellent test coverage.
Summary
Strengths
No bugs found
The research in the PR comments correctly explains the Unicode specification. The implementation is sound.
Behavior should be unchanged. Text that includes Indic_Conjunct_Break characters may be segmented differently (and more correctly).