Skip to content

Conversation

@Maxr1998
Copy link
Contributor

@Maxr1998 Maxr1998 commented Jun 14, 2025

  • Correctly handle white space between word time tags
  • Support parsing time tags that surround each non-blank segment

Fixes #83.

@Maxr1998 Maxr1998 marked this pull request as ready for review June 18, 2025 12:51
@Maxr1998 Maxr1998 force-pushed the fix-lrc-parser branch 2 times, most recently from c223e2c to 3ebeb81 Compare June 18, 2025 13:47
Copy link
Member

@andy840119 andy840119 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only review the first and 3rd commit.
Resolve the review feedback than I'll continuous review the rest of one. Need to make sure that those breaking change is correct.

Copy link
Member

@andy840119 andy840119 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems there are several changes within a PR:

  • Adjust test case.
  • Doing refactor and add more tests.
  • (breaking change) Deal with spacing within the lyric.
  • (breaking change) Adjust the index.
  • (breaking change but cancelled) not decode the lyric if not start with timing info.

As lrc has not strict rule definition, it's hard for me to decide lots of change within a PR.
Should be better to separate those changes with different PRs.

@Maxr1998 Maxr1998 changed the title Rework LRC parsing with word time tags Fix white space handling while parsing LRC with word time tags Jun 21, 2025
@Maxr1998 Maxr1998 force-pushed the fix-lrc-parser branch 2 times, most recently from 7a976d8 to c32a970 Compare June 21, 2025 17:40
@andy840119 andy840119 added the Breaking change This change might let parsing result not same as before. label Jun 22, 2025
@Maxr1998 Maxr1998 force-pushed the fix-lrc-parser branch 3 times, most recently from 94a1ac3 to 9e88207 Compare June 22, 2025 13:33
Copy link
Member

@andy840119 andy840119 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM.

I haven't deeply review LrcParser/Parser/Lrc/Utils/LrcTimedTextUtils.cs but can give it a quick pass because the test case is much enough.

Comment on lines 31 to 73
[TestCase("<00:51.00><01:29.99><01:48.29><02:31.00><02:41.99>You gotta fight !", "You gotta fight !", new[] { "[0,start]:51000" })] // decode with invalid format.
public void TestDecodeWithInvalidFormat(string text, string expectedText, string[] expectedTimeTags)
{
var (actualText, actualTimeTags) = LrcTimedTextUtils.TimedTextToObject(text);
var (actualText, actualTimeTags) = LrcTimedTextUtils.TimedTextToObject(text, lineStartTime);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have no idea why you remove this test case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's the one I moved above, since the parsing itself passes fine. Thus, I don't consider it invalid.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see.

The old question:
does the lrc define those kinds of case?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really 🥲
However, the approach I chose seems logical to me:
<00:51.00><01:29.99><01:48.29><02:31.00><02:41.99>You gotta fight !
There are four segments that have a time defined, but no content. Thus, they don't need to be extracted and will be skipped. The last segment isn't empty, and will have a start time of 02:41.99.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmmm...
I think should be OK now until someone have complain after.

- Correctly handle white space between word time tags
- Support parsing time tags that surround each non-blank segment
@andy840119 andy840119 merged commit ae8e5a1 into karaoke-dev:main Jun 23, 2025
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Breaking change This change might let parsing result not same as before. size/L

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Removing time tag leaves whitespace behind

2 participants