-
Notifications
You must be signed in to change notification settings - Fork 18
feat(ototoy): add provider for ototoy.jp #172
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
I haven't looked closer yet, but for the DOM parser I had an eye on https://jsr.io/@b-fuze/deno-dom. It is TS/Deno first, almost dependency-free and should be more lightweight, but I haven't used it for anything so far. |
|
Yeah, I also saw |
|
Seems to work fine, only difference is it doesn't have specific element type definitions so some field accesses had to be replaced with |
|
I'm sorry for the delay, I had really hoped to review the provider this week, but it will probably take me until next week. |
|
All good, take your time :) |
kellnerd
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for doing the first steps with DOM parsing in Harmony!
I was worried that it might be very slow, but in my few measurements (using performance.now()) it turned out that fetching the HTML is an order of magnitude slower than parsing and scraping it.
(Example for the Beatles Anthology test case: 323ms parsing / 375ms parsing+scraping of 4751ms total provider time. Smaller releases were more like 100ms of 1500ms.)
kellnerd
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One last thing and we are good to go.
providers/Ototoy/mod.ts
Outdated
|
|
||
| releaseElements.forEach((el) => { | ||
| const text = el.textContent.trim(); | ||
| console.log(text); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch, I actually had that there cause I realized that every release used in the tests (except for The Beatles - Anthology Collection) only has an original release date. I wonder if that's the most common pattern.
I used jsdom for the scraping, but can switch to something else if you have a preferred library. I'm not sure what the best option is.
Unfortunately, there's no:
Also, there are technically URLs for tracks. For example the track for https://ototoy.jp/_/default/p/3016055 is available at https://ototoy.jp/opus/index.php/19511684, but it's just a redirect back to the album page, so I didn't figure it'd be worth adding.
closes #34