Skip to content

Conversation

@flakey5
Copy link
Member

@flakey5 flakey5 commented Mar 6, 2025

Switches over to using the new doc generation tooling. For more background on this, please see #52343

Currently a draft just to get feedback on the approach to this integration.

cc @nodejs/web-infra


Notable Change info (by @avivkeller):

The Node.js Website and Website Infrastructure teams have introduced a brand-new documentation pipeline, modernizing how our API docs are generated. While the documentation site may look familiar today, this infrastructure we are hard at work making a completely refreshed user interface in the very near future!

@nodejs-github-bot
Copy link
Collaborator

Review requested:

  • @nodejs/nodejs-website
  • @nodejs/web-infra

@nodejs-github-bot nodejs-github-bot added build Issues and PRs related to build files or the CI. doc Issues and PRs related to the documentations. needs-ci PRs that need a full CI run. tools Issues and PRs related to the tools directory. windows Issues and PRs related to the Windows platform. labels Mar 6, 2025
@flakey5 flakey5 marked this pull request as draft March 6, 2025 06:24
@flakey5 flakey5 force-pushed the flakey5/20250305/api-docs-tooling branch from 77ede22 to 3423c21 Compare March 6, 2025 06:29
@flakey5 flakey5 force-pushed the flakey5/20250305/api-docs-tooling branch from 3423c21 to 451f8a7 Compare March 6, 2025 06:31
ovflowd

This comment was marked as outdated.

@flakey5 flakey5 force-pushed the flakey5/20250305/api-docs-tooling branch 3 times, most recently from cf2609b to a3ce99d Compare March 10, 2025 22:04
@flakey5 flakey5 marked this pull request as ready for review March 10, 2025 22:05
@flakey5

This comment was marked as resolved.

@flakey5

This comment was marked as resolved.

@ovflowd

This comment was marked as resolved.

@codecov
Copy link

codecov bot commented Mar 10, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 89.78%. Comparing base (1ad04e2) to head (3b3a9e2).
⚠️ Report is 75 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #57343      +/-   ##
==========================================
- Coverage   89.80%   89.78%   -0.03%     
==========================================
  Files         672      672              
  Lines      203907   203755     -152     
  Branches    39203    39167      -36     
==========================================
- Hits       183121   182940     -181     
- Misses      13113    13128      +15     
- Partials     7673     7687      +14     

see 38 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@araujogui

This comment was marked as resolved.

@araujogui

This comment was marked as resolved.

@ovflowd

This comment was marked as resolved.

Copy link
Member

@ovflowd ovflowd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the result of many months of arduous work between many awesome folks, including @flakey5 @AugustinMauroy @araujogui @ovflowd @avivkeller and others.

I'm so proud of what we are achieving here and this is a huge step towards a modern tooling and a revamped API docs within Node.js

Approving, as I believe this is ready!

@ovflowd

This comment was marked as resolved.

Copy link
Member

@lpinca lpinca left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RSLGTM because it is hard to review and outside of my comfort zone.

@ovflowd
Copy link
Member

ovflowd commented Jan 28, 2026

@ovflowd do you want to join? We're still in the meeting, in case you're available right now

I unfortunately gotta go to gym now 😅 -- I don't want to make a big fuss over this, just want to properly be able to have my concerns heard. I was able to explain my points to James, IRL (he's in Germany right now) and he completely understood them. So I feel quorom is needed. In the end of the day, if the TSC leans over "agreement with Antoine", I completely understand that and respect the decision 🙇

@avivkeller
Copy link
Member

I've added this to the Web Team agenda as well, so that we may discuss our stances on the matter more collectively.

@jasnell
Copy link
Member

jasnell commented Jan 29, 2026

I also had to miss the call as I'm in Munich for a company meeting. I also object to the decision and believe it needs more consideration and discussion with members of the web infra team present to offer counter arguments and discussion. At the very least @ovflowd and @flakey5 should have been explicitly invited to participate in this part of the discussion. I'm going to ask that we revisit this in the next TSC call that is in a timezone we can attend. /CC @nodejs/TSC

@mcollina
Copy link
Member

I also find it personally oblivious that the TSC meeting happens without any representation from the Web Infra side, which makes this a heavily one-sided argument.

To clarify: there was no question asked to the TSC in this issue, and there was a bit of context provided by @aduh95. There is still no question.

I was able to explain my points to James, IRL (he's in Germany right now) and he completely understood them.

I’m glad you did! But still leaves most of the @nodejs/tsc in the dark.

As this was presented:

  1. there is a breaking change in the output JSON format
  2. users are relying on that format for automation

The most natural thing to ask is if it could be made non-breaking. Land as semver-major if not. Given this id an undocumented feature of Node.js, you can ask the TSC to decide if this is covered by the semversiness guarantees.

Can you please formalize that argument?

@panva
Copy link
Member

panva commented Jan 29, 2026

I am not under the impression that these are breaking changes since this deals with documentation, not runtime. We change documentation all the time - we reorder html elements, add paragraphs, remove nodes, add change entries, add entire new pages, you name it. Why would the api.json need to be treated differently, just because it may be consumed programmatically?

As @mcollina said, there was no ask to the TSC presented here, a label was added without one, there's a blocking review from @aduh95 that the authors have yet to engage with. The actual changes are not clear to me but if the test/doctool/test-doc-api-json.mjs diff is any indication of those changes they are not out of line with how I expect documentation to be treated.

@aduh95
Copy link
Contributor

aduh95 commented Jan 29, 2026

Why would the api.json need to be treated differently, just because it may be consumed programmatically?

Yes exactly, and also such change should be introduced separately, so collaborators can review whether it is indeed an improvement, and whether it has potential to break the ecosystem. I would add that the current state makes it needlessly hard to review because the generated JSON is not human readable. In #57343 (comment), @ovflowd claimed the JSON structure would be a 1:1 match, so I don't think it's an unrealistic expectation to start with not breaking the existing tests (albeit, test-doc-api-json.mjs was added long after this PR was opened, with the explicit goal of detecting potential breaking changes).

FWIW I really think that we should make this PR backportable, otherwise we would make our lifes harder if we have to maintain two very different toolings in LTS branches vs main.

@ovflowd
Copy link
Member

ovflowd commented Jan 29, 2026

I’m glad you did! But still leaves most of the @nodejs/tsc in the dark.

Yeah, my bad, that's on us. I just assumed that we would have the call together and explain all things there, this was clearly miscommunication from our side, for that, Im sorry!

@ovflowd
Copy link
Member

ovflowd commented Jan 29, 2026

FWIW I really think that we should make this PR backportable, otherwise we would make our lifes harder if we have to maintain two very different toolings in LTS branches vs main.

I think there is merit for you to explain what exactly is not backported right now or breaking right now or actually provide information on why these test failures are actually breaking changes on a test suite that was introduced a few days ago for algo that was never covered in tests. Sure, we agreed that adding tests on nodejs/node was a pre-requisite to land this PR, but I never imagined it would also be used as a forceful baseline, those tests are not absolute and their failures don't tell the whole story 😅

@ovflowd
Copy link
Member

ovflowd commented Jan 29, 2026

To be fair, not trying to argue Antoine for the sake of it, but I believe we need to properly present our points (both sides, and back our claims), and how we would address these issues. Because what is being asked from Antoine is that we change the existing legacy tooling so that its JSON output becomes 1:1 of doc-kit... And then we can replace it to doc-kit and that just doesn't make sense to me.

If the concern is the fact that the output is not 1:1 character by character would be an issue then regardless of it being doc-kit or not.

Plus the amount of effort to update the legacy tooling which has very legacy code, has the risks of breaking even more things that are probably not covered. It is just a high risk, low reward and high effort ask... just to then be immediately replaced by doc-kit.

@aduh95
Copy link
Contributor

aduh95 commented Jan 29, 2026

@ovflowd if you want to make a list of the breaking changes introduced in this PR and the justification for each, reviewers would have the needed context for assessing if it's OK to land and to backport. IMO it would be more productive to try to remove those breaking changes first (you claim it's hard to do, that doesn't reflect well on the flexibility of doc-kit, but maybe I'm missing some context), as there would be no need to assess the potential breakage.

@avivkeller
Copy link
Member

Hey everyone! I spent the time a few weeks ago and went through every git diff change of the documentation JSON, and every change that I noted was expected (on my end).

While I understand that isn't much reassurance, I'll gladly spend more time documenting each breaking change to give y'all the "needed context" of why the tests needed to be changed.

There is still no question.

That's my fault. My understanding was that if reviewers and PR authors could not reach an agreement, TSC escalation was required, but I should've formally explained the concerns.

The argument (on behalf of the web-infra side of this PR) for the TSC is: To what extent does this change need to be 1:1 with the old documentation? Currently, the structure is mostly the same, but several bugs in the old documentation RegExes and parses were squashed. Thus, the JSON is a more accurate representation of the markdown, but breaks the tests for being slightly different.

you claim it's hard to do, that doesn't reflect well on the flexibility of doc-kit, but maybe I'm missing some context

The reason why it's hard for us to make this exactly 1:1, is because it requires reverting those bug fixes, which would then break all the other generators we've spent many months working on. (i.e. making the JSON 1:1 will make the HTML not 1:1, since it relies on those bugs being non existent)

If this PR must be semver-major, I'm perfectly fine with that. Yes, it would require maintaining the legacy tooling for a few more months, but if it would resolve all the concerns, I think it's a step in the right direction.

list of the breaking changes

I'm not in front of an API git diff at the moment, so this list is definitely not exhaustive, but off the top of my head, breaking changes (which are mostly bug fixes) include:

  • More accurate sub-header splitting (hence, the part of the test that checks synopsis.md changed)
  • optional status identification (hence, the part of the test that checks keys needed to add optional)
  • the new api top-level property (hence, the part of the test that checks keys needed to add api)
  • more accurate identification of stability alert boxes (hence, the part of the test that identifies deprecated changes needed to increase the expected count)
  • JSON is minified (this is easy to revert; hence, the newline check was removed)

@ovflowd

This comment has been minimized.

@ovflowd

This comment has been minimized.

@jasnell
Copy link
Member

jasnell commented Jan 29, 2026

Let's take a break on the conversation here and take a breath. We'll surface the conversation again on the TSC agenda and we'll make sure that the web infra team is represented in that conversation this time.

@jasnell
Copy link
Member

jasnell commented Jan 29, 2026

Can you please formalize that argument?

Let's do this in the TSC call. I think it's best for this PR discussion to take a cool off period. Folks are getting too frustrated.

@joyeecheung
Copy link
Member

joyeecheung commented Jan 29, 2026

My 2 cents: it seems @ovflowd and @aduh95 agree that #57343 (comment) is a path forward. In that case I'd say the work can continue on GitHub. The TSC meeting is not set up for making decisions, it's mostly a venue for FYI to TSC members because it's otherwise hard to get everyone's attention or test water temperatures quickly. (Case in point - I was not in the last meeting and I generally do not prioritize the TSC meeting in my schedule. I either join the meeting for that FYI when I have some free time, or check the agenda for FYI which serves equally for what I think the TSC meeting serves. I think I am not the only one that treats the TSC meeting this way given the low turnout of the TSC meetings in general).

If we can agree on "we don't necessarily need the JSON output to be the same, just that changes to behaviors verified by tests should be minimized and if not, some detailed explanation about what's being changed here is warranted. The doc-kit contributors are looking into either minimizing or explaining the test changes.", then this one paragraph summary may be enough for the next TSC meeting.

@avivkeller
Copy link
Member

avivkeller commented Jan 29, 2026

const allExpectedKeys = new Set([
   'added',
+  'api',
   'changes',
   'classes',
   'classMethods',
...
   'miscs',
   'modules',
   'name',
+  'optional',
   'napiVersion',
   'options',
   'params',

The new tooling adds an api top-level property, identical to the name of the file, i.e. fs or async_hooks. This property, while added, does not change the otherwise older structure of the JSON.

The new tooling also adds an optional property, denoting whether a property is optional (i.e. many options properties), such as filehandle.createWriteStream([options]).


-  assert.strictEqual(fileContent.split('\n', 3).length, 3);

The new tooling minifies JSON.


   if (dirent.name !== 'index.md') {
     assert.ok(json.introduced_in || Object.values(json).at(-1)?.[0].introduced_in);
+    assert.partialDeepStrictEqual(allExpectedKeys, findAllKeys(json));
   }
-  assert.partialDeepStrictEqual(allExpectedKeys, findAllKeys(json));

The new tooling generates a complete JSON structure for index.md (i.e. https://nodejs-api-docs-tooling.vercel.app/index.json). The old tooling generates an almost empty JSON. While this new structure is valid, it does not satisfy the requirements in findAllKeys, as this markdown file isn't really a module API docs (and as such, has blank properties [i.e. textRaw] which would be filled in a typical document).


-  assert.deepStrictEqual(Object.keys(json), ['type', 'source', ...({
+
+  assert.deepStrictEqual(Object.keys(json).sort(), ['type', 'api', 'source', ...({
     'addons.md': ['introduced_in', 'miscs'],
     'cli.md': ['introduced_in', 'miscs'],
     'debugger.md': ['introduced_in', 'stability', 'stabilityText', 'miscs'],
...
     'errors.md': ['introduced_in', 'classes', 'miscs'],
     'esm.md': ['introduced_in', 'meta', 'stability', 'stabilityText', 'properties', 'miscs'],
     'globals.md': ['introduced_in', 'stability', 'stabilityText', 'classes', 'methods', 'miscs'],
-    'index.md': [],
     'intl.md': ['introduced_in', 'miscs'],
     'n-api.md': ['introduced_in', 'stability', 'stabilityText', 'miscs'],
     'packages.md': ['introduced_in', 'meta', 'miscs'],
     'process.md': ['globals'],
+    'synopsis.md': ['introduced_in', 'miscs'],
     'report.md': ['introduced_in', 'stability', 'stabilityText', 'meta', 'miscs'],
-  }[dirent.name] ?? ['modules'])]);
+  }[dirent.name] ?? ['modules'])].sort());
 }

See aforementioned comment on why index.md has the modules section.

For synopsis.md, the new tooling places "Example" in a separate entry than "Usage and Example", and thus, has additional properties.


-assert.strictEqual(numberOfDeprecatedSections, 39); // Increase this number every time a new API is deprecated.
+assert.strictEqual(numberOfDeprecatedSections, 44); // Increase this number every time a new API is deprecated.

Finally, the number of deprecated sections. The new tooling correctly identifies many methods that were previously ignored by the legacy tooling, including deprecated ones (shown below):

{
  "name": "Buffer",
  "textRaw": "`new Buffer(array)`"
}
{
  "name": "Buffer",
  "textRaw": "`new Buffer(arrayBuffer[, byteOffset[, length]])`"
}
{
  "name": "Buffer",
  "textRaw": "`new Buffer(buffer)`"
}
{
  "name": "Buffer",
  "textRaw": "`new Buffer(size)`"
}
{
  "name": "Buffer",
  "textRaw": "`new Buffer(string[, encoding])`"
}

@ovflowd
Copy link
Member

ovflowd commented Jan 31, 2026

Hey @aduh95 — quick ping: are your concerns about the JSON changes addressed with the current iteration of this PR, or is there anything still missing that we should adjust?

@aduh95
Copy link
Contributor

aduh95 commented Jan 31, 2026

Thanks @avivkeller for the explainer! IIUC, it should be relatively easy to disable:

  • the additional keys (api, optional) (we can filter them out)
  • the JSON minification
  • the index.json changes (the current version can be hard-coded for now)

Let's take those to dedicated PRs.

It's a bit problematic that we have to use two .sort() calls because the test no longer guarantees the order is consistent, can this be reverted?
The change for numberOfDeprecatedSections sounds reasonable.

@avivkeller
Copy link
Member

  • the additional keys (api, optional) (we can filter them out)

This isn't 1:1, sure, but it's certainly not a breaking change. A change that adds a new feature (or in this case, JSON key) shouldn't be considered breaking. The optional status of certain keys helps generate other output formats (i.e. the new generator creates a signature which relies on knowing whether a key is optional), and the api output is used internally to know where to output (${api}.json).

I disagree on the requirement that this change be reverted, it's not problematic (as I see it).

  • the JSON minification

Resolved. cc @ovflowd @flakey5 to update this PR. In a follow-up, we would likely add a --minify / --no-minify flag to our tooling (maybe).

  • the index.json changes (the current version can be hard-coded for now)

I don't think this matters, for a similar reason mentioned in my response to your first bullet. The additional modules key of this JSON isn't a breaking change. Users not parsing index.json because it empty will continue to not parse it, will they not?

It's a bit problematic that we have to use two .sort() calls because the test no longer guarantees the order is consistent, can this be reverted?

Does the consistency of the JSON order really matter? Per the JSON specification, "[a]n object is an unordered set of name/value pairs". Thus, the order shouldn't matter.

@aduh95
Copy link
Contributor

aduh95 commented Feb 3, 2026

  • the additional keys (api, optional) (we can filter them out)

This isn't 1:1, sure, but it's certainly not a breaking change. A change that adds a new feature (or in this case, JSON key) shouldn't be considered breaking. The optional status of certain keys helps generate other output formats (i.e. the new generator creates a signature which relies on knowing whether a key is optional), and the api output is used internally to know where to output (${api}.json).

I disagree on the requirement that this change be reverted, it's not problematic (as I see it).

It's not about the change being breaking, it's about giving Node.js collaborators the opportunity to weigh in on it – like we do for any other change. It also makes it easier for backporting ro reverting if need be.

Does the consistency of the JSON order really matter? Per the JSON specification, "[a]n object is an unordered set of name/value pairs". Thus, the order shouldn't matter.

Tell that to V8, clearly it does matter otherwise you wouldn't have needed to add that .sort()

@ovflowd
Copy link
Member

ovflowd commented Feb 3, 2026

It also makes it easier for backporting ro reverting if need be.

With the fear of being pedantic here, what would really get backported here? There's nothing to be backported because these changes aren't breaking changes, both outputs from the old tooling and new tooling are completely compatible for anything consuming it.

Edit: What I meant here is that, there's no effort of actual backporting needed here as these changes are supposed to be backwards compatible. So these changes are safe to be backported without requiring outside/external or internal patches.

Tell that to V8, clearly it does matter otherwise you wouldn't have needed to add that .sort()

It matters because you structured you test to grab the keys in said order. That is an array, array are ordered. JSON objects aren't ordered. JavaScript objects "are", but we're talking about JSON here, not JavaScript.

@avivkeller avivkeller added the notable-change PRs with changes that should be highlighted in changelogs. label Feb 3, 2026
@github-actions
Copy link
Contributor

github-actions bot commented Feb 3, 2026

The notable-change PRs with changes that should be highlighted in changelogs. label has been added by @avivkeller.

Please suggest a text for the release notes if you'd like to include a more detailed summary, then proceed to update the PR description with the text or a link to the notable change suggested text comment. Otherwise, the commit will be placed in the Other Notable Changes section.

@avivkeller
Copy link
Member

(I've added notable-change PRs with changes that should be highlighted in changelogs. , cause it'd be cool to highlight this effort)

@jasnell
Copy link
Member

jasnell commented Feb 3, 2026

I've started working through the bits and pieces here and the conversation, working on trying to identify a compromise position that can allow this to proceed forward.

Here's where I'm landing currently but it might evolve a bit as I dig in further:

  • Let's land this as a semver-major but modify it so that it is only used for new releases on current and new release lines. Existing release lines (e.g. 22.x, 24.x) would not initially use the updated format
  • I agree with @avivkeller @ovflowd and @flakey5's assessments that the addition of the new fields should not be considered breaking... as acknowledged they are additive and are something that can be filtered out. I'm currently not seeing a justification for why those should be deferred to a separate PR
  • I agree the JSON minification should be optional / behind a toggle

I get there are nuances in this that I probably don't yet fully have my head around so I'm continuing to catch up. Will comment further once I feel confident that I'm adequately up to speed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

build Issues and PRs related to build files or the CI. commit-queue-squash Add this label to instruct the Commit Queue to squash all the PR commits into the first one. doc Issues and PRs related to the documentations. needs-ci PRs that need a full CI run. notable-change PRs with changes that should be highlighted in changelogs. tools Issues and PRs related to the tools directory. tsc-agenda Issues and PRs to discuss during the meetings of the TSC. web-agenda

Projects

None yet

Development

Successfully merging this pull request may close these issues.