lib: refactor and restructure #8

Fang- · 2025-09-17T12:00:18Z

Rewrites most of the atom parsing and rendering logic to have parity with the hoon standard library. Restructures to split on render vs parse, rather than by aura. Expands unit test coverage.

For both rendering and parsing, we broadly rename and restructure the logic to follow functions from the hoon standard library, for more obvious behavioral parity and easier cross-referencing. We provide aliases for the common operations to avoid foisting hoon stdlib names on developers.

The main API change is that we have a single parse() and render() into which you pass the aura you want to operate on, rather than having a dedicated function per aura.

For parsing, we parse any string into a $coin, rather than specifying the aura to parse ahead of time. We disambiguate by the first character of the string, then fast-check validity using regex, and only then do the appropriate parsing step.

Includes bugfixes for @uw, @da and @q parsing and rendering, most of which relate to leading zeroes and zero values.

Does not yet include support for @if, @is ~~or @r*~~ or @dr auras. Not pressing, we don't use those anywhere in practice.

Draft because:

CI needs fixing.
We're awaiting feedback on the structure/API from client devs. (See deps: use new aura-js tloncorp/tlon-apps#5134.)
Some TODOs remain.
We may want to do a proper cleanup pass over the @p and @q modules, the only ones that have been largely untouched here.

Closes #6.

Reimplements parsing logic in src/parse.ts to match the parsing logic for atom literals used in the hoon standard library. Includes regexes for all supported atom formats for validity checking. Relies on existing aura parsing logic for "complex" auras, even if that logic isn't stdlib-compliant yet. (Notably, `@da` and `@q` fail some of their tests due to this.) Includes some tests. More thorough tests forthcoming.

Reimplements rendering logic in src/render.ts to match the rendering logic for atoms used in the hoon standard library. Includes some tests. `@q` and `@da` in particular continue to fail. More thorough testing forthcoming.

...by reimplementing the formatUw function locally. Somehow this is _slightly_ slower than formatUw(), but only by an amount measurable across hundreds of thousands of invocations.

Neither `@tas` nor `@ta` explicitly sanitize, instead rely on the atom already being "sane" for those auras. But in those cases, too, we need to take special care: string encoding in atoms is utf-8, whereas js strings are utf-16, so we must call a `TextDecoder`. For `@t` we do need to sanitize the string. We lift logic previously implemented in tloncorp/tlon-apps#3274 into the library, where it belongs.

We weren't properly extracting escaped byte-sequences in cords. Now we do. We add inverses for the utilities written into render.ts, and settle on a maximally-performant `bytesToBigint` that leverages Node.js's `Buffer` if it's available for ~10x speedup on larger amounts of bytes.

This brings it in line with the stdlib rendering, which ensures leading zeroes for the hours, minutes and seconds, prints those conditionally, and extracts & renders the sub-second precision accurately.

It should omit the leading byte if it's empty/zero.

`@c` is used to represent utf-32 codepoints. Its rendering is very similar to the url-safe `@t` encoding used in most places. This should have parity with the stdlib where it matters. There are some deranged cases that will not render accurately, but those shouldn't matter in practice. (Right?)

Support leading zeroes in timeslots, absence of time and/or subseconds, and negative years (Before Christ).

For %blob, because we don't want to depend on the full nockjs, we concede and say that %blobs in the aura-js contexts are the jams of nouns, instead of the nouns themselves. Nockjs should probably provide the "real" coin type and parser as a wrapper around aura-js's implementation.

They all follow a regular pattern. Constructing them through a function like this makes them more obvious.

We disable auras for which fuzzing doesn't make sense, or that are known-unsupported.

Still not quite up to parity with the stdlib parsing, but should support all sane cases.

Calling stringToCord() with an empty string will result in zero bytes going into bytesToBigint(), but that should just give you 0n.

Here, too, with the caveat that for full Noun support in the type you should refer to the nockjs wrapper.

We do this by making it stop doing "valid pat" checks, which are inaccurate for the `@q` format. Instead, we catch exceptions when they get thrown and return null for those cases. At some point, we should more thoroughly refactor/clean up the p and q modules, but for now we'll take a low-touch approach.

Export selectively, only exposing what we actually want outside callers to use. We continue exposing some odd utilities for the time being (preSig, deSig primarily) but plan to eventually remove those.

These had gotten borked due to index.ts' export changes.

We had made it more lenient in e20bae9 because its failures were getting caught by its callsites, but we should probably preserve the sane parts of its original behavior for now. We do remove one old test that was testing for non-stdlib-compliant behavior.

Really just a thin wrapper around slaw...

As far as I can tell, this check has never run in this iteration of the repo. Now that it is trying to run, it's failing for reasons of using pnpm instead of npm, or some such thing. We don't care to spend much time on it right now, so we rip the check out entirely. We still keep it around as a script and dev dependency. For the record, I did check the sizes from pre- and post-refactor. We have gone from ~4.7kb to ~6.6kb.

Adds support for floating-point auras, `@rs`, `@rd`, `@rh` and `@rq`. This addition is sizeable, because we must match the exact behavior of the hoon stdlib. Luckily, its behavior is standard, and we don't need all of the stdlib functionality. We can make do with js-based implementation of IEEE-754 floating point conversions. To make our up-front workload lighter, we copy other people's homework. For parsing, we copy and adapt `p.encodeFloat` from this (indeed rather hellish) implementation by Jonas Raoni Soares Silva: http://jsfromhell.com/classes/binary-parser For rendering, we port Ryan Juckett's Dragon4 implementation. Truly a sight for sore eyes! https://www.ryanjuckett.com/printing-floating-point-numbers/ All of the internals are fully generic to whatever bitsizes you desire. They haven't been tested with non-standard sizes though, and the library doesn't expose direct access to them. Includes tests for all happy-path cases. Remaining TODO for adding test cases where the parser needs to truncate them down to values that fit inside the floats.

These hadn't been given any treatment yet. The old library exposed a myriad of convertors for various number-string formats, check helpers, etc. Here we strip most of those out, leaving only core parsing and rendering logic in place, supplemented by simple validation logic and basic stdlib utilities for `@p`. For `@p` as well, we update its logic to work directly on the bigints. The restructuring, removal of redundant checks, and tweaks to the internals, give us a modest 10% performance improval on the combined `@p` and `@q` fuzz tests.

For coherence, we move some code around. Out of separate utility files and into the only files that import them. We rename some functions for consistency. We expose all the aura-specific utilities under objects named after their respective auras. We lightly improve doccomment consistency.

We no longer use this in our tests.

Integrate the new aura-js introduced by urbit/aura-js#8. We try to keep impact minimal by updating to its new api in-place as best we can. Notes some points for further review. Doesn't yet update the dependency pointer, because the relevant PR hasn't been merged and released yet.

Can now generate values of arbitrary bit widths. We do this to make sure we cover more varied inputs more consistently, and restrict fuzzing for certain auras to inputs that make sense for that aura. (For example, no inputs >32 bits for `@if`, because any additional bits would get truncated, breaking the round-tripping that we test for.) Additionally, we make it possible to fuzz floats by letting you specify a "transform" function for the generated test values. This way, we can ensure none of our inputs are `NaN` values, which would have their mantissa ("payload") bits dropped during rendering. Unfortunately, this does uncover an off-by-one in the parsing logic for floats. In lieu of fixing that, we keep the float fuzzing disabled for now...

Splits the yule() out of year() so this can reuse the same logic used for `@da`. Makes some bignum conversions eager in the process.

Now that both parsing and rendering are in, we can add them to fuzz tests.

It now deals with both `@da` and `@dr`, so should be named accordingly, similar to r.ts.

arthyn

still lgtm

UIP-135 expands both parsing and rendering of `@da` literals to support leading zeroes in the month and day segments. By accepting leading zeroes in the parser, we become compatible with backends both before and after the UIP-135 change. Eventually, we'll want to switch to the "fixed-width" date rendering as well, but that doesn't make sense for this library when those backend changes haven't shipped yet.

Fang- added 26 commits June 2, 2025 13:43

tests: add @n and @f parsing tests

d50fb46

render: begin stdlib-compliant render refactor

4ee5073

Reimplements rendering logic in src/render.ts to match the rendering logic for atoms used in the hoon standard library. Includes some tests. `@q` and `@da` in particular continue to fail. More thorough testing forthcoming.

render: fix rendering zero @uw

b2e7b53

...by reimplementing the formatUw function locally. Somehow this is _slightly_ slower than formatUw(), but only by an amount measurable across hundreds of thousands of invocations.

render: fix @da rendering

4a4a480

This brings it in line with the stdlib rendering, which ensures leading zeroes for the hours, minutes and seconds, prints those conditionally, and extracts & renders the sub-second precision accurately.

render: fix @q rendering

a956bba

It should omit the leading byte if it's empty/zero.

parse: improved da support

6478dd0

Support leading zeroes in timeslots, absence of time and/or subseconds, and negative years (Before Christ).

parse: refactor integer regexes

42feb58

They all follow a regular pattern. Constructing them through a function like this makes them more obvious.

render: fix signed atom rendering, add tests

5219efb

parse: add signed atom tests

1aadd88

tests: add fuzz tests

1e30add

We disable auras for which fuzzing doesn't make sense, or that are known-unsupported.

parse: better utf-32 codepoint parsing support

3d2cac0

Still not quite up to parity with the stdlib parsing, but should support all sane cases.

parse: no bytes should give 0n bigint

c1187c0

Calling stringToCord() with an empty string will result in zero bytes going into bytesToBigint(), but that should just give you 0n.

render: support %blob coins

679f435

Here, too, with the caveat that for full Noun support in the type you should refer to the nockjs wrapper.

types: consolidate

f0122b2

tests: consolidate static test atoms

b99b9f0

various: minor cleanup

d41c615

tests: additional parse failure tests

0c58221

parse: local uv/uw parsing implementation

71ce7fa

various: remove unused files and utilities

fb88a4c

lib: refine exports

c57e071

Export selectively, only exposing what we actually want outside callers to use. We continue exposing some odd utilities for the time being (preSig, deSig primarily) but plan to eventually remove those.

Fang- mentioned this pull request Sep 17, 2025

deps: use new aura-js tloncorp/tlon-apps#5134

Merged

6 tasks

Fang- added 3 commits October 3, 2025 23:22

parse: add tryParse alias for slaw

85fa061

tests: fix imports

7e517f6

These had gotten borked due to index.ts' export changes.

Fang- added 5 commits October 3, 2025 23:44

parse: add valid() syntax tester

8ef8cc1

Really just a thin wrapper around slaw...

parse: comment touch-ups

e360468

tests: fuzz test portability (for ci)

65364f9

arthyn approved these changes Oct 16, 2025

View reviewed changes

Fang- added 8 commits October 17, 2025 11:37

lib: remove jsverify dependency

88785ea

We no longer use this in our tests.

lib: update export shape, readme

6f26b87

parse: unify r-parsing counter

4837b43

tests: correct long @q rendering

a642f98

tests: note about @r aura fuzzing

61a4c06

lib: consistently note @uc as unsupported

7b8eac8

Fang- added 2 commits October 19, 2025 23:31

parse: support @if, @is

b07fc48

render: support @if, @is

77c4a12

Fang- marked this pull request as ready for review October 20, 2025 12:57

Fang- requested a review from arthyn October 20, 2025 12:57

Fang- force-pushed the m/re branch from a6aecd0 to 359d11c Compare October 20, 2025 20:05

Fang- added 4 commits October 25, 2025 22:28

parse: support @dr

a2d272c

Splits the yule() out of year() so this can reuse the same logic used for `@da`. Makes some bignum conversions eager in the process.

render: support @dr

f9c0acf

Now that both parsing and rendering are in, we can add them to fuzz tests.

lib: add some @dr utilities

c7a03a0

lib: rename da.ts to d.ts

79bfaaf

It now deals with both `@da` and `@dr`, so should be named accordingly, similar to r.ts.

arthyn approved these changes Oct 29, 2025

View reviewed changes

Fang- merged commit dd1335c into master Nov 18, 2025
6 checks passed

Fang- deleted the m/re branch November 18, 2025 16:14

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

lib: refactor and restructure #8

lib: refactor and restructure #8

Uh oh!

Fang- commented Sep 17, 2025 •

edited

Loading

Uh oh!

arthyn left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

lib: refactor and restructure #8

lib: refactor and restructure #8

Uh oh!

Conversation

Fang- commented Sep 17, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

arthyn left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Fang- commented Sep 17, 2025 •

edited

Loading