Skip to content

Conversation

@Fang-
Copy link
Contributor

@Fang- Fang- commented Sep 17, 2025

Rewrites most of the atom parsing and rendering logic to have parity with the hoon standard library. Restructures to split on render vs parse, rather than by aura. Expands unit test coverage.


For both rendering and parsing, we broadly rename and restructure the logic to follow functions from the hoon standard library, for more obvious behavioral parity and easier cross-referencing. We provide aliases for the common operations to avoid foisting hoon stdlib names on developers.

The main API change is that we have a single parse() and render() into which you pass the aura you want to operate on, rather than having a dedicated function per aura.

For parsing, we parse any string into a $coin, rather than specifying the aura to parse ahead of time. We disambiguate by the first character of the string, then fast-check validity using regex, and only then do the appropriate parsing step.

Includes bugfixes for @uw, @da and @q parsing and rendering, most of which relate to leading zeroes and zero values.

Does not yet include support for @if, @is or @r* or @dr auras. Not pressing, we don't use those anywhere in practice.

Draft because:

  • CI needs fixing.
  • We're awaiting feedback on the structure/API from client devs. (See deps: use new aura-js tloncorp/tlon-apps#5134.)
  • Some TODOs remain.
  • We may want to do a proper cleanup pass over the @p and @q modules, the only ones that have been largely untouched here.

Closes #6.

Fang- added 26 commits June 2, 2025 13:43
Reimplements parsing logic in src/parse.ts to match the parsing logic
for atom literals used in the hoon standard library.

Includes regexes for all supported atom formats for validity checking.
Relies on existing aura parsing logic for "complex" auras, even if that
logic isn't stdlib-compliant yet. (Notably, `@da` and `@q` fail some of
their tests due to this.)

Includes some tests. More thorough tests forthcoming.
Reimplements rendering logic in src/render.ts to match the rendering
logic for atoms used in the hoon standard library.

Includes some tests. `@q` and `@da` in particular continue to fail. More
thorough testing forthcoming.
...by reimplementing the formatUw function locally. Somehow this is
_slightly_ slower than formatUw(), but only by an amount measurable
across hundreds of thousands of invocations.
Neither `@tas` nor `@ta` explicitly sanitize, instead rely on the atom
already being "sane" for those auras. But in those cases, too, we need
to take special care: string encoding in atoms is utf-8, whereas js
strings are utf-16, so we must call a `TextDecoder`.

For `@t` we do need to sanitize the string. We lift logic previously
implemented in tloncorp/tlon-apps#3274 into the library, where it
belongs.
We weren't properly extracting escaped byte-sequences in cords. Now we
do. We add inverses for the utilities written into render.ts, and settle
on a maximally-performant `bytesToBigint` that leverages Node.js's
`Buffer` if it's available for ~10x speedup on larger amounts of bytes.
This brings it in line with the stdlib rendering, which ensures leading
zeroes for the hours, minutes and seconds, prints those conditionally,
and extracts & renders the sub-second precision accurately.
It should omit the leading byte if it's empty/zero.
`@c` is used to represent utf-32 codepoints. Its rendering is very
similar to the url-safe `@t` encoding used in most places.

This should have parity with the stdlib where it matters. There are some
deranged cases that will not render accurately, but those shouldn't
matter in practice. (Right?)
Support leading zeroes in timeslots, absence of time and/or subseconds,
and negative years (Before Christ).
For %blob, because we don't want to depend on the full nockjs, we
concede and say that %blobs in the aura-js contexts are the jams of
nouns, instead of the nouns themselves. Nockjs should probably provide
the "real" coin type and parser as a wrapper around aura-js's
implementation.
They all follow a regular pattern. Constructing them through a function
like this makes them more obvious.
We disable auras for which fuzzing doesn't make sense, or that are
known-unsupported.
Still not quite up to parity with the stdlib parsing, but should support
all sane cases.
Calling stringToCord() with an empty string will result in zero bytes
going into bytesToBigint(), but that should just give you 0n.
Here, too, with the caveat that for full Noun support in the type you
should refer to the nockjs wrapper.
We do this by making it stop doing "valid pat" checks, which are
inaccurate for the `@q` format. Instead, we catch exceptions when they
get thrown and return null for those cases.

At some point, we should more thoroughly refactor/clean up the p and q
modules, but for now we'll take a low-touch approach.
Export selectively, only exposing what we actually want outside callers
to use.

We continue exposing some odd utilities for the time being (preSig,
deSig primarily) but plan to eventually remove those.
Fang- added 3 commits October 3, 2025 23:22
These had gotten borked due to index.ts' export changes.
We had made it more lenient in e20bae9 because its failures were getting
caught by its callsites, but we should probably preserve the sane parts
of its original behavior for now.

We do remove one old test that was testing for non-stdlib-compliant
behavior.
Fang- added 5 commits October 3, 2025 23:44
Really just a thin wrapper around slaw...
As far as I can tell, this check has never run in this iteration of the
repo. Now that it is trying to run, it's failing for reasons of using
pnpm instead of npm, or some such thing. We don't care to spend much
time on it right now, so we rip the check out entirely.

We still keep it around as a script and dev dependency. For the record,
I did check the sizes from pre- and post-refactor. We have gone from
~4.7kb to ~6.6kb.
Adds support for floating-point auras, `@rs`, `@rd`, `@rh` and `@rq`.
This addition is sizeable, because we must match the exact behavior of
the hoon stdlib. Luckily, its behavior is standard, and we don't need
all of the stdlib functionality. We can make do with js-based
implementation of IEEE-754 floating point conversions.

To make our up-front workload lighter, we copy other people's homework.

For parsing, we copy and adapt `p.encodeFloat` from this (indeed rather
hellish) implementation by Jonas Raoni Soares Silva:
http://jsfromhell.com/classes/binary-parser

For rendering, we port Ryan Juckett's Dragon4 implementation. Truly a
sight for sore eyes!
https://www.ryanjuckett.com/printing-floating-point-numbers/

All of the internals are fully generic to whatever bitsizes you desire.
They haven't been tested with non-standard sizes though, and the library
doesn't expose direct access to them.

Includes tests for all happy-path cases. Remaining TODO for adding test
cases where the parser needs to truncate them down to values that fit
inside the floats.
Fang- added 8 commits October 17, 2025 11:37
These hadn't been given any treatment yet. The old library exposed a
myriad of convertors for various number-string formats, check helpers,
etc.

Here we strip most of those out, leaving only core parsing and rendering
logic in place, supplemented by simple validation logic and basic stdlib
utilities for `@p`.

For `@p` as well, we update its logic to work directly on the bigints.

The restructuring, removal of redundant checks, and tweaks to the
internals, give us a modest 10% performance improval on the combined
`@p` and `@q` fuzz tests.
For coherence, we move some code around. Out of separate utility files
and into the only files that import them.

We rename some functions for consistency.

We expose all the aura-specific utilities under objects named after
their respective auras.

We lightly improve doccomment consistency.
We no longer use this in our tests.
Fang- added a commit to tloncorp/tlon-apps that referenced this pull request Oct 17, 2025
Integrate the new aura-js introduced by urbit/aura-js#8. We try to keep
impact minimal by updating to its new api in-place as best we can. Notes
some points for further review.

Doesn't yet update the dependency pointer, because the relevant PR
hasn't been merged and released yet.
@Fang- Fang- marked this pull request as ready for review October 20, 2025 12:57
@Fang- Fang- requested a review from arthyn October 20, 2025 12:57
Can now generate values of arbitrary bit widths. We do this to make sure
we cover more varied inputs more consistently, and restrict fuzzing for
certain auras to inputs that make sense for that aura. (For example, no
inputs >32 bits for `@if`, because any additional bits would get
truncated, breaking the round-tripping that we test for.)

Additionally, we make it possible to fuzz floats by letting you specify
a "transform" function for the generated test values. This way, we can
ensure none of our inputs are `NaN` values, which would have their
mantissa ("payload") bits dropped during rendering.

Unfortunately, this does uncover an off-by-one in the parsing logic for
floats. In lieu of fixing that, we keep the float fuzzing disabled for
now...
Fang- added 4 commits October 25, 2025 22:28
Splits the yule() out of year() so this can reuse the same logic used
for `@da`. Makes some bignum conversions eager in the process.
Now that both parsing and rendering are in, we can add them to fuzz
tests.
It now deals with both `@da` and `@dr`, so should be named accordingly,
similar to r.ts.
Copy link
Contributor

@arthyn arthyn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

still lgtm

UIP-135 expands both parsing and rendering of `@da` literals to support
leading zeroes in the month and day segments.

By accepting leading zeroes in the parser, we become compatible with
backends both before and after the UIP-135 change. Eventually, we'll
want to switch to the "fixed-width" date rendering as well, but that
doesn't make sense for this library when those backend changes haven't
shipped yet.
@Fang- Fang- merged commit dd1335c into master Nov 18, 2025
6 checks passed
@Fang- Fang- deleted the m/re branch November 18, 2025 16:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants