-
Notifications
You must be signed in to change notification settings - Fork 1
lib: refactor and restructure #8
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Reimplements parsing logic in src/parse.ts to match the parsing logic for atom literals used in the hoon standard library. Includes regexes for all supported atom formats for validity checking. Relies on existing aura parsing logic for "complex" auras, even if that logic isn't stdlib-compliant yet. (Notably, `@da` and `@q` fail some of their tests due to this.) Includes some tests. More thorough tests forthcoming.
Reimplements rendering logic in src/render.ts to match the rendering logic for atoms used in the hoon standard library. Includes some tests. `@q` and `@da` in particular continue to fail. More thorough testing forthcoming.
...by reimplementing the formatUw function locally. Somehow this is _slightly_ slower than formatUw(), but only by an amount measurable across hundreds of thousands of invocations.
Neither `@tas` nor `@ta` explicitly sanitize, instead rely on the atom already being "sane" for those auras. But in those cases, too, we need to take special care: string encoding in atoms is utf-8, whereas js strings are utf-16, so we must call a `TextDecoder`. For `@t` we do need to sanitize the string. We lift logic previously implemented in tloncorp/tlon-apps#3274 into the library, where it belongs.
We weren't properly extracting escaped byte-sequences in cords. Now we do. We add inverses for the utilities written into render.ts, and settle on a maximally-performant `bytesToBigint` that leverages Node.js's `Buffer` if it's available for ~10x speedup on larger amounts of bytes.
This brings it in line with the stdlib rendering, which ensures leading zeroes for the hours, minutes and seconds, prints those conditionally, and extracts & renders the sub-second precision accurately.
It should omit the leading byte if it's empty/zero.
`@c` is used to represent utf-32 codepoints. Its rendering is very similar to the url-safe `@t` encoding used in most places. This should have parity with the stdlib where it matters. There are some deranged cases that will not render accurately, but those shouldn't matter in practice. (Right?)
Support leading zeroes in timeslots, absence of time and/or subseconds, and negative years (Before Christ).
For %blob, because we don't want to depend on the full nockjs, we concede and say that %blobs in the aura-js contexts are the jams of nouns, instead of the nouns themselves. Nockjs should probably provide the "real" coin type and parser as a wrapper around aura-js's implementation.
They all follow a regular pattern. Constructing them through a function like this makes them more obvious.
We disable auras for which fuzzing doesn't make sense, or that are known-unsupported.
Still not quite up to parity with the stdlib parsing, but should support all sane cases.
Calling stringToCord() with an empty string will result in zero bytes going into bytesToBigint(), but that should just give you 0n.
Here, too, with the caveat that for full Noun support in the type you should refer to the nockjs wrapper.
We do this by making it stop doing "valid pat" checks, which are inaccurate for the `@q` format. Instead, we catch exceptions when they get thrown and return null for those cases. At some point, we should more thoroughly refactor/clean up the p and q modules, but for now we'll take a low-touch approach.
Export selectively, only exposing what we actually want outside callers to use. We continue exposing some odd utilities for the time being (preSig, deSig primarily) but plan to eventually remove those.
6 tasks
These had gotten borked due to index.ts' export changes.
We had made it more lenient in e20bae9 because its failures were getting caught by its callsites, but we should probably preserve the sane parts of its original behavior for now. We do remove one old test that was testing for non-stdlib-compliant behavior.
Really just a thin wrapper around slaw...
As far as I can tell, this check has never run in this iteration of the repo. Now that it is trying to run, it's failing for reasons of using pnpm instead of npm, or some such thing. We don't care to spend much time on it right now, so we rip the check out entirely. We still keep it around as a script and dev dependency. For the record, I did check the sizes from pre- and post-refactor. We have gone from ~4.7kb to ~6.6kb.
Adds support for floating-point auras, `@rs`, `@rd`, `@rh` and `@rq`. This addition is sizeable, because we must match the exact behavior of the hoon stdlib. Luckily, its behavior is standard, and we don't need all of the stdlib functionality. We can make do with js-based implementation of IEEE-754 floating point conversions. To make our up-front workload lighter, we copy other people's homework. For parsing, we copy and adapt `p.encodeFloat` from this (indeed rather hellish) implementation by Jonas Raoni Soares Silva: http://jsfromhell.com/classes/binary-parser For rendering, we port Ryan Juckett's Dragon4 implementation. Truly a sight for sore eyes! https://www.ryanjuckett.com/printing-floating-point-numbers/ All of the internals are fully generic to whatever bitsizes you desire. They haven't been tested with non-standard sizes though, and the library doesn't expose direct access to them. Includes tests for all happy-path cases. Remaining TODO for adding test cases where the parser needs to truncate them down to values that fit inside the floats.
arthyn
approved these changes
Oct 16, 2025
These hadn't been given any treatment yet. The old library exposed a myriad of convertors for various number-string formats, check helpers, etc. Here we strip most of those out, leaving only core parsing and rendering logic in place, supplemented by simple validation logic and basic stdlib utilities for `@p`. For `@p` as well, we update its logic to work directly on the bigints. The restructuring, removal of redundant checks, and tweaks to the internals, give us a modest 10% performance improval on the combined `@p` and `@q` fuzz tests.
For coherence, we move some code around. Out of separate utility files and into the only files that import them. We rename some functions for consistency. We expose all the aura-specific utilities under objects named after their respective auras. We lightly improve doccomment consistency.
We no longer use this in our tests.
Fang-
added a commit
to tloncorp/tlon-apps
that referenced
this pull request
Oct 17, 2025
Integrate the new aura-js introduced by urbit/aura-js#8. We try to keep impact minimal by updating to its new api in-place as best we can. Notes some points for further review. Doesn't yet update the dependency pointer, because the relevant PR hasn't been merged and released yet.
Can now generate values of arbitrary bit widths. We do this to make sure
we cover more varied inputs more consistently, and restrict fuzzing for
certain auras to inputs that make sense for that aura. (For example, no
inputs >32 bits for `@if`, because any additional bits would get
truncated, breaking the round-tripping that we test for.)
Additionally, we make it possible to fuzz floats by letting you specify
a "transform" function for the generated test values. This way, we can
ensure none of our inputs are `NaN` values, which would have their
mantissa ("payload") bits dropped during rendering.
Unfortunately, this does uncover an off-by-one in the parsing logic for
floats. In lieu of fixing that, we keep the float fuzzing disabled for
now...
Splits the yule() out of year() so this can reuse the same logic used for `@da`. Makes some bignum conversions eager in the process.
Now that both parsing and rendering are in, we can add them to fuzz tests.
It now deals with both `@da` and `@dr`, so should be named accordingly, similar to r.ts.
arthyn
approved these changes
Oct 29, 2025
Contributor
arthyn
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
still lgtm
UIP-135 expands both parsing and rendering of `@da` literals to support leading zeroes in the month and day segments. By accepting leading zeroes in the parser, we become compatible with backends both before and after the UIP-135 change. Eventually, we'll want to switch to the "fixed-width" date rendering as well, but that doesn't make sense for this library when those backend changes haven't shipped yet.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Rewrites most of the atom parsing and rendering logic to have parity with the hoon standard library. Restructures to split on render vs parse, rather than by aura. Expands unit test coverage.
For both rendering and parsing, we broadly rename and restructure the logic to follow functions from the hoon standard library, for more obvious behavioral parity and easier cross-referencing. We provide aliases for the common operations to avoid foisting hoon stdlib names on developers.
The main API change is that we have a single
parse()andrender()into which you pass the aura you want to operate on, rather than having a dedicated function per aura.For parsing, we parse any string into a
$coin, rather than specifying the aura to parse ahead of time. We disambiguate by the first character of the string, then fast-check validity using regex, and only then do the appropriate parsing step.Includes bugfixes for
@uw,@daand@qparsing and rendering, most of which relate to leading zeroes and zero values.Does not yet include support for
@if,@isoror@r*@drauras. Not pressing, we don't use those anywhere in practice.Draft because:
@pand@qmodules, the only ones that have been largely untouched here.Closes #6.