Skip to content

Conversation

@naugtur
Copy link
Member

@naugtur naugtur commented Dec 17, 2025

Description

With compartment-mapper introducing support for dynamic imports, the old regexp-based ways in which we evaded import censorship in SES are no longer viable and AST based transform can be the only meaning-preserving one.

Open questions

  • what about import ( - with whitespace? We could add complexity to the transform or assume they're rare enough and let them hit censorship.
    • will implement as split by regexp unless somebody stops me
  • Should it be opt-in or opt-out?
    • opt-out by unanimous opinion from people present in Endo meeting
    • implement
  • Does a step in the process between compartment-mapper and SES exist where the supported dynamic import call has been changed to its implementation and the only occurences of import( that are left need not be preserved? Could we use a simple regexp-replace there for a cheaper way to get the same result?
    • not a welcome option to create a potential for invalid programs in ModuleSource constructor
  • Do we even support eval('var a = import("somemodule")') or intend to ?

Security Considerations

  • Could the new transform be targeted to unlock malicious meaning in an otherwise harmless code? (it's meaning-preserving and AST based, so it seems unlikely)

Scaling Considerations

AST transform is slower than regexp, but for cases where import( doesn't exist in a string, it could be faster than a regexp search o the entire string.

Documentation Considerations

  • needs documenting after we decide opt-in or out
  • naming things is hard

Testing Considerations

  • added a testcase.
  • once we figure out opt-in/out a more end2end test in compartment-mapper would be nice.
  • more template string tests could help, the transform for them is more complex than I'd like, with room for off-by-one errors.

Compatibility Considerations

with the transform being meaning-preserving, this should not be breaking for anyone. Also, considering opt-in

@@ -0,0 +1,171 @@
const evadeRegexp = /import\s*\(|<!--|-->/g;
Copy link
Member Author

@naugtur naugtur Dec 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note this now provides multiple evasions in one go

@naugtur naugtur marked this pull request as ready for review December 18, 2025 14:08
Comment on lines +185 to +193
const multilineimport =\`␊
console.log(${a})␊
await im${""}port('some-module');␊
console.log(${b})␊
\`;␊
const taggedtemplate = String.dedent\`␊
await import('some-module');␊
\`;␊
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is there this difference between tagged vs. untagged template literal contents?

And as an aside, I think this test would be better if the input and expected output appeared in the same file (i.e., not using t.snapshot). It's valuable for high volume, but this use seems to come down on the wrong side of the tradeoff.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because the tag function would see an extra split and empty string value in it. Where without a tag the effects of having an extra hole filled with an empty string are unobservable.

Copy link
Member

@gibson042 gibson042 Dec 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know it's more code to get right, but if we're making this change, we should be thorough about it, e.g.:

        const taggedtemplate = makeEvadedTemplateApplier(String.dedent)`
            await ${IMPORT}('some-module');
        `;

where IMPORT is a unique global sentinel such as Symbol('import') and makeEvadedTemplateApplier is a global function like

const makeEvadedTemplateApplier =
  fn =>
  (T0, ...A0) => {
    const argCount = A0.length;
    const T1 = getTemplateMapping(T0) || [];
    const args = [T1];
    if (!T1.length) {
      const raw0 = T0.raw;
      const raw1 = [];
      for (let i = 0, j = 0; i <= argCount; ++i, ++j) {
        T1[j] = T0[i];
        raw1[j] = raw0[i];
        while (i < argCount && A0[i] === IMPORT) {
          i++;
          T1[j] += `import${T0[i]}`;
          raw1[j] += `import${raw0[i]}`;
        }
      }
      defineProperty(T1, 'raw', {
        ...{ writable: false, enumerable: false, configurable: false },
        value: freeze(raw1),
      });
      setTemplateMapping(T0, freeze(T1));
    }
    for (let i = 0, j = 0; i < argCount; ++i) {
      if (A0[i] !== IMPORT) args[++j] = A0[i];
    }
    return apply(fn, undefined, args);
  };

(which AFAIK is unobservable, until/unless https://github.com/tc39/proposal-array-is-template-object advances to Stage 4)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We would have to put all that runtime in a single line of code, and it’s a lot. Might have to do some hard trade-off calculus. I’m in favor of making monotonic progress until we hit a cost cliff.

Copy link
Member

@gibson042 gibson042 Dec 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know to what extent minification is on the table for injected helpers like the above, but if we are comfortable with it then the cost need be no more than 482 bytes, and only necessary when the source text being transformed has one or more tagged template literals with static contents that would be affected.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's about the potential issues with sourcemaps or having to find a place to put it. Also, the level of complexity I'm not comfortable with introducing.
It's a complex fix for a very unlikely problem. And if a module contains a tagged template string with an import statement in it, it's likely problematic in many other ways.

HTML in tagged template strings is somewhat common, but as a virtual dom component, where HTML comments don't make sense.
I fail to find an example of tagged template strings containing hardcoded javascript with imports in it.

I'd like to avoid shipping a tagged template string transform for now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I sure would like to avoid that complexity as well.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tentatively volunteered to co-champion it. Feed me your concerns.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The primary concern is taking away the ability to locally virtualize tagged template literals, which is a steep price to pay for what seems to be negligible benefit.

@naugtur

This comment was marked as outdated.

boneskull

This comment was marked as resolved.

@naugtur naugtur force-pushed the naugtur/more-evasions branch from 8609251 to 6939389 Compare January 21, 2026 12:50
@naugtur naugtur requested review from gibson042 and mhofman January 21, 2026 12:56
Copy link
Member

@gibson042 gibson042 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR should update the evasive-transform README to mention that it also affects the representation of strings, regular expressions, and template literals (with caveats for last of those, particularly describing the gaps around cooked vs. raw and total disregard of tagged template literals).

/* import comment ...IMPORT('some-module');*/␊
const result = eval("...im"+"port('some-module'); await im"+"port(\\"other\\");");␊
const result2 = eval("...im"+"port('some-module'); await im"+"port(\\"other\\");");␊
const multilineimport =\`␊
Copy link
Member

@gibson042 gibson042 Jan 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's strange loss of the space between = and \` here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's just babel. Even in whitespace-preserving mode this space is at the end of a line in some meaning, so it's probably not represented anywhere.
Are we worried?

Also, impressive you noticed it

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not worried, just confused.

Comment on lines +197 to +215
/**
* Prevents `-->` from appearing in output by adding
* an empty block comment to force spacing.
*
* @param {import('@babel/traverse').NodePath} p
* @returns {void}
*/
export const evadeDecrementGreater = p => {
if (
p.node.type === 'BinaryExpression' &&
p.node.operator === '>' &&
p.node.left.type === 'UpdateExpression' &&
p.node.left.operator === '--' &&
!p.node.left.trailingComments?.length
) {
// Add an empty block comment to force a space between -- and >
p.node.left.trailingComments = [{ type: 'CommentBlock', value: '' }];
}
};
Copy link
Member

@gibson042 gibson042 Jan 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comments can be stripped by downstream tools, recreating the problem—$x-->$y[$x--][0]>$y or (0,$x--)>$y is better IMO (cf. #1837 (comment) ).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But evasive transforms are applied to code last by design, before feeding code to SES. I don't know of a usecase for evasive transforms where the resulting string gets modified again.

We could still switch to it.
The only argument against is it could impact optimizations/inlining by pushing code across a complexity threshold or not matching a pattern someone hardcoded as optimizable. Is that a concern?
Should I go to the issue and comment there?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But evasive transforms are applied to code last by design, before feeding code to SES. I don't know of a usecase for evasive transforms where the resulting string gets modified again.

I think we had one at Agoric, but my memory is fuzzy.

We could still switch to it. The only argument against is it could impact optimizations/inlining by pushing code across a complexity threshold or not matching a pattern someone hardcoded as optimizable. Is that a concern?

I certainly don't consider that a concern.

Should I go to the issue and comment there?

Feel free, but we're all aware that [$x--][0]>$y is slightly more expensive than $x--/**/>$y (it's just unlikely for that extra cost to matter in realistic code).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We've had a conversation at the Endo meeting that concluded in "if we doublecheck that there's no config to define a transform that runs after this one in Endo it's fine to use a comment"

Since you were not present, it's definitely not final.

I am concerned about the optimization and inlining

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since I'm in a microbenchmarking mood, I ran the numbers—[$x--][0]>$y seems to take 25% to 70% longer than $x--/**/>$y in browser engines, but both forms are about as fast as anything can be so performance is not the relevant consideration here. However, there appears to be negligible overhead (if any) for (0,$x--)>$y:

#### V8
comment sync 33.9e+3 ops/ms (29.5e+0 ns/op) after 414 245760-count observations
comma   sync 33.7e+3 ops/ms (29.6e+0 ns/op) after 412 245760-count observations
array   sync 24.7e+3 ops/ms (40.4e+0 ns/op) after 278 267429-count observations (1 restart)

#### SpiderMonkey
comment sync 6977.19 ops/ms (143e+0 ns/op) after 137 153600-count observations
comma   sync 6927.05 ops/ms (144e+0 ns/op) after 64 329143-count observations
array   sync 5445.73 ops/ms (184e+0 ns/op) after 107 153600-count observations

#### JavaScriptCore
comment sync 7629.27 ops/ms (131e+0 ns/op) after 94 245760-count observations
comma   sync 7685.11 ops/ms (130e+0 ns/op) after 94 245760-count observations
array   sync 4557.14 ops/ms (219e+0 ns/op) after 112 122880-count observations
Details
for engine in V8 SpiderMonkey JavaScriptCore; do
  esbench.mjs -h "$engine" -b3 \
    -s '
      const length = 10000, numbers = Array.from({ length }, (_, i) => i);
      let i = length - 1;
    ' \
    comment:'result = numbers[++i % length]; result = result--/**/>result;' \
    comma:'result = numbers[++i % length]; result = (0,result--)>result;' \
    array:'result = numbers[++i % length]; result = [result--][0]>result;'
done

So I'd advocate for the comma expression transformation, although this is not a hill worth dying on and I don't have a strong objection to comment-injection either.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the comma transform. What are the advantages of comment injection over comma transform?

@changeset-bot
Copy link

changeset-bot bot commented Feb 3, 2026

⚠️ No Changeset found

Latest commit: a32be3c

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

@naugtur naugtur force-pushed the naugtur/more-evasions branch from 09e78c0 to a32be3c Compare February 9, 2026 13:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants