Conversation

@grahamegrieve
Contributor

No description provided.

Comment on lines +356 to +357
return str
.replace(/&amp;/g, '&')

Check failure

Code scanning / CodeQL

Double escaping or unescaping High

This replacement may produce '&' characters that are double-unescaped here.

Copilot Autofix

AI 1 day ago

In general, to avoid double-unescaping when dealing with escape sequences, the escape character (here &) must be escaped first in the encoder and unescaped last in the decoder. This ensures that a sequence like &amp;lt; is handled correctly: the passes for &lt;, &gt;, &quot;, and &apos; run first and find nothing to match, and only then is &amp; decoded to &, leaving the literal text &lt; instead of collapsing it into a raw <.
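A minimal sketch of the ordering problem, for illustration only. The names unescapeAmpFirst and unescapeAmpLast are hypothetical and do not appear in the repository; the two functions differ only in where the &amp; replacement sits in the chain.

```javascript
// Hypothetical helpers: same mappings, different order of the &amp; step.

// Decodes &amp; first, which lets later passes re-interpret its output.
function unescapeAmpFirst(str) {
  return str
    .replace(/&amp;/g, '&')
    .replace(/&lt;/g, '<')
    .replace(/&gt;/g, '>')
    .replace(/&quot;/g, '"')
    .replace(/&apos;/g, "'");
}

// Decodes &amp; last, so the '&' it produces is never scanned again.
function unescapeAmpLast(str) {
  return str
    .replace(/&lt;/g, '<')
    .replace(/&gt;/g, '>')
    .replace(/&quot;/g, '"')
    .replace(/&apos;/g, "'")
    .replace(/&amp;/g, '&');
}

// The escaped form of the literal text "&lt;".
const encoded = '&amp;lt;';

console.log(unescapeAmpFirst(encoded)); // prints "<"    (double-unescaped)
console.log(unescapeAmpLast(encoded));  // prints "&lt;" (the intended text)
```

For inputs that contain no escaped ampersand followed by an entity name, both orders give the same result, which is why this kind of bug can go unnoticed in casual testing.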

For this specific code, escapeXml is already correct: it escapes & first, then <, >, ", and '. The issue is in unescapeXml, which currently unescapes &amp; before the other entities. The best fix without changing existing functionality is to keep the same mappings but reorder the .replace calls so that &amp; is handled last. That is, unescapeXml should first replace &lt;, &gt;, &quot;, and &apos; with their character equivalents, and only then replace &amp; with &.
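One way to check the reordered decoder is a round trip through the encoder. The sketch below is an assumption-laden illustration: the body of escapeXml is reconstructed from the description above (escape & first, then <, >, ", and '), not copied from tx/xml/xml-base.js, and both functions are written as standalone functions rather than static methods. The sample strings are made up for the example.

```javascript
// Assumed encoder: '&' is escaped first, as the description above states.
function escapeXml(str) {
  return str
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');
}

// Decoder with the proposed order: '&amp;' is unescaped last.
function unescapeXml(str) {
  return str
    .replace(/&lt;/g, '<')
    .replace(/&gt;/g, '>')
    .replace(/&quot;/g, '"')
    .replace(/&apos;/g, "'")
    .replace(/&amp;/g, '&');
}

// Illustrative inputs, including text that already looks like an entity.
const samples = [
  'a < b && c > d',
  '&lt;',
  '&amp;amp;',
  `"quoted" & 'single'`,
];

for (const s of samples) {
  const roundTripped = unescapeXml(escapeXml(s));
  // Each line should report true if decoding exactly inverts encoding.
  console.log(JSON.stringify(s), roundTripped === s);
}
```

With the original order (the &amp; replacement first), the samples '&lt;' and '&amp;amp;' would not survive the round trip, which is the behaviour the alert describes.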

Concretely, in tx/xml/xml-base.js at lines 356–362, adjust the implementation of static unescapeXml(str) so that the .replace(/&amp;/g, '&') call is moved to the end of the chain. No new methods or imports are required.

Suggested changeset 1
tx/xml/xml-base.js

Autofix patch

Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/tx/xml/xml-base.js b/tx/xml/xml-base.js
--- a/tx/xml/xml-base.js
+++ b/tx/xml/xml-base.js
@@ -355,11 +355,11 @@
    */
   static unescapeXml(str) {
     return str
-      .replace(/&amp;/g, '&')
       .replace(/&lt;/g, '<')
       .replace(/&gt;/g, '>')
       .replace(/&quot;/g, '"')
-      .replace(/&apos;/g, "'");
+      .replace(/&apos;/g, "'")
+      .replace(/&amp;/g, '&');
   }
 
   /**
EOF