Fix two WASM memory growth bugs in asyncify builds (Asyncify + JSPI)#3317
Fix two WASM memory growth bugs in asyncify builds (Asyncify + JSPI)#3317
Conversation
Emscripten's handleSleep() calls _malloc() on every async operation and _free() on every rewind. If _malloc triggers memory.grow(), the underlying ArrayBuffer detaches and corrupts PHP state. Fix: allocate the Asyncify data buffer once (on first sleep) and reuse it on subsequent sleeps. On reuse, reinitialize the header and rewind func without reallocating. Remove the _free() call in the rewind path. This implements the optimization suggested by Emscripten's own TODO comment: "TODO: reuse, don't alloc/free every sleep" Only applied to asyncify builds (not JSPI) since JSPI doesn't use the full handleSleep implementation. emscripten-core/emscripten#21108
After memory.grow(), Emscripten's updateMemoryViews() reassigns the module-scoped HEAP8/HEAP16/HEAP32/HEAP64 variables to new typed arrays backed by the grown buffer. However, the memory object passed to bindUserSpace() captured the old typed array references at construction time, and bindUserSpace() further destructured them into local variables — both levels producing stale references pointing to the old, detached ArrayBuffer. This caused SQLITE_IOERR (error 10) during WordPress installation when WASM memory needed to grow: fcntl64(F_GETLK) would write "unlocked" status to the detached buffer via updateFlockStruct(), so SQLite's C code never saw the update and interpreted the stale data as a lock conflict. Fix both levels: - Emscripten library: use getters on the memory object so each property access returns the current typed array - wasm-user-space.ts: access HEAP arrays through the memory object reference instead of destructuring into local variables
Incorporates the Asyncify data caching and the HEAP getter fix into all asyncify builds for both node and web platforms, PHP 7.4 through 8.5.
Recompile all JSPI builds (node + web, PHP 7.4–8.5) to include the stale HEAP reference fix from phpwasm-emscripten-library.js. The fix has runtime effect only in node builds where bindUserSpace() is used, but web builds are rebuilt for consistency.
e1fa8cd to
e1abc41
Compare
| get HEAP8() { return HEAP8; }, | ||
| get HEAPU8() { return HEAPU8; }, | ||
| get HEAP16() { return HEAP16; }, | ||
| get HEAPU16() { return HEAPU16; }, | ||
| get HEAP32() { return HEAP32; }, | ||
| get HEAPU32() { return HEAPU32; }, | ||
| get HEAPF32() { return HEAPF32; }, | ||
| get HEAP64() { return HEAP64; }, | ||
| get HEAPU64() { return HEAPU64; }, | ||
| get HEAPF64() { return HEAPF64; }, |
There was a problem hiding this comment.
Thank you for catching this! I didn't know these were reassigned.
Instead of getters, which can possibly lead to mistakenly saving the current HEAP references in wasm-user-space.ts, it would be good to convert these into a get/set interface like:
| get HEAP8() { return HEAP8; }, | |
| get HEAPU8() { return HEAPU8; }, | |
| get HEAP16() { return HEAP16; }, | |
| get HEAPU16() { return HEAPU16; }, | |
| get HEAP32() { return HEAP32; }, | |
| get HEAPU32() { return HEAPU32; }, | |
| get HEAPF32() { return HEAPF32; }, | |
| get HEAP64() { return HEAP64; }, | |
| get HEAPU64() { return HEAPU64; }, | |
| get HEAPF64() { return HEAPF64; }, | |
| HEAP8: { | |
| get(offset) { return HEAP8[offset]; }, | |
| set(offset, value) { HEAP8[offset] = value; }, | |
| }, | |
| HEAPU8: { | |
| get(offset) { return HEAPU8[offset]; }, | |
| set(offset, value) { HEAPU8[offset] = value; }, | |
| }, | |
| HEAP16: { | |
| get(offset) { return HEAP16[offset]; }, | |
| set(offset, value) { HEAP16[offset] = value; }, | |
| }, | |
| HEAPU16: { | |
| get(offset) { return HEAPU16[offset]; }, | |
| set(offset, value) { HEAPU16[offset] = value; }, | |
| }, | |
| HEAP32: { | |
| get(offset) { return HEAP32[offset]; }, | |
| set(offset, value) { HEAP32[offset] = value; }, | |
| }, | |
| HEAPU32: { | |
| get(offset) { return HEAPU32[offset]; }, | |
| set(offset, value) { HEAPU32[offset] = value; }, | |
| }, | |
| HEAPF32: { | |
| get(offset) { return HEAPF32[offset]; }, | |
| set(offset, value) { HEAPF32[offset] = value; }, | |
| }, | |
| HEAP64: { | |
| get(offset) { return HEAP64[offset]; }, | |
| set(offset, value) { HEAP64[offset] = value; }, | |
| }, | |
| HEAPU64: { | |
| get(offset) { return HEAPU64[offset]; }, | |
| set(offset, value) { HEAPU64[offset] = value; }, | |
| }, | |
| HEAPF64: { | |
| get(offset) { return HEAPF64[offset]; }, | |
| set(offset, value) { HEAPF64[offset] = value; }, | |
| }, |
Or we could relay Proxy objects with get/set traps, so wasm-user-space.ts can continue accessing the HEAPs as regular ArrayBuffers.
| get HEAP8() { return HEAP8; }, | ||
| get HEAPU8() { return HEAPU8; }, | ||
| get HEAP16() { return HEAP16; }, | ||
| get HEAPU16() { return HEAPU16; }, | ||
| get HEAP32() { return HEAP32; }, | ||
| get HEAPU32() { return HEAPU32; }, | ||
| get HEAPF32() { return HEAPF32; }, | ||
| get HEAP64() { return HEAP64; }, | ||
| get HEAPU64() { return HEAPU64; }, | ||
| get HEAPF64() { return HEAPF64; }, |
There was a problem hiding this comment.
Thank you for catching this! I didn't know these were reassigned.
Instead of getters, which can possibly lead to mistakenly saving the current HEAP references in wasm-user-space.ts, it would be good to convert these into a get/set interface like:
| get HEAP8() { return HEAP8; }, | |
| get HEAPU8() { return HEAPU8; }, | |
| get HEAP16() { return HEAP16; }, | |
| get HEAPU16() { return HEAPU16; }, | |
| get HEAP32() { return HEAP32; }, | |
| get HEAPU32() { return HEAPU32; }, | |
| get HEAPF32() { return HEAPF32; }, | |
| get HEAP64() { return HEAP64; }, | |
| get HEAPU64() { return HEAPU64; }, | |
| get HEAPF64() { return HEAPF64; }, | |
| HEAP8: { | |
| get(offset) { return HEAP8[offset]; }, | |
| set(offset, value) { HEAP8[offset] = value; }, | |
| }, | |
| HEAPU8: { | |
| get(offset) { return HEAPU8[offset]; }, | |
| set(offset, value) { HEAPU8[offset] = value; }, | |
| }, | |
| HEAP16: { | |
| get(offset) { return HEAP16[offset]; }, | |
| set(offset, value) { HEAP16[offset] = value; }, | |
| }, | |
| HEAPU16: { | |
| get(offset) { return HEAPU16[offset]; }, | |
| set(offset, value) { HEAPU16[offset] = value; }, | |
| }, | |
| HEAP32: { | |
| get(offset) { return HEAP32[offset]; }, | |
| set(offset, value) { HEAP32[offset] = value; }, | |
| }, | |
| HEAPU32: { | |
| get(offset) { return HEAPU32[offset]; }, | |
| set(offset, value) { HEAPU32[offset] = value; }, | |
| }, | |
| HEAPF32: { | |
| get(offset) { return HEAPF32[offset]; }, | |
| set(offset, value) { HEAPF32[offset] = value; }, | |
| }, | |
| HEAP64: { | |
| get(offset) { return HEAP64[offset]; }, | |
| set(offset, value) { HEAP64[offset] = value; }, | |
| }, | |
| HEAPU64: { | |
| get(offset) { return HEAPU64[offset]; }, | |
| set(offset, value) { HEAPU64[offset] = value; }, | |
| }, | |
| HEAPF64: { | |
| get(offset) { return HEAPF64[offset]; }, | |
| set(offset, value) { HEAPF64[offset] = value; }, | |
| }, |
Or we could relay Proxy objects with get/set traps, so wasm-user-space.ts can continue accessing the HEAPs as regular ArrayBuffers.
Motivation for the change, related issues
This PR fixes two bugs that involve WASM
memory.grow(). Either bug alone can corrupt PHP state. They made WordPress installation fail with Asyncify at lowINITIAL_MEMORYvalues (tested at 64 MB and 16 MB).Bug 1: Stale HEAP references in file lock syscalls (Asyncify + JSPI)
After
memory.grow(), Emscripten'supdateMemoryViews()replaces the module-scopedHEAP8,HEAP16, … variables with new typed arrays backed by the grown buffer. Our file locking code captured these in two places:memoryobject passed tobindUserSpace()— a plain object literal that snapshotted the values at construction time.bindUserSpace()destructured them into local variables (memory: { HEAP16, HEAP64, HEAP32 }), creating a second level of stale capture.After any
memory.grow(), both levels pointed to the old, detachedArrayBuffer. When SQLite calledfcntl(F_GETLK)to check locks,updateFlockStruct()wrote "unlocked" to the detached buffer. The WASM code — reading from the actual (new) memory — never saw the update and treated the stale data as a lock conflict, resulting inSQLITE_IOERR(error code 10).Bug 2: Asyncify allocates memory on every async operation (Asyncify only)
Emscripten's
handleSleep()calls_malloc()on every async unwind and_free()on every rewind. If that_malloc()triggersmemory.grow(), theArrayBufferdetaches mid-operation, corrupting Asyncify's own state. The Asyncify source has a TODO comment acknowledging this:// TODO: reuse, don't alloc/free every sleep.Related: emscripten-core/emscripten#21108
Why the bugs were invisible until now
At 256 MB
INITIAL_MEMORY, WordPress fits without ever callingmemory.grow(), so neither bug manifests. At lower values, growth is needed during WordPress installation, and both bugs trigger.Implementation details
Bug 1 fix (
phpwasm-emscripten-library.js+wasm-user-space.ts):memoryobject now uses getters (get HEAP16() { return HEAP16; }) so every property access returns the current typed array.bindUserSpace()keeps thememoryobject reference instead of destructuring, accessingmemory.HEAP16[...]etc. at call time.Bug 2 fix (
Dockerfilepost-build patch, Asyncify only):Asyncify.allocateData()result is cached inAsyncify._cachedData. Subsequent sleeps reinitialize the header and rewind function pointer but reuse the same buffer, avoiding_malloc()on every async operation._free(Asyncify.currData)in the rewind path is removed.if [ "$WITH_JSPI" != "yes" ]— JSPI builds use a different suspend mechanism that doesn't have this problem.Rebuilt binaries: All Asyncify and JSPI builds (node + web, PHP 7.4–8.5) are rebuilt. The
wasm-user-space.tshalf of the bug 1 fix takes effect at runtime without a rebuild, but the emscripten getter fix is baked in at compile time.Testing Instructions (or ideally a Blueprint)
Verify with the CLI (uses the rebuilt binaries included in this PR):
To test at lower INITIAL_MEMORY (requires a local rebuild):