java.lang.InternalError: a fault occurred in an unsafe memory access operation (Zilla 0.9.173) #1610

@nikhil-cli

Description

Zilla crashes while serving a request routed through the Kafka cache.
The engine terminates after throwing:

java.lang.InternalError: a fault occurred in an unsafe memory access operation

This originates from the Kafka cache internals (KafkaCacheFile.readBytes).
Immediately afterwards, many repeated messages appear:

java.lang.IllegalStateException: Engine worker usage is non-zero: <various negative or small values>

Finally the engine stops:

ENGINE_STOPPED Engine Stopped.

This causes the entire Zilla container to crash.


To Reproduce

  1. Deploy Zilla 0.9.173 with Kafka cache enabled.
  2. Make a request that triggers a Kafka fetch via cache (e.g., SSE route consuming from Kafka).
  3. Zilla attempts to read cached data.
  4. The engine crashes with an internal memory fault.

Expected Behavior

Zilla should safely read from the Kafka cache without triggering unsafe memory faults or crashing the entire engine.
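To make "safely read" concrete, here is a minimal sketch of a bounds-checked read over a cache segment buffer. It is not Zilla's implementation: the class `BoundedCacheRead` and its `readBytes` signature are hypothetical, and the sketch assumes the segment is exposed as a `java.nio.ByteBuffer` (which a memory-mapped file would be). An out-of-range request is reported to the caller instead of being attempted.

```java
import java.nio.ByteBuffer;

// Hypothetical helper, not part of Zilla: validates the requested range
// against the buffer limit before copying any bytes.
final class BoundedCacheRead {
    // Returns a copy of [position, position + length), or null if the
    // range does not fit inside the segment.
    static byte[] readBytes(ByteBuffer segment, int position, int length) {
        // Written as "position > limit - length" to avoid int overflow
        // in "position + length".
        if (position < 0 || length < 0 || position > segment.limit() - length) {
            return null;
        }
        byte[] out = new byte[length];
        // duplicate() shares the content but has its own position, so the
        // caller's buffer state is left untouched.
        ByteBuffer view = segment.duplicate();
        view.position(position);
        view.get(out, 0, length);
        return out;
    }
}
```

A check like this only guards against reading past the buffer limit. It would not prevent a fault raised by the operating system while paging in a mapped file, so it illustrates the expected failure mode (one failed read, not a stopped engine) rather than a fix.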


Crash Logs

java.lang.InternalError: a fault occurred in an unsafe memory access operation
    at KafkaCacheFile.readBytes(...)
    at KafkaCacheCursor.next(...)
    at KafkaCacheClientFetchStream.doClientReplyDataIfNecessary(...)
    ...
Caused by: java.lang.InternalError: a fault occurred in an unsafe memory access operation

Followed by dozens of:

java.lang.IllegalStateException: Engine worker usage is non-zero: -263
java.lang.IllegalStateException: Engine worker usage is non-zero: -164
java.lang.IllegalStateException: Engine worker usage is non-zero: 1
...
ENGINE_STOPPED Engine Stopped.

Zilla Environment

  • Zilla version: 0.9.173
  • Running inside Kubernetes
  • Using Kafka cache (kafka_cache_server / kafka_cache_client)
  • Route: SSE → Kafka via cache

Kafka Environment

  • Standard Kafka cluster
  • Topic being consumed contains time-series style data
  • No unusual Kafka errors seen outside Zilla

Additional Context

  • The crash happens in an unsafe memory access inside the Kafka cache, not in the TCP/Kafka client.
  • This appears to be low-level memory corruption or an edge case triggered by reading cache segments.
  • After the crash, workers report negative usage counts, suggesting that resource accounting is corrupted by the fault.
  • The container restarts automatically after the engine shuts down.
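On the JVM, this InternalError is commonly reported when a page fault on a memory-mapped file cannot be satisfied, for example because the backing file was truncated or the filesystem holding it ran out of space. Nothing in the logs above confirms that this is the cause here. As a diagnostic sketch only, the following checks the usable space on the filesystem that holds a given directory. The class `CacheSpaceCheck`, the method `hasHeadroom`, and the directory passed in are all hypothetical; the location of Zilla's cache files is not stated in this report.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical diagnostic, not part of Zilla.
final class CacheSpaceCheck {
    // True if the filesystem holding cacheDir reports at least
    // requiredBytes of usable space.
    static boolean hasHeadroom(Path cacheDir, long requiredBytes) {
        try {
            return Files.getFileStore(cacheDir).getUsableSpace() >= requiredBytes;
        } catch (IOException ex) {
            // Rethrow unchecked so callers can use this in simple checks.
            throw new UncheckedIOException(ex);
        }
    }
}
```

In a Kubernetes deployment, a check like this would be pointed at whichever volume backs the cache directory, to rule space exhaustion in or out before looking for corruption.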

Labels

bug: Something isn't working
