implement non-blocking proof submission service #60

Open
JupiterXiaoxiaoYu wants to merge 8 commits into main from feature/non-blocking-proof-submission

@JupiterXiaoxiaoYu
Contributor

  • Add ProofSubmissionService with Redis-based task stack for async proof processing
  • Replace blocking submitProofWithRetry with non-blocking addTaskToStack
  • Implement fault-tolerant recovery mechanism for service interruptions
  • Add precise merkle root matching for task confirmation
  • Support null placeholder strategy in trackBundle for immediate returns
  • Add comprehensive logging for background proof submission and query process

JupiterXiaoxiaoYu force-pushed the feature/non-blocking-proof-submission branch 2 times, most recently from 3c884eb to 1a8f0b3 (September 1, 2025 03:07)

JupiterXiaoxiaoYu commented Sep 1, 2025

TEST 1 LOCALLY

IMAGE=466A004EA81524416B3D3D7319C4E0B2 DEPLOY=TRUE make run

https://explorer.zkwasmhub.com/image/466A004EA81524416B3D3D7319C4E0B2

Dry run failed

JupiterXiaoxiaoYu commented Sep 1, 2025

TEST 2 LOCALLY

IMAGE=7BEC18ECC8F40ED6716639D1DC97A089 DEPLOY=TRUE make run

Changed the length of SERVER_ADMIN_KEY from 4 to 7, and the run worked.

https://explorer.zkwasmhub.com/image/7BEC18ECC8F40ED6716639D1DC97A089

Turned on flight mode (no internet):

[ProofService] Task task-1756697616743-ztr6wdlvn failed with non-timeout error (attempt 1/3): Error: getaddrinfo EAI_AGAIN rpc.zkwasmhub.com

Turned flight mode off, and once the connection was back:

[ProofService] Proof submission successful for task task-1756697616743-ztr6wdlvn, got taskId: 68b514880986b1b548e46ecc

SyntaxError: Unexpected token u in JSON at position 0
at JSON.parse (<anonymous>)
at ProofSubmissionService.deserializeTask (file:///mnt/d/Dev/automata/zkwasm-automata/ts/node_modules/zkwasm-ts-server/src/service/proof-submission-service.js:224:49)
at ProofSubmissionService.processNextTask (file:///mnt/d/Dev/automata/zkwasm-automata/ts/node_modules/zkwasm-ts-server/src/service/proof-submission-service.js:42:27)
at process.processTicksAndRejections (node:internal/process/task_queues:95:5)

Something is wrong here.
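
For context on the trace above: `Unexpected token u in JSON at position 0` is the error `JSON.parse` typically raises when it is handed `undefined`, which suggests `deserializeTask` read a task whose stored value was missing. A minimal defensive sketch follows. The names `StoredTask` and `parseStoredTask` are hypothetical and are not taken from this PR, and the real `ProofTask` has more fields than the single `id` assumed here.

```typescript
// Hypothetical, simplified task shape; only `id` is assumed.
interface StoredTask {
  id: string;
}

// Returns null instead of throwing when the stored value is missing or malformed.
function parseStoredTask(raw: string | null | undefined): StoredTask | null {
  if (raw === null || raw === undefined || raw === "") {
    return null; // the key was deleted or never written
  }
  try {
    const parsed: unknown = JSON.parse(raw);
    if (
      typeof parsed === "object" &&
      parsed !== null &&
      typeof (parsed as { id?: unknown }).id === "string"
    ) {
      return parsed as StoredTask;
    }
    return null; // valid JSON, but not a task object
  } catch {
    return null; // not valid JSON
  }
}
```

A caller could treat a `null` result as an orphaned queue entry and drop it rather than crash the processing loop.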

JupiterXiaoxiaoYu force-pushed the feature/non-blocking-proof-submission branch 5 times, most recently from ff7d15f to 21e805b (September 1, 2025 04:01)

- Add ProofSubmissionService with Redis-based task stack for async proof processing
- Replace blocking submitProofWithRetry with non-blocking addTaskToStack
- Implement fault-tolerant recovery mechanism for service interruptions
- Add precise merkle root matching for task confirmation
- Support null placeholder strategy in trackBundle for immediate returns
- Add comprehensive logging for background proof submission and query process

JupiterXiaoxiaoYu force-pushed the feature/non-blocking-proof-submission branch from 21e805b to 9327467 (September 1, 2025 04:06)

JupiterXiaoxiaoYu commented Sep 1, 2025

TEST 3 - After fixing mini-rollup

Problem: After task success, Redis data was deleted but the stack ID remained, causing the next loop to fail
Solution: Immediately remove taskId from the stack after successful processing to ensure atomicity
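
The problem and fix described above can be illustrated with a small in-memory model. This is only a sketch: `TaskStore` is a hypothetical stand-in for the Redis list of task ids and the per-task payload keys, not the PR's code. With a real Redis client the two removals would normally be grouped in one transaction (for example `MULTI`/`EXEC`) so that they apply together.

```typescript
// In-memory stand-in for the Redis list (task ids) and keys (task payloads).
class TaskStore {
  private ids: string[] = [];
  private payloads = new Map<string, string>();

  add(id: string, payload: string): void {
    this.payloads.set(id, payload);
    this.ids.push(id);
  }

  // Remove the id and its payload in one step so neither outlives the other.
  complete(id: string): void {
    this.ids = this.ids.filter((x) => x !== id);
    this.payloads.delete(id);
  }

  // Return the oldest task that still has a payload, dropping orphaned ids.
  next(): { id: string; payload: string } | null {
    while (this.ids.length > 0) {
      const id = this.ids[0];
      if (id === undefined) break;
      const payload = this.payloads.get(id);
      if (payload !== undefined) return { id, payload };
      this.ids.shift(); // orphaned id: its payload was already deleted
    }
    return null;
  }

  pending(): number {
    return this.ids.length;
  }
}
```

Skipping orphaned ids in `next()` is an extra safeguard assumed here, so that a half-finished cleanup from an earlier crash cannot stall the loop.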

IMAGE=10A61955978FA8E0AEC63F19DA4C50F0 DEPLOY=TRUE make run

Worked when flight mode was turned on and then off again.

Something still goes wrong when the service is interrupted and restarted: the dry run fails.

- Change error handling to query confirm task status for all errors, not just timeouts
- Ensure no duplicate proof submissions by verifying task wasn't actually submitted
- Maintain separate retry logic for timeout vs non-timeout errors after confirmation
- Improve proof submission reliability in unstable network conditions

Convert BigInt values to strings before JSON serialization to prevent
'Do not know how to serialize a BigInt' errors when storing proof tasks
in Redis. This fixes task accumulation failures during network issues.
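
As an illustration of the BigInt fix described in this commit message, the sketch below round-trips a `BigUint64Array` (the type of `merkleRoot` in `addTaskToStack`) through JSON by storing each element as a decimal string. The function names `serializeRoot` and `deserializeRoot` are hypothetical and may not match the PR's serialize/deserialize methods.

```typescript
// JSON.stringify throws a TypeError on bigint values, so each element is
// converted to a decimal string first.
function serializeRoot(root: BigUint64Array): string {
  const parts: string[] = Array.from(root, (v: bigint) => v.toString());
  return JSON.stringify(parts);
}

// Reverse step: parse the string array and rebuild the typed array.
function deserializeRoot(json: string): BigUint64Array {
  const parts = JSON.parse(json) as string[];
  return BigUint64Array.from(parts.map((s: string) => BigInt(s)));
}
```

Decimal strings keep the full 64-bit range, which would be lost if the values were converted to `number`.
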

Fix TypeScript compilation errors by adding explicit type annotations
for map function parameters in serialize/deserialize methods.

Reduce query duration from 10 to 3 minutes and add early exit after
3 consecutive empty results to prevent unnecessary waiting when tasks
are genuinely not submitted.
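
The bounded query with early exit described in this commit message could look roughly like the following. It is a sketch under assumptions: `pollForMatch`, `PollOptions`, and the injected `query` and `isMatch` callbacks are hypothetical names, and resetting the streak when a non-empty but non-matching result arrives is one possible reading of "consecutive".

```typescript
interface PollOptions {
  maxDurationMs: number;       // overall time budget, e.g. 3 minutes
  intervalMs: number;          // pause between queries
  maxConsecutiveEmpty: number; // give up after this many empty results in a row
}

async function pollForMatch<T>(
  query: () => Promise<T[]>,
  isMatch: (item: T) => boolean,
  opts: PollOptions
): Promise<T | null> {
  const deadline = Date.now() + opts.maxDurationMs;
  let emptyStreak = 0;
  while (Date.now() < deadline) {
    const results = await query();
    const hit = results.find(isMatch);
    if (hit !== undefined) return hit;
    // Only truly empty responses count toward the early exit.
    emptyStreak = results.length === 0 ? emptyStreak + 1 : 0;
    if (emptyStreak >= opts.maxConsecutiveEmpty) return null;
    await new Promise<void>((resolve) => setTimeout(resolve, opts.intervalMs));
  }
  return null; // time budget exhausted
}
```

A `null` result then means "treat the task as not submitted", at which point the caller can decide whether to resubmit.
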

Clarify that processCurrentTask only returns when task is completed
successfully, simplifying the control flow logic.

- Change lpush to rpush to ensure FIFO processing order for proof tasks
- Add validateMerkleContinuity method to check chain continuity before task processing
- Validate that current task's merkleRoot matches previous bundle's postMerkleRoot
- Fix MongoDB query syntax using $nin instead of duplicate $ne operators
- Prevent submission of tasks that would break merkle chain continuity
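
The continuity rule in this commit message (a task's merkleRoot must equal the previous bundle's postMerkleRoot) can be sketched as below. This is not the PR's `validateMerkleContinuity`. `ChainTask` and `continuousPrefix` are hypothetical, roots are simplified to hex strings, and it is assumed that each queued task already knows its own `postMerkleRoot`.

```typescript
// Hypothetical, simplified task shape: roots are hex strings here.
interface ChainTask {
  id: string;
  merkleRoot: string;     // state root the task starts from
  postMerkleRoot: string; // state root after the task's transactions
}

// Returns the longest prefix of `queue` (oldest first) that continues the
// chain from `lastPostRoot`; it stops at the first task that breaks continuity.
function continuousPrefix(lastPostRoot: string, queue: ChainTask[]): ChainTask[] {
  const ready: ChainTask[] = [];
  let expected = lastPostRoot.toLowerCase();
  for (const task of queue) {
    if (task.merkleRoot.toLowerCase() !== expected) break;
    ready.push(task);
    expected = task.postMerkleRoot.toLowerCase();
  }
  return ready;
}
```

The check relies on the queue being in FIFO order, which is what appending with rpush and consuming from the head of the list provides.
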
import { merkleRootToBeHexString } from '../lib.js';

interface ProofTask {
id: string;
Collaborator review comment:

Clarify whether the proofTask is created before or after the bundle is saved. In the current implementation it happens before the bundle save, so the consumer could simply start from the bundle and look up the corresponding Task Proof (linked through the merkle root). That gives a simple producer-consumer pattern, instead of mixing producer and consumer logic inside addTask.

this.helper = helper;
}

async addTaskToStack(merkleRoot: BigUint64Array, txs: TxWitness[], txdata: Uint8Array): Promise<void> {
Collaborator review comment:

Avoid mixing consumer logic into the producer function as far as possible. You can start a while loop in the service's main function that submits proofs one by one, starting from the latest not-committed bundle. If a submission fails, retry after a while, again starting from the latest not-committed bundle.
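
One possible shape for the consumer loop suggested in this review comment is sketched below. Everything in it is an assumption made for illustration: `PendingBundle`, `ConsumerDeps`, and `runConsumer` are hypothetical names, and the storage and proof-submission calls are injected as plain async functions rather than taken from the PR.

```typescript
interface PendingBundle {
  merkleRoot: string;
}

// Injected dependencies, so the loop itself contains no producer logic.
interface ConsumerDeps {
  nextUncommitted(): Promise<PendingBundle | null>; // latest not-committed bundle
  submitProof(bundle: PendingBundle): Promise<void>;
  markCommitted(bundle: PendingBundle): Promise<void>;
  sleep(ms: number): Promise<void>;
}

async function runConsumer(
  deps: ConsumerDeps,
  retryDelayMs: number,
  shouldStop: () => boolean
): Promise<void> {
  while (!shouldStop()) {
    const bundle = await deps.nextUncommitted();
    if (bundle === null) {
      await deps.sleep(retryDelayMs); // nothing to do yet
      continue;
    }
    try {
      await deps.submitProof(bundle);
      await deps.markCommitted(bundle);
    } catch {
      // Submission failed: wait, then start again from the same bundle.
      await deps.sleep(retryDelayMs);
    }
  }
}
```

Because the loop always re-reads the latest not-committed bundle, a restart after an interruption resumes from stored state instead of from an in-memory queue.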
