Duplicate Operations and Their Handling in Scribe #24758

@bhavin121

Describe the bug

We recently discovered that Deli can produce duplicate operations. Scriptorium handles these duplicates, but Scribe does not. Scribe's logic checks that sequence numbers are incremental, and that check fails when a duplicate operation arrives. As a result, Scribe marks the document as corrupted.

Reason for duplicate ops

The duplicate operations occurred because Deli failed during checkpointing after it had already produced the operations to Kafka. The checkpoint contains the last processed sequence number, which is what prevents operations from being reprocessed. Because the checkpoint failed, the last processed sequence number was never updated, so Deli reprocessed operations that had already been processed and produced to Kafka.

To Reproduce

Steps to reproduce the behavior:

  1. The Deli service crashes before checkpointing the state of the document.
  2. On restart, Deli reprocesses the ops and produces duplicate ops (deltas) to Kafka.
  3. The document for which the duplicate ops were received is marked as corrupted.
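
The steps above can be illustrated with a toy, in-memory model. This is only a sketch under stated assumptions: `ToySequencer`, `RawOp`, `ProducedOp`, and the offset-based checkpoint are hypothetical names made up for illustration, not Deli's actual code, and the `produced` array merely stands in for the Kafka topic.

```typescript
// Hypothetical, simplified model of the failure; not the real Deli implementation.
interface RawOp {
  offset: number; // position of the op in the (assumed) input log
  contents: string;
}

interface ProducedOp {
  sequenceNumber: number;
  contents: string;
}

class ToySequencer {
  // State that a successful checkpoint would persist.
  private checkpoint = { lastOffset: -1, lastSequenceNumber: 0 };

  // Stands in for the Kafka topic that downstream consumers read.
  readonly produced: ProducedOp[] = [];

  // Sequence every op after the checkpointed offset, then checkpoint.
  // When crashBeforeCheckpoint is true, ops are produced but progress is lost.
  run(input: RawOp[], crashBeforeCheckpoint: boolean): void {
    let { lastOffset, lastSequenceNumber } = this.checkpoint;
    for (const op of input) {
      if (op.offset <= lastOffset) {
        continue; // already covered by the last checkpoint
      }
      lastSequenceNumber += 1;
      this.produced.push({ sequenceNumber: lastSequenceNumber, contents: op.contents });
      lastOffset = op.offset;
    }
    if (crashBeforeCheckpoint) {
      return; // simulated crash: the checkpoint is never written
    }
    this.checkpoint = { lastOffset, lastSequenceNumber };
  }
}

const input: RawOp[] = [
  { offset: 0, contents: "a" },
  { offset: 1, contents: "b" },
];

const sequencer = new ToySequencer();
sequencer.run(input, true); // produces ops 1 and 2, then "crashes"
sequencer.run(input, false); // restart: reprocesses the same input

const producedSequenceNumbers = sequencer.produced.map((op) => op.sequenceNumber);
console.log(producedSequenceNumbers); // [ 1, 2, 1, 2 ]
```

In this model, a consumer that requires strictly incremental sequence numbers would reject the second `1`, which mirrors how the document ends up marked as corrupted.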

Expected behavior

Duplicate operations are always possible, because any checkpointing failure causes operations to be reprocessed. Scribe therefore needs robust logic that identifies and handles duplicate operations correctly, so that the document is not erroneously marked as corrupted. Implementing duplicate handling similar to Scriptorium's would improve the reliability and resilience of the service, preventing documents from being flagged as corrupted and keeping document processing consistent.
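
One possible shape for such handling is sketched below. It assumes that a duplicate can be recognized purely by its sequence number being less than or equal to the last one processed, and that a forward gap should still be treated as an error. `DedupOp`, `classifyOp`, and `processOps` are hypothetical names used for illustration; this is not Scribe's or Scriptorium's actual logic.

```typescript
// Hypothetical, simplified op shape; the real message type carries more fields.
interface DedupOp {
  sequenceNumber: number;
  contents: string;
}

type OpDisposition = "apply" | "duplicate" | "gap";

// Classify an incoming op relative to the last sequence number already processed.
function classifyOp(lastProcessed: number, op: DedupOp): OpDisposition {
  if (op.sequenceNumber <= lastProcessed) {
    return "duplicate"; // already seen, e.g. replayed after a failed checkpoint
  }
  if (op.sequenceNumber === lastProcessed + 1) {
    return "apply"; // the next expected op
  }
  return "gap"; // an op is missing; this is still a real error
}

interface ProcessResult {
  applied: DedupOp[];
  skippedDuplicates: number;
  lastProcessed: number;
}

// Apply ops in order, skipping duplicates instead of failing on them.
function processOps(lastProcessed: number, ops: DedupOp[]): ProcessResult {
  const applied: DedupOp[] = [];
  let skippedDuplicates = 0;
  let last = lastProcessed;
  for (const op of ops) {
    const disposition = classifyOp(last, op);
    if (disposition === "duplicate") {
      skippedDuplicates += 1;
      continue;
    }
    if (disposition === "gap") {
      throw new Error(`Missing ops between ${last} and ${op.sequenceNumber}`);
    }
    applied.push(op);
    last = op.sequenceNumber;
  }
  return { applied, skippedDuplicates, lastProcessed: last };
}

// Ops 2 and 3 arrive twice, as they would after a replay.
const replayed: DedupOp[] = [1, 2, 3, 2, 3, 4].map((n) => ({
  sequenceNumber: n,
  contents: `op-${n}`,
}));
const dedupResult = processOps(0, replayed);
console.log(dedupResult.applied.map((op) => op.sequenceNumber)); // [ 1, 2, 3, 4 ]
console.log(dedupResult.skippedDuplicates); // 2
```

Keeping the gap case as an error preserves the existing corruption check for genuinely missing ops, while replayed ops are dropped silently.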
