@Gusarich Gusarich commented Dec 16, 2025

This optimization splits case-insensitive (CI) character constraints into two categories: ci_const (checks on bytes that depend on neither the swept hash0 byte nor the CRC) and ci_var (checks that touch byte 2, the swept hash0 byte, or bytes 34-35, the CRC). The ci_const checks now run once, before the hash0 sweep loop, so a candidate that fails any of them is rejected without entering the expensive 256-iteration CRC sweep at all.

For end-pattern matching, the last ~6 characters span bytes 31-35, so roughly half of the CI checks fall on bytes 31-33, which are invariant across the sweep. Those checks filter out ~99.99% of candidates before the loop begins. Start patterns don't benefit: their CI characters overlap with byte 2 (the swept byte), so all of their checks are ci_var.

A secondary optimization replaces the full 34-byte CRC recomputation in each iteration with a precomputed delta table (crc = crc_base ^ delta[hash0]), cutting the in-loop CRC cost from 34 operations to 1 XOR. The early-exit filtering, however, is the primary driver of the 8-18× speedup observed for end case-insensitive patterns.
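To make the two ideas concrete, here is a minimal Python sketch rather than the code from this PR. It assumes a 36-byte buffer whose first 34 bytes are checksummed, a CRC-16/XMODEM checksum (polynomial 0x1021, zero initial value) stored big-endian in bytes 34-35, and 6 bits per encoded character. None of these are stated in the description above, and the helper names (`split_ci`, `sweep`, `const_ok`, `var_ok`) are hypothetical. The delta table relies on a zero-init CRC being linear over XOR, so changing one byte changes the CRC by a value that depends only on that byte.

```python
# Sketch only: the CRC variant, byte order, and 6-bit character width are assumptions.
MSG_LEN = 34            # bytes covered by the CRC
SWEPT = 2               # index of the swept hash0 byte
VARYING = {2, 34, 35}   # bytes that change during the sweep (hash0 and the CRC)


def crc16_xmodem(data: bytes) -> int:
    """Bitwise CRC-16/XMODEM: poly 0x1021, init 0, no reflection, no final XOR."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ 0x1021) if crc & 0x8000 else (crc << 1)
            crc &= 0xFFFF
    return crc


def build_delta_table() -> list:
    """delta[h] is the CRC of an all-zero message whose swept byte equals h."""
    table = []
    for h in range(256):
        msg = bytearray(MSG_LEN)
        msg[SWEPT] = h
        table.append(crc16_xmodem(bytes(msg)))
    return table


DELTA = build_delta_table()


def bytes_touched(char_index: int) -> set:
    """Byte indices overlapped by the 6-bit character at char_index."""
    first_bit = 6 * char_index
    return set(range(first_bit // 8, (first_bit + 5) // 8 + 1))


def split_ci(char_indices):
    """Split character positions into (ci_const, ci_var) by the bytes they touch."""
    ci_const, ci_var = [], []
    for i in char_indices:
        (ci_var if bytes_touched(i) & VARYING else ci_const).append(i)
    return ci_const, ci_var


def sweep(payload: bytes, const_ok, var_ok) -> list:
    """Return every hash0 value whose 36-byte buffer passes both check groups.

    const_ok must only read bytes outside VARYING; var_ok may read any byte.
    """
    buf = bytearray(payload[:MSG_LEN]) + bytearray(2)
    buf[SWEPT] = 0
    if not const_ok(buf):          # early rejection: the loop is never entered
        return []
    crc_base = crc16_xmodem(bytes(buf[:MSG_LEN]))
    hits = []
    for h in range(256):
        crc = crc_base ^ DELTA[h]  # one XOR instead of a 34-byte CRC pass
        buf[SWEPT] = h
        buf[34], buf[35] = crc >> 8, crc & 0xFF
        if var_ok(buf):
            hits.append(h)
    return hits
```

Under these assumptions, `split_ci(range(42, 48))` puts characters 42-44 in ci_const and 45-47 in ci_var, which matches the "roughly half" figure for a 6-character end pattern.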

[Image: Screenshot 2025-12-16 at 1 25 24 PM]

@Gusarich Gusarich changed the title from "feat: optimize crc16" to "feat: hoist invariant CI checks before CRC sweep for early rejection" Dec 16, 2025
@Gusarich Gusarich merged commit 1285cde into main Dec 16, 2025
1 check passed
@Gusarich Gusarich deleted the crc16-optim branch December 16, 2025 10:28