(feat): Bitmask-Aware Untracked Tracking for @with_pool #16
Merged
Phase 1 of typed-aware untracked tracking: add _untracked_fixed_masks
(Vector{UInt16}) and _untracked_has_others (Vector{Bool}) fields to
AdaptiveArrayPool. These parallel arrays follow the same 1-based sentinel
pattern as _untracked_flags. All lifecycle operations (checkpoint, rewind,
reset, empty) are updated to push/pop/restore the new vectors.
No behavior change — existing _untracked_flags logic is untouched. The new
fields are populated with sentinel values but not yet read by any decision
logic. Prepares the data structure for Phase 2 (typed _mark_untracked!).
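As a rough illustration of the parallel-vector lifecycle described in this commit, the following sketch models only the two metadata vectors. The struct and function names (`ToyPoolMeta`, `toy_checkpoint!`, `toy_rewind!`, `toy_reset!`) are hypothetical stand-ins, not the package's actual code, and treating the vector length as the depth is an assumption made for brevity.

```julia
# Toy model of the Phase 1 metadata; every name here is hypothetical.
struct ToyPoolMeta
    fixed_masks::Vector{UInt16}   # one entry per depth level
    has_others::Vector{Bool}      # parallel to fixed_masks
end

# Resting state: a single sentinel entry at index 1 in each vector.
ToyPoolMeta() = ToyPoolMeta([UInt16(0)], [false])

# Entering a scope pushes a fresh, empty entry onto both vectors.
function toy_checkpoint!(m::ToyPoolMeta)
    push!(m.fixed_masks, UInt16(0))
    push!(m.has_others, false)
    return m
end

# Leaving a scope pops both vectors together; the sentinel is never popped.
function toy_rewind!(m::ToyPoolMeta)
    if length(m.fixed_masks) > 1
        pop!(m.fixed_masks)
        pop!(m.has_others)
    end
    return m
end

# Reset restores the sentinel-only state regardless of current depth.
function toy_reset!(m::ToyPoolMeta)
    resize!(m.fixed_masks, 1); m.fixed_masks[1] = UInt16(0)
    resize!(m.has_others, 1);  m.has_others[1] = false
    return m
end
```

Keeping the two vectors the same length at all times is what lets a later phase index both with a single depth value.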
Phase 2 of typed-aware untracked tracking: replaces untyped
_mark_untracked!(pool) with typed _mark_untracked!(pool, ::Type{T})
across all 36 call sites (8 acquire.jl, 28 convenience.jl).
- Add _fixed_slot_bit dispatch mapping each fixed-slot type to UInt16 bit
- Rewrite _mark_untracked! to set per-type bitmask or has_others flag
- Bridge: legacy _untracked_flags still set for dual-track transition
- Add 9 test sets covering dispatch, marking, and public API propagation
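A minimal sketch of what type-dispatched marking could look like, under stated assumptions: the bit positions below are placeholders (the actual assignment is not shown in this commit message), only three fixed-slot types are modeled, and `ToyState`, `toy_fixed_slot_bit`, and `toy_mark_untracked!` are hypothetical names. Writing to the last vector entry stands in for indexing by the pool's current depth.

```julia
# Placeholder bit assignment; the real mapping may differ.
toy_fixed_slot_bit(::Type{Float64}) = UInt16(1) << 0
toy_fixed_slot_bit(::Type{Float32}) = UInt16(1) << 1
toy_fixed_slot_bit(::Type{Int64})   = UInt16(1) << 2
toy_fixed_slot_bit(::Type)          = UInt16(0)   # fallback: not a fixed-slot type

struct ToyState
    masks::Vector{UInt16}     # per-depth bitmask of untracked fixed-slot types
    has_others::Vector{Bool}  # per-depth flag for any other element type
end

# Record an untracked acquire of element type T at the innermost depth.
function toy_mark_untracked!(s::ToyState, ::Type{T}) where {T}
    bit = toy_fixed_slot_bit(T)
    if iszero(bit)
        s.has_others[end] = true   # non-fixed-slot type: no bit to set
    else
        s.masks[end] |= bit        # fixed-slot type: OR its bit into the mask
    end
    return nothing
end
```

Because the fallback method returns a zero bit, the same call site handles both fixed-slot and other types without a runtime type lookup.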
Replace _untracked_flags boolean conditionals with _can_use_typed_path bitmask subset check in @with_pool macro-generated code. Adds _tracked_mask_for_types (@generated compile-time constant) and _can_use_typed_path (@inline runtime check) to state.jl. Simplifies 5 generator functions by centralizing typed/full path decision into _generate_typed_checkpoint_call and _generate_typed_rewind_call helpers, removing 10 inline conditional blocks from macros.jl.
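The sketch below shows one way a compile-time tracked mask and a runtime path decision could fit together. It is an assumption-laden illustration rather than the code in state.jl: `toy_bit`, `toy_tracked_mask`, `toy_can_use_typed_path`, `toy_rewind_path`, and `ToyState` are hypothetical names, and the bit values are placeholders.

```julia
# Placeholder bits; must be defined before the generated function that calls them.
toy_bit(::Type{Float64}) = 0x0001
toy_bit(::Type{Float32}) = 0x0002
toy_bit(::Type{Int64})   = 0x0004
toy_bit(::Type)          = 0x0000

# The generator body runs at compile time, so the mask becomes a constant
# for each distinct combination of argument types.
@generated function toy_tracked_mask(types::Type...)
    mask = 0x0000
    for t in types                     # inside the generator each t is Type{T}
        mask |= toy_bit(t.parameters[1])
    end
    return mask                        # returned as a literal constant
end

struct ToyState
    masks::Vector{UInt16}
    has_others::Vector{Bool}
end

# Typed path is allowed only if nothing outside the tracked set was touched.
@inline function toy_can_use_typed_path(s::ToyState, tracked::UInt16)
    return !s.has_others[end] && iszero(s.masks[end] & ~tracked)
end

# Roughly the branch that macro-generated rewind code might reduce to.
function toy_rewind_path(s::ToyState, types::Type...)
    tracked = toy_tracked_mask(types...)
    return toy_can_use_typed_path(s, tracked) ? :typed : :full
end
```

Centralizing the decision in one helper, as the commit describes, means each generator only needs to emit a call rather than its own conditional block.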
… tracking
Phase 4 of typed-aware untracked tracking: remove the boolean
_untracked_flags::Vector{Bool} field from AdaptiveArrayPool and
CuAdaptiveArrayPool, now fully replaced by the fine-grained
_untracked_fixed_masks::Vector{UInt16} + _untracked_has_others::Vector{Bool}
bitmask system introduced in Phases 1-3.
Removes all push!/pop!/empty! calls for _untracked_flags across
checkpoint!, rewind!, reset!, and empty! in both CPU and CUDA paths.
The CUDA extension was missing _untracked_fixed_masks and _untracked_has_others
fields that were added to AdaptiveArrayPool during the bitmask untracked tracking
feature (Phases 1-4). Without these fields, any acquire!() call inside a CUDA
@with_pool scope would throw a FieldError via _mark_untracked!(), and
_can_use_typed_path() would also fail.
Changes:
- Add _untracked_fixed_masks::Vector{UInt16} and _untracked_has_others::Vector{Bool}
fields to CuAdaptiveArrayPool struct with sentinel initialization
- checkpoint!(full/typed-1/typed-N): push bitmask state on depth increment
- rewind!(full/typed-1/typed-N): pop bitmask state on depth decrement
- reset! and empty!: restore bitmask sentinel state ([UInt16(0)], [false])
- Multi-type checkpoint!/rewind! now deduplicate types at compile time (matching
CPU behavior from src/state.jl)
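For the compile-time deduplication mentioned in the last item, a hedged sketch of the general technique follows. `toy_unique_types` is a hypothetical helper, not the function in src/state.jl or the CUDA extension; it only illustrates how a `@generated` function can collapse repeated type arguments before any runtime code is emitted, presumably so the same typed pool is not checkpointed or rewound twice.

```julia
# The generator body runs once per distinct argument-type signature,
# so the deduplicated tuple is baked into the compiled method.
@generated function toy_unique_types(types::Type...)
    seen = Any[]
    for t in types                 # inside the generator each t is Type{T}
        T = t.parameters[1]
        T in seen || push!(seen, T)
    end
    return Expr(:tuple, seen...)   # tuple expression with the types spliced in
end
```

A caller passing `(Float64, Int64, Float64)` would then iterate over two pools instead of three, with first-occurrence order preserved.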
Codecov Report❌ Patch coverage is
Additional details and impacted files
Coverage Diff
| | master | #16 | +/- |
|---|---|---|---|
| Coverage | 96.76% | 97.08% | +0.31% |
| Files | 9 | 9 | |
| Lines | 1176 | 1200 | +24 |
| Hits | 1138 | 1165 | +27 |
| Misses | 38 | 35 | -3 |
Pull request overview
This PR replaces the boolean _untracked_flags system with fine-grained per-type bitmask tracking, enabling @with_pool to preserve the fast typed checkpoint/rewind path even when untracked acquire! calls occur in helper functions, as long as those types are covered by the macro's tracked set.
Changes:
- Replaced single boolean flag with UInt16 bitmask for tracking which fixed-slot types had untracked acquires
- Added _untracked_has_others flag for non-fixed-slot types
- Implemented bitmask subset check _can_use_typed_path to decide between typed and full checkpoint/rewind paths
- Updated all checkpoint/rewind/reset/empty! functions to maintain bitmask state
- Modified macro code generation to emit conditional branches based on bitmask subset checks
Reviewed changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| src/types.jl | Added _fixed_slot_bit function mapping and new bitmask fields to AdaptiveArrayPool struct |
| src/state.jl | Updated checkpoint/rewind/reset/empty! to maintain bitmask state; added _tracked_mask_for_types and _can_use_typed_path helpers |
| src/macros.jl | Modified checkpoint/rewind call generation to use bitmask subset checks |
| src/acquire.jl | Updated _mark_untracked! to set type-specific bitmask bits instead of boolean flag |
| src/convenience.jl | Updated all convenience functions to pass type parameter to _mark_untracked! |
| ext/AdaptiveArrayPoolsCUDAExt/types.jl | Added bitmask fields to CuAdaptiveArrayPool struct |
| ext/AdaptiveArrayPoolsCUDAExt/state.jl | Updated CUDA pool state management with bitmask tracking; added duplicate type handling in @generated functions |
| test/test_state.jl | Added comprehensive tests for bitmask metadata lifecycle, type marking, subset checks, and end-to-end scenarios |
| test/test_macro_expansion.jl | Added tests verifying macro expansion uses new bitmask functions |
| docs/src/architecture/macro-internals.md | Updated documentation to explain bitmask-based tracking system |
mgyoo86 referenced this pull request Feb 18, 2026
Summary
Replace the boolean _untracked_flags system with fine-grained per-type bitmask tracking, enabling @with_pool to keep the fast typed checkpoint/rewind path even when untracked acquire! calls occur in helper functions, as long as those types are already covered by the macro's tracked set.
Previously, any untracked acquire! call forced a full checkpoint/rewind over all 8 fixed-slot types. Now, each untracked call records which type it touched via a UInt16 bitmask, and the macro performs a subset check at runtime: if untracked ⊆ tracked, the typed (fast) path is preserved.
Key metrics:
Motivation
The @with_pool macro analyzes its AST to extract which types are used, enabling typed (fast) checkpoint/rewind that only saves/restores those specific type pools (~77% faster than full). However, acquire! calls inside helper functions are invisible to the macro and were marked as "untracked" with a single boolean flag per depth level.
This caused false-positive full rewinds in common patterns:
Design
Bitmask Subset Check
Each of the 8 fixed-slot types maps to a bit in a UInt16 via _fixed_slot_bit(T):
Float64, ComplexF64, Float32, ComplexF32, Int64, Bool, Int32, Bit (BitArray)
Non-fixed-slot types (e.g. UInt8) set a separate _untracked_has_others flag, which always forces the full path.
UInt16 supports up to 16 fixed-slot types (8 currently used, 8 reserved); if more are needed, widening to UInt32/UInt64 requires only a type alias change, since all bitwise operations remain identical.
The decision function performs a single-instruction subset check:
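The subset test and the width-independence claim can be sketched as follows. This is an illustrative reconstruction under assumptions, not the PR's code: `ToyMask`, `toy_is_subset`, and `toy_use_typed` are hypothetical names, and the bit positions chosen for Float64 and Float32 are placeholders.

```julia
# Hypothetical alias: changing this one line to UInt32 or UInt64 widens the
# mask without touching any of the bitwise code below.
const ToyMask = UInt16

# untracked ⊆ tracked  ⇔  no bit of `untracked` lies outside `tracked`.
toy_is_subset(untracked::ToyMask, tracked::ToyMask) = iszero(untracked & ~tracked)

# The has_others flag overrides the mask and always selects the full path.
toy_use_typed(untracked::ToyMask, has_others::Bool, tracked::ToyMask) =
    !has_others && toy_is_subset(untracked, tracked)

# Placeholder bit positions for two fixed-slot types.
f64 = ToyMask(1) << 0
f32 = ToyMask(1) << 2

empty_case    = toy_use_typed(ToyMask(0), false, f64)  # empty mask: trivially a subset
covered_case  = toy_use_typed(f64, false, f64)         # helper touched a tracked type
escaped_case  = toy_use_typed(f32, false, f64)         # helper touched an untracked type
others_case   = toy_use_typed(ToyMask(0), true, f64)   # non-fixed-slot type was acquired
```

Here `empty_case` and `covered_case` evaluate to `true` (typed path), while `escaped_case` and `others_case` evaluate to `false` (full path).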
How the Typed Path Decision Works
The macro statically extracts types from acquire! calls it can see. At runtime, the bitmask tracks what happened outside its visibility:
- mask=0 → trivially subset
- Float64, macro also tracks Float64
- Float32, macro only tracks Float64
- has_others=true → always forces full path
- use_typed=false at compile time, no bitmask check
*Checkpoint runs before the helper, so untracked mask is still empty → typed. Rewind runs after, sees the actual mask → falls back to full if needed. This asymmetry is safe because _rewind_typed_pool! uses depth-based orphan cleanup to restore types that were not checkpointed.
Benchmark Results (Before → After)
Hardware: Apple Silicon, Julia 1.12.5, AdaptiveArrayPools v0.1.2
False-Positive Scenarios (optimization targets)
True-Positive Scenarios (should be unchanged)
Raw Checkpoint/Rewind Cost (unchanged)
Allocations: zero across all scenarios (before and after).
Representative Test Scenario