A high-performance .NET library for managing strings in unmanaged memory to reduce garbage collection pressure in string-intensive applications.
UnmanagedStringPool allocates a single contiguous block of unmanaged memory and provides string storage as lightweight PooledString structs. This approach eliminates per-string heap allocations and significantly reduces GC overhead in scenarios with high string throughput or large string datasets.
- Zero GC Pressure: Strings stored entirely in unmanaged memory
- Value Type Semantics: `PooledString` is a 12-byte struct with full copy semantics
- Automatic Memory Management: Built-in defragmentation, growth, and coalescing
- Thread-Safe Reads: Multiple threads can read strings concurrently
- Memory Efficient: 8-byte alignment, free block coalescing, size-indexed allocation
- Safe Design: Allocation IDs prevent use-after-free bugs
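The 8-byte alignment and growth behavior listed above come down to simple arithmetic. The following is a minimal sketch, not the library's code: `AlignUp`, `BlockSize`, and `GrownCapacity` are hypothetical names, UTF-16 storage (two bytes per character) is assumed, and the doubling factor is an assumed default.

```csharp
using System;

const int Alignment = 8;

// Round a byte count up to the next multiple of 8 (valid because 8 is a power of two).
int AlignUp(int bytes) => (bytes + Alignment - 1) & ~(Alignment - 1);

// Assumption: characters are stored as UTF-16, two bytes each.
int BlockSize(string s) => AlignUp(s.Length * sizeof(char));

// Multiply the capacity by the growth factor until the request fits.
// checked() turns an int overflow into an OverflowException instead of wrapping.
int GrownCapacity(int capacity, int required, double factor = 2.0)
{
    if (capacity <= 0 || factor <= 1.0)
        throw new ArgumentOutOfRangeException(nameof(capacity), "Need capacity > 0 and factor > 1.");
    while (capacity < required)
        capacity = checked((int)Math.Ceiling(capacity * factor));
    return capacity;
}

Console.WriteLine(BlockSize("Hello, World!"));   // 13 chars -> 26 bytes -> 32
Console.WriteLine(GrownCapacity(1024, 3000));    // 1024 -> 2048 -> 4096
```

Rounding every block to a multiple of 8 wastes at most 7 bytes per string, and in exchange every block starts on an aligned address.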
Traditional .NET strings are immutable objects on the managed heap. In high-throughput scenarios (parsers, caches, data processing), this creates significant GC pressure:
- Each string allocation can trigger a garbage collection
- Gen 0 collections become frequent
- Large strings (85,000 bytes or more) are allocated on the Large Object Heap, which is only collected during expensive Gen 2 collections
- Memory fragmentation from many small string objects
- Finalizers are needed to ensure unmanaged memory is cleaned up, but structs don't support them, so a class is required.
- A class per string would create significant GC load (even with the string data stored in unmanaged memory), so the finalizable class instead represents a pool, which can hold many strings and performs just one unmanaged memory allocation.
- Individual pooled strings are structs that point into a pool object. They have full copy semantics and involve no heap allocation.
- The pool implements IDisposable, with a finalizer, for memory safety.
- Invalid pointers are never dereferenced. If the pool is disposed, any string structs relying on it automatically become invalid.
- The pool's unmanaged memory is freed deterministically, but the tiny pool object itself (about 100 bytes) is garbage-collected normally.
- If a string within the pool is freed, its `allocation_id` is not reused, so any string structs pointing to it become invalid. Reusing the memory in the pool results in a different ID, which prevents old string structs from pointing to the new string.
- Freed space in the pool is reused where possible and periodically compacted.
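The design points above can be illustrated in miniature. This is a hedged sketch rather than the library's implementation: it assumes a dictionary keyed by allocation ID and a simple bump allocator that never reuses space, and `Allocate`, `Free`, `TryRead`, and `DisposePool` are hypothetical names. Bare `uint` IDs stand in for the string structs.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.InteropServices;

// Hypothetical miniature pool: one unmanaged block plus an ID -> (offset, length) table.
const int CapacityBytes = 1024;
IntPtr block = Marshal.AllocHGlobal(CapacityBytes);   // the single unmanaged allocation
var live = new Dictionary<uint, (int Offset, int Length)>();
uint nextId = 1;      // IDs only ever increase, so a freed ID is never handed out again
int nextOffset = 0;   // bump allocator; a real pool would also reuse freed space
bool disposed = false;

uint Allocate(string value)
{
    int bytes = value.Length * sizeof(char);
    if (disposed || nextOffset + bytes > CapacityBytes)
        throw new InvalidOperationException("Pool disposed or full.");
    Marshal.Copy(value.ToCharArray(), 0, block + nextOffset, value.Length);
    uint id = nextId++;
    live[id] = (nextOffset, value.Length);
    nextOffset += bytes;
    return id;
}

void Free(uint id) => live.Remove(id);

// Memory is only touched after the ID has been validated against the table.
bool TryRead(uint id, out string value)
{
    value = "";
    if (disposed || !live.TryGetValue(id, out var entry)) return false;
    value = Marshal.PtrToStringUni(block + entry.Offset, entry.Length);
    return true;
}

void DisposePool()
{
    if (disposed) return;
    Marshal.FreeHGlobal(block);
    live.Clear();
    disposed = true;
}

uint first = Allocate("Hello");
uint second = Allocate("World");
Free(first);
Console.WriteLine(TryRead(first, out _));              // False: stale ID is rejected
Console.WriteLine(TryRead(second, out string text));   // True
DisposePool();
Console.WriteLine(TryRead(second, out _));             // False: pool is disposed
```

Because every read checks the ID first, a stale or post-disposal handle produces a failed lookup instead of a dereference of freed memory.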
- Single Memory Block: One large allocation instead of thousands of small ones reduces OS memory management overhead
- Struct-Based References: `PooledString` structs (12 bytes) are stack-allocated or embedded in other structs, eliminating heap allocations for references
- Allocation IDs: Each allocation gets a unique, never-reused ID. This prevents dangling references: if a string is freed and its memory reallocated, old `PooledString` instances become safely invalid
- Automatic Defragmentation: At a 35% fragmentation threshold, the pool automatically compacts memory, updating all internal references transparently
- Size-Indexed Free Lists: Free blocks are tracked by size buckets for O(1) best-fit allocation
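To show why compaction can be transparent to callers, here is a small sketch under stated assumptions. A managed `char[]` stands in for the unmanaged block, fragmentation is assumed to mean the share of the used region taken up by freed holes, and all names (`Allocate`, `Free`, `Read`, `Compact`, `Fragmentation`) are hypothetical. Callers hold only IDs, so moving the data requires updating just the table.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// A managed char[] stands in for the unmanaged block to keep the sketch short.
char[] buffer = new char[64];
var table = new Dictionary<uint, (int Offset, int Length)>();   // ID -> location
uint nextId = 1;
int end = 0;                       // first unused position at the tail of the buffer
const double Threshold = 0.35;     // the 35% figure quoted above

uint Allocate(string s)
{
    s.CopyTo(0, buffer, end, s.Length);
    table[nextId] = (end, s.Length);
    end += s.Length;
    return nextId++;
}

string Read(uint id) => new string(buffer, table[id].Offset, table[id].Length);

// Assumed definition: fraction of the used region that is freed holes.
double Fragmentation() =>
    end == 0 ? 0 : 1.0 - (double)table.Values.Sum(e => e.Length) / end;

// Slide live entries toward the start and rewrite their offsets in the table.
void Compact()
{
    int write = 0;
    foreach (var (id, entry) in table.OrderBy(kv => kv.Value.Offset).ToList())
    {
        Array.Copy(buffer, entry.Offset, buffer, write, entry.Length);
        table[id] = (write, entry.Length);
        write += entry.Length;
    }
    end = write;
}

void Free(uint id)
{
    table.Remove(id);
    if (Fragmentation() >= Threshold) Compact();
}

uint a = Allocate("alpha");   // offset 0
uint b = Allocate("beta");    // offset 5
uint c = Allocate("gamma");   // offset 9
Free(a);                      // 5 of 14 chars are now a hole (about 36%), so Compact runs
Console.WriteLine(Read(b));            // beta
Console.WriteLine(table[b].Offset);    // 0: moved, but the ID is unchanged
Console.WriteLine(Read(c));            // gamma
```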
Ideal for:
- High-frequency string parsing and processing
- Large in-memory caches with string keys/values
- Protocol buffers and message processing
- Game engines with extensive text/localization
- Any scenario where string GC becomes a bottleneck
Not recommended for:
- Long-lived strings that rarely change
- Scenarios requiring string interning
- Applications with low string allocation rates
// Create a pool with 1MB initial size
using var pool = new UnmanagedStringPool(1024 * 1024);
// Allocate strings
PooledString str1 = pool.AllocateString("Hello, World!");
PooledString str2 = pool.AllocateString("Unmanaged strings!");
// Create empty strings (optimized - no memory allocation)
PooledString empty = pool.CreateEmptyString();
// Use spans directly to avoid heap allocations
Console.Out.WriteLine(str1.AsSpan()); // Console.Out.WriteLine accepts ReadOnlySpan<char>
int length = str2.Length;
char firstChar = str2[0];
// Strings can be explicitly freed
str1.Dispose();
// Pool automatically cleans up remaining allocations on disposal
Empty strings receive special optimization:
- Use reserved allocation ID (0) with no actual memory allocation
- Remain valid for read operations even after other strings are freed
- Important: Become invalid after pool disposal, since operations like `Insert()` require the pool to allocate memory for the resulting non-empty string
- All empty strings from any pool are considered equal
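The rules above can be expressed as two small predicates. This is an illustrative sketch only: the set of live IDs, the `(PoolId, AllocationId)` tuple used as a handle, and the names `IsValid` and `EmptyEquals` are assumptions made for the example, not the library's types.

```csharp
using System;
using System.Collections.Generic;

const uint EmptyId = 0;                    // reserved ID; never backed by pool memory
var live = new HashSet<uint> { 1, 2 };     // IDs of ordinary, non-empty allocations
bool poolDisposed = false;

// A handle is readable if its pool is alive and it is either empty or still allocated.
bool IsValid(uint id) => !poolDisposed && (id == EmptyId || live.Contains(id));

// Two empty handles compare equal without consulting either pool.
bool EmptyEquals((int PoolId, uint AllocationId) x, (int PoolId, uint AllocationId) y) =>
    x.AllocationId == EmptyId && y.AllocationId == EmptyId;

live.Remove(1);                                               // freeing another string...
Console.WriteLine(IsValid(EmptyId));                          // True: empty handle unaffected
Console.WriteLine(IsValid(1));                                // False
Console.WriteLine(EmptyEquals((7, EmptyId), (9, EmptyId)));   // True: pools differ, both empty
poolDisposed = true;
Console.WriteLine(IsValid(EmptyId));                          // False: pool is gone
```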
The project includes comprehensive test coverage across multiple areas:
- UnmanagedStringPoolTests.cs: Core functionality and basic operations
- UnmanagedStringPoolEdgeCaseTests.cs: Edge cases and error conditions
- FragmentationAndMemoryTests.cs: Memory management and defragmentation
- FragmentationTest.cs: Specific fragmentation scenarios
- PooledStringTests.cs: String operations and manipulations
- ConcurrentAccessTests.cs: Thread safety and concurrent operations
- DisposalAndLifecycleTests.cs: Object disposal and lifecycle management
- FinalizerBehaviorTests.cs: Finalizer and GC interaction tests
- ClearMethodTests.cs: Pool clearing operations
- IntegerOverflowTests.cs: Overflow protection and boundary conditions
- Allocation: O(1) average case with size-indexed free lists
- Deallocation: O(1) with immediate coalescing
- Defragmentation: O(n) where n is active allocations, triggered automatically
- Memory Overhead: ~8 bytes per allocation for alignment and metadata
- Growth: Configurable growth factor (default 2x) when pool exhausted
- Read Operations: Fully thread-safe
- Write Operations: Require external synchronization
- Disposal: Not thread-safe, ensure exclusive access
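One common way to supply the external synchronization for writes is a reader/writer lock. The sketch below uses a plain `List<string>` as a stand-in for the pool so that it runs on its own; `Add` and `LengthOf` are hypothetical wrappers. It also assumes that reads must be kept out while a write is in progress, which is the conservative choice if a write can move memory.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// A plain list stands in for the pool; the locking pattern is the point of the sketch.
var store = new List<string>();
var gate = new ReaderWriterLockSlim();

// Writers (allocate, free, clear) take the exclusive lock.
int Add(string value)
{
    gate.EnterWriteLock();
    try { store.Add(value); return store.Count - 1; }
    finally { gate.ExitWriteLock(); }
}

// Readers share the lock, so they never block each other.
int LengthOf(int index)
{
    gate.EnterReadLock();
    try { return store[index].Length; }
    finally { gate.ExitReadLock(); }
}

int seed = Add("seed");
long total = 0;
Parallel.For(0, 1000, i =>
{
    if (i % 100 == 0) Add("item" + i);             // occasional writer
    Interlocked.Add(ref total, LengthOf(seed));    // frequent readers
});
Console.WriteLine(total);         // 4000 (1000 reads of a 4-char string)
Console.WriteLine(store.Count);   // 11 (the seed plus 10 items)
gate.Dispose();                   // dispose only after all threads have finished
```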
# Build
dotnet build
# Run tests
dotnet test
# Run with detailed output
dotnet test --logger:"console;verbosity=detailed"
# Run specific test class
dotnet test --filter "FullyQualifiedName~UnmanagedStringPoolTests"
# Format code (important after making changes)
dotnet format
For detailed information about the test suite and coverage areas, see Tests/README.md.
- .NET 9.0 or later
MIT