Project Description
This was a mini competition run by the Vulnerability Research Interest Group (VRIG) at RITSEC at the start of Spring 2026. We studied modern allocator implementations (specifically Mimalloc, PartitionAlloc, TCMalloc, Scalloc, Scudo, and bmalloc), then built our own from scratch. The benchmark harness was created by Oleg and tests basic allocator behavior, memory fragmentation, throughput, tail latency, and a suite of security edge cases.
The benchmark covers correctness (malloc/free, realloc, calloc, alignment, size boundaries), stress and threaded suites (long realloc chains, 1M+ ops, producer-consumer, mixed sizes), fragmentation ("Swiss cheese" patterns, peak live set cycles), edge and security cases (invalid frees, double frees, stack frees / house of spirit, heap poisoning / house of lore, top chunk abuse / house of force), and realistic workloads replaying traces from Redis YCSB, SQLite TPC-C, and Firefox page loads.
This post covers my entry: Zialloc. The full project blog including entries from other VRIG members is published at blog.ritsec.club.
Zialloc Design
I'm still working on this project on and off, so this design doc may not reflect the latest security features and execution paths. The overall layout won't change, though.
Size Model
Zialloc uses fixed size classes for regular allocations plus a directly-mapped XL path.
- Reserved heap virtual address space (default): 100 GB
- Segment size / alignment: 128 MiB
- Page classes: SMALL (1 MiB), MEDIUM (8 MiB), LARGE (16 MiB), XL (direct mapping)
- Chunk size thresholds: 512 KiB – 16 B (small), 4 MiB – 16 B (medium), 8 MiB – 16 B (large); above is XL
- Regular requests are normalized: round to at least 16, round up to next power-of-two, clamp to class cap
- Final stride = align_up(normalized_request + inline_header, 16)
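The size model above can be sketched in a few lines. This is an illustration rather than Zialloc's actual code: it assumes the inline header is 16 bytes and reads each threshold as "class cap minus 16 B", and the names classify, class_cap, and stride_for are made up for the example.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t KiB = 1024, MiB = 1024 * KiB;
constexpr std::size_t kInlineHeader = 16;  // assumption: 16-byte inline chunk header

enum class SizeClass { Small, Medium, Large, XL };

// Largest request each regular class accepts (thresholds from the list above).
constexpr std::size_t class_cap(SizeClass c) {
    switch (c) {
        case SizeClass::Small:  return 512 * KiB - 16;
        case SizeClass::Medium: return 4 * MiB - 16;
        case SizeClass::Large:  return 8 * MiB - 16;
        default:                return 0;  // XL is direct-mapped and has no cap
    }
}

constexpr SizeClass classify(std::size_t request) {
    if (request <= class_cap(SizeClass::Small))  return SizeClass::Small;
    if (request <= class_cap(SizeClass::Medium)) return SizeClass::Medium;
    if (request <= class_cap(SizeClass::Large))  return SizeClass::Large;
    return SizeClass::XL;
}

// a must be a power of two.
constexpr std::size_t align_up(std::size_t n, std::size_t a) {
    return (n + a - 1) & ~(a - 1);
}

constexpr std::size_t next_pow2(std::size_t n) {
    std::size_t p = 1;
    while (p < n) p <<= 1;
    return p;
}

// Slot stride for a regular request; returns 0 for XL, which has no stride.
constexpr std::size_t stride_for(std::size_t request) {
    const SizeClass c = classify(request);
    if (c == SizeClass::XL) return 0;
    std::size_t n = request < 16 ? 16 : request;   // round to at least 16
    n = next_pow2(n);                              // round up to a power of two
    if (n > class_cap(c)) n = class_cap(c);        // clamp to the class cap
    return align_up(n + kInlineHeader, 16);        // add header, align to 16
}
```

Under these assumptions a 100-byte request normalizes to 128 and gets a 144-byte stride, while a 300 KiB request clamps to the small cap and lands on a 512 KiB stride, so exactly two such slots fit in a 1 MiB small page.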
Heap Layout
At initialization, Zialloc reserves a large virtual memory region (100 GB) and commits 128 MiB segments from it on demand. It immediately seeds one segment each for the small, medium, and large classes.
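For readers who haven't seen the reserve-then-commit pattern, here is a minimal Linux sketch using mmap and mprotect. It is not taken from zialloc/os.cpp: the helper names reserve_aligned and commit_segment are hypothetical, and the over-reserve-and-trim trick for alignment is just one common way to get a 128 MiB-aligned base.

```cpp
#include <sys/mman.h>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kSegment = 128ull << 20;  // 128 MiB segment size / alignment

// Reserve `bytes` of address space with no access rights, aligned to kSegment.
// Nothing is committed yet. Returns nullptr on failure.
inline void* reserve_aligned(std::size_t bytes) {
    const std::size_t padded = bytes + kSegment;  // over-reserve so we can align
    void* raw = mmap(nullptr, padded, PROT_NONE,
                     MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    if (raw == MAP_FAILED) return nullptr;

    const auto addr = reinterpret_cast<std::uintptr_t>(raw);
    const std::uintptr_t aligned =
        (addr + kSegment - 1) & ~(static_cast<std::uintptr_t>(kSegment) - 1);

    // Give back the unaligned head and the leftover tail.
    if (aligned > addr) munmap(raw, aligned - addr);
    const std::size_t tail = (addr + padded) - (aligned + bytes);
    if (tail != 0) munmap(reinterpret_cast<void*>(aligned + bytes), tail);
    return reinterpret_cast<void*>(aligned);
}

// Make segment number `index` inside the reservation readable and writable.
inline bool commit_segment(void* base, std::size_t index) {
    void* seg = static_cast<char*>(base) + index * kSegment;
    return mprotect(seg, kSegment, PROT_READ | PROT_WRITE) == 0;
}
```

Calling reserve_aligned once at init and commit_segment whenever a class runs out of pages mirrors the behavior described above. Physical memory is only consumed when committed pages are first touched.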
Hierarchy: Each segment is classed by size — all pages in a segment have the same page size. Within one page, all slots have identical stride/usable size and allocation state is tracked with a bitmap. XL allocations bypass the class system and are mmapped as standalone mappings with inline headers.
Metadata model (out-of-line, allocator-owned):
- Chunks resolve their owning page and slot index via pointer arithmetic on themselves
- Per-page: bitmap, used counts, owner TID, deferred-free ring
- Per-segment: class, page array, full-page count, chunk-geometry lock-in, integrity key/canary
- XL metadata is inline in front of the returned pointer (XLHeader)
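The "pointer arithmetic on themselves" lookup can be sketched as follows. Because segments are 128 MiB-aligned, masking the low bits of any chunk address yields the segment base, and the page and slot follow from the offset. The SlotLocation struct and locate function are hypothetical, and the sketch assumes pages start at the segment base with slots packed from the start of each page, which may not match the real layout.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::uintptr_t kSegmentSize = 128ull << 20;  // 128 MiB, from the size model

struct SlotLocation {             // hypothetical result type
    std::uintptr_t segment_base;  // start of the owning segment
    std::size_t page_index;       // page within the segment
    std::size_t slot_index;       // slot within the page
};

// Resolve an address to its segment, page, and slot.
// page_size and stride come from the segment's class and the page's geometry.
inline SlotLocation locate(std::uintptr_t addr, std::size_t page_size,
                           std::size_t stride) {
    const std::uintptr_t base = addr & ~(kSegmentSize - 1);  // mask to segment start
    const std::uintptr_t offset = addr - base;
    const std::size_t page = offset / page_size;
    const std::size_t slot = (offset % page_size) / stride;
    return {base, page, slot};
}
```

For example, with 1 MiB pages and a 144-byte stride, an address 5 MiB + 3 * 144 + 16 bytes into a segment resolves to page 5, slot 3. The trailing 16 bytes (the header, in this sketch) do not change the result because integer division rounds down.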
Allocation Workflow
Allocation enters through API wrappers (malloc, calloc, realloc) and dispatches to HeapState::allocate(size).
- Validate request size and compute size class (SM/MD/LG/XL)
- XL has a second chance: if it fits a large-page chunk geometry, reroute to large class
- Fast path: search thread-local cached pages by class
- Next path: search thread-local preferred segment
- Next path: shard queue of known non-full segments
- Next path: bounded scan of same-class segments
- Slow path: carve another segment from the pre-reserved virtual address space
- Fallback: mmap a new segment-aligned mapping
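The lookup order above is a chain of progressively more expensive tiers, each of which either returns a chunk or passes to the next. A minimal sketch of that control flow, with a hypothetical Tier callback type and allocate_tiered helper standing in for the real code paths:

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// One lookup tier: returns a chunk pointer, or nullptr to pass to the next tier.
using Tier = std::function<void*(std::size_t)>;

// Walk the tiers from cheapest to most expensive and return the first hit.
// If served_by is non-null it receives the index of the tier that succeeded,
// or -1 when every tier failed.
inline void* allocate_tiered(const std::vector<Tier>& tiers, std::size_t size,
                             int* served_by = nullptr) {
    for (std::size_t i = 0; i < tiers.size(); ++i) {
        if (void* p = tiers[i](size)) {
            if (served_by) *served_by = static_cast<int>(i);
            return p;
        }
    }
    if (served_by) *served_by = -1;
    return nullptr;
}
```

In this framing the tiers would be, in order: thread-local cached pages, the thread-local preferred segment, the shard queue, the bounded same-class scan, a fresh segment from the reservation, and finally a new mmap. Tracking which tier served a request is also a cheap way to collect hit-rate statistics.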
Bitmap/chunk behavior: The chunk allocator inside a page is bitmap-driven. Allocation searches from a hint, finds the first zero bit, marks it, writes a chunk header, and returns a pointer after the header. Free validates the header/magic/owner/slot, clears the bit, decrements used count, and updates first_hint. Double-free detection works by checking if a bitmap bit is already clear.
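A toy version of the bitmap logic, to make the hint and double-free check concrete. The PageBitmap struct is hypothetical and fixed at 256 slots. It scans one bit at a time for clarity, where a real allocator would scan a word at a time, and it reports a double free by returning false where Zialloc aborts.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

struct PageBitmap {  // hypothetical stand-in for per-page slot tracking
    static constexpr std::size_t kSlots = 256;
    std::array<std::uint64_t, kSlots / 64> bits{};  // 1 = allocated
    std::size_t used = 0;
    std::size_t first_hint = 0;  // where the next search starts

    // Find the first clear bit starting at the hint, mark it, return its index.
    // Returns -1 when the page is full.
    long alloc() {
        for (std::size_t n = 0; n < kSlots; ++n) {
            const std::size_t i = (first_hint + n) % kSlots;
            const std::uint64_t mask = 1ull << (i % 64);
            if ((bits[i / 64] & mask) == 0) {
                bits[i / 64] |= mask;
                ++used;
                first_hint = (i + 1) % kSlots;
                return static_cast<long>(i);
            }
        }
        return -1;
    }

    // Clear the slot's bit. Returns false if the slot index is out of range
    // or the bit is already clear, which indicates a double free.
    bool free_slot(std::size_t i) {
        if (i >= kSlots) return false;
        const std::uint64_t mask = 1ull << (i % 64);
        if ((bits[i / 64] & mask) == 0) return false;  // already free
        bits[i / 64] &= ~mask;
        --used;
        if (i < first_hint) first_hint = i;  // reuse the lowest free slot first
        return true;
    }
};
```

Pulling the hint back on free means the lowest freed slot is reused first, which keeps live chunks packed toward the front of the page.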
Free Workflow
Free enters through free() and dispatches to HeapState::free_ptr(ptr, usable_out). Note: "freeing" here means undoing physical mappings, not necessarily returning memory to the OS.
- Null free is ignored
- Inline ChunkHeader is parsed and CHUNK_MAGIC is validated; invalid state aborts
- If freeing thread is not the page owner, allocator attempts a deferred enqueue into the page-local lock-free ring
- If enqueue fails (queue full/contention), falls back to direct free
- Deferred frees are drained by the owner-side allocation path opportunistically when queue pressure is high
- Chunk free itself is a bitmap bit clear + used count decrement (+ optional zero-on-free)
- XL: checks XL_MAGIC, optionally zeroes payload, unmaps the entire mapping
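The first few decisions in the free path can be sketched as a small routing function. Everything here is illustrative: the header layout, the magic value, and the route_free name are assumptions. In particular the sketch stores the owner TID in the inline header for simplicity, whereas the design above keeps it in per-page metadata.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

constexpr std::uint32_t kChunkMagic = 0x5A49414Cu;  // hypothetical value ("ZIAL")

struct ChunkHeader {          // hypothetical 16-byte inline header
    std::uint32_t magic;      // must equal kChunkMagic
    std::uint32_t slot;       // slot index within the page
    std::uint64_t owner_tid;  // assumption: owner stored inline for this sketch
};
static_assert(sizeof(ChunkHeader) == 16, "sketch assumes a 16-byte header");

enum class FreeRoute { Ignore, Abort, Deferred, Direct };

// Decide what free() should do with a user pointer.
inline FreeRoute route_free(const void* user_ptr, std::uint64_t current_tid) {
    if (user_ptr == nullptr) return FreeRoute::Ignore;  // null free is a no-op

    // The header sits immediately in front of the user pointer.
    ChunkHeader h;
    std::memcpy(&h, static_cast<const unsigned char*>(user_ptr) - sizeof h,
                sizeof h);

    if (h.magic != kChunkMagic) return FreeRoute::Abort;  // corrupt or foreign
    return h.owner_tid == current_tid ? FreeRoute::Direct
                                      : FreeRoute::Deferred;
}
```

A Deferred result would then attempt the ring enqueue and fall back to the direct path if the ring is full.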
Deferred-Free Ring — Unintended Bonus
The deferred-free queue is a bounded per-page ring that lets a remote thread hand a free to the page's owner instead of mutating a page it doesn't own. This gives a cheeky capability for detecting UAFs (if checks are enabled) and can delay reuse, acting as a pseudo temporal quarantine by preventing writes to pointers currently in the queue.
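For a sense of what a bounded lock-free ring like this can look like, here is a sketch based on Dmitry Vyukov's bounded queue design, restricted to many producers and a single consumer (the page owner). It is a swapped-in textbook technique and not necessarily how Zialloc implements its ring. DeferredRing, try_push, and try_pop are hypothetical names.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Bounded multi-producer, single-consumer ring. N must be a power of two.
template <std::size_t N>
class DeferredRing {
    static_assert(N >= 2 && (N & (N - 1)) == 0, "N must be a power of two");

    struct Cell {
        std::atomic<std::size_t> seq;  // turn counter for this cell
        void* ptr;
    };
    Cell cells_[N];
    std::atomic<std::size_t> tail_{0};  // shared by remote (producer) threads
    std::size_t head_ = 0;              // touched only by the owner thread

public:
    DeferredRing() {
        for (std::size_t i = 0; i < N; ++i)
            cells_[i].seq.store(i, std::memory_order_relaxed);
    }

    // Called by remote threads. Returns false when the ring is full, in
    // which case the caller falls back to a direct free.
    bool try_push(void* p) {
        std::size_t pos = tail_.load(std::memory_order_relaxed);
        for (;;) {
            Cell& c = cells_[pos & (N - 1)];
            const std::size_t seq = c.seq.load(std::memory_order_acquire);
            const auto diff = static_cast<std::ptrdiff_t>(seq) -
                              static_cast<std::ptrdiff_t>(pos);
            if (diff == 0) {
                // Cell is free for this position; try to claim it.
                if (tail_.compare_exchange_weak(pos, pos + 1,
                                                std::memory_order_relaxed)) {
                    c.ptr = p;
                    c.seq.store(pos + 1, std::memory_order_release);
                    return true;
                }
                // CAS failure reloaded pos; loop and retry.
            } else if (diff < 0) {
                return false;  // cell not yet drained: ring is full
            } else {
                pos = tail_.load(std::memory_order_relaxed);  // lost a race
            }
        }
    }

    // Called only by the owning thread while draining.
    bool try_pop(void*& out) {
        Cell& c = cells_[head_ & (N - 1)];
        if (c.seq.load(std::memory_order_acquire) != head_ + 1)
            return false;  // empty, or the producer has not published yet
        out = c.ptr;
        c.seq.store(head_ + N, std::memory_order_release);  // free for next lap
        ++head_;
        return true;
    }
};
```

Pointers sitting in the ring have not had their bitmap bit cleared yet, so their slots cannot be handed out again until the owner drains them. That delay is what gives the quarantine-like effect described above.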
Security Strategy
Zialloc uses several integrity checks plus optional hardening toggles:
- Pointer/header ownership checks before free and usable-size operations
- Abort-on-corruption for invalid headers, bad transitions, and detected double frees
- Segment integrity key/canary check in validation path
- Optional zero-on-free memory scrubbing
- Optional UAF check path in usable_size (aborts if slot is no longer marked allocated)
Known Limits
- Heap layout is not optimal; metadata lookup/access isn't as good as radix trees
- XL allocations are direct-mapped and behave differently from class-segmented allocations
- Segment classing and fixed chunk geometry per segment trade memory efficiency for predictable behavior
- Deferred ring for cross-thread frees is capped and may fall back to direct page free
Source Map
- API entrypoints, init/teardown, stats: zialloc/alloc.cpp
- Core allocator internals (heap/segment/page/cache/deferred free): zialloc/segments.cpp
- OS mapping/protection/reservation wrappers: zialloc/os.cpp
- Shared constants/macros/enums: zialloc/types.h, zialloc/mem.h
- Memory interface declarations: zialloc/zialloc_memory.hpp
API Surface
Supported: malloc, free, realloc, calloc, usable_size, print_stats, validate_heap, get_stats, init, teardown
Not yet implemented: memalign, aligned_alloc, free_sized, realloc_array, bulk_free
Other Entries
Two other VRIG members submitted allocators for the competition:
- share-rdAlloc — A 3-tiered allocator inspired by TCMalloc. Reserves 1 TB of virtual memory at init with a 512 GB small heap, 512 GB medium heap, and mmap-backed massive allocations (>8 MB). Uses thread-local free lists with batch refills of 256 objects and a per-class mutex for medium tiers. Named after an inside joke.
- dualalloc — Supports two modes: a simple per-allocation mmap/munmap default mode, and a GAMBLE_MODE arena-based design with multiple arenas, spans, and block metadata including magic value validation. By Dylan Pachan.
The full technical details for all three allocators are on the RITSEC blog.