Back Home

Memory Management

Memory management in C is significantly constrained compared to literally any other language.
Correctness and performance are essential within a compiler.

The principal memory management primitives are located in alloc.h and alloc.c.

The basic interface is:

void arena_allocator_create(struct arena_allocator **allocator);
void arena_allocator_free(struct arena_allocator **allocator);

void *arena_alloc(struct arena_allocator *allocator, size_t size);

(There are many helper methods such as arena_alloc_strdup to perform C-stdlib functions within an arena).

Correctness

All debug & test builds use address sanitiser and leak sanitiser where available, including in CI. This catches the vast majority of leaks, which primarily occur from the few usages of non-arena such as in the top-level compiler instances and the linker invokers. The small number of arenas used in the project, seperated generally by compiler stage, make verifying that they are freed very easy.

Performance

The tree/graph-like nature of ASTs and IR mean allocations must be very fast. Parsing extremely large (100k lines+) files with ALWAYS_MALLOC, which does not use the arena, can lead to compile times with ~45% of time spent in malloc and ~50% spent in free, dominating program execution.

The core arena_alloc function is carefully optimised to be extremely fast.

// marked as COLD to prevent inlining and keep the fast path small, as this is not a likely branch
COLD static void *arena_alloc_large(struct arena_allocator *allocator,
                                    size_t size);

static void *arena_alloc_in_new(struct arena_allocator *allocator,
                                size_t aligned);


void *arena_alloc(struct arena_allocator *allocator, size_t size) {
  // deterministically return NULL for zero-sized allocations
  if (!size) {
    return 0;
  }

  size_t aligned = ROUND_UP(size, ALIGNMENT);

  struct arena *arena;

  // unlikely macro uses `__builtin_expect` macro to inform compiler this is an unlikely branch
  if (UNLIKELY(aligned > BLOCK_SIZE)) {
    // slow path
    return arena_alloc_large(allocator, size);
  } else {
    arena = allocator->last;

    void *allocation;
    if (LIKELY(try_alloc_in_arena(arena, aligned, &allocation))) {
      return allocation;
    }
  }

  // need to create a new arena
  return arena_alloc_in_new(allocator, aligned);
}

BLOCK_SIZE & ALIGNMENT are both tunable. By default, block size is 4mb and alignment is 16.
The ALWAYS_MALLOC macro can be defined to force all allocations go through malloc instead without interfering with consumers.

Containers

Originally, the container types (principally struct vector and struct hashtbl) allocated their memory via malloc. They have since been changed to support using arenas, and the majority of usages in the compile use these variants. At some point I will remove the old versions entirely, but it is not a priority.