Introduction
When I started updating Cdecl, one of the first things I did was, as I then described, to use an abstract syntax tree (AST) data structure to represent a declaration. During construction of the AST, placeholder nodes sometimes need to be created until the final type is able to be determined. For example, consider the declaration:
int a[2][3];
When parsing, upon encountering the first [, we know it’s an array 2 of “something” of int, but we don’t yet know either what the “something” is or whether it will turn out to be nothing. It’s not until the second [ that we know it’s an array 2 of array 3 of int. (Had the [3] not been there, then it would have been just array 2 of int.) Once the final type is known, the placeholder node can be replaced by a new code of the correct type.
You might think all that needs to be done is to call free on the placeholder node. You could, but then you’d have to ensure that there are no other pointers pointing to it, not only in the tree, but in local variables in stack frames of functions on the call stack lest you get a dangling pointer. Simpler, would be to use arenas.
Arenas
An arena typically is a large chunk of dynamically allocated memory from which actual objects of interest are subsequently allocated from using a custom allocator. There are many ways to implement arenas depending upon the answers to the following questions:
- Will all objects that will be allocated from it be of the same type (or at least size) or different types (or sizes)?
- Will the maximum number of objects that will be allocated from it be known in advance?
- If not, will the arena need to grow as needed?
- Will individual objects need to be deallocated from the arena?
- Does it need to be thread-safe? (If your particular program doesn’t use threads, then the answer is “no.”)
About the only thing all arena implementations have in common is that freeing the arena frees all objects in it — which is one of the main reasons for using an arena in the first place.
An Arena for Cdecl
In the case of Cdecl, the answers to the previous questions are:
- Yes. (All objects are AST nodes.)
- No. (Won’t know maximum number of objects.)
- Yes. (Will need to grow as needed.)
- No. (Individual objects will not need deallocation.)
- No. (Does not need to be thread-safe.)
Given those answers, all that’s needed to implement an arena is a simple, singly linked list, something like:
typedef struct slist slist;
struct slist {
slist *next;
void *data;
};
void slist_push_front( slist **phead, void *data ) {
slist *const new = malloc( sizeof *new );
assert( new != NULL );
*new = (slist){ .next = *phead, .data = data };
*phead = new;
}
Then to put all AST nodes into the arena:
c_ast_t c_ast_new( /* params */, slist **parena ) {
c_ast_t *const ast = malloc( *ast );
// ...
slist_push_front( parena, ast );
return ast;
}
Finally, to free all AST nodes in one go (which is the whole point):
void slist_cleanup( slist **phead,
void (*free_fn)( void* ) ) {
slist *node = *phead, *next;
for ( ; node != NULL; node = next ) {
next = node->next;
if ( free_fn != NULL )
(*free_fn)( node->data );
free( node );
}
*phead = NULL;
}
As arenas go, a linked list is pretty simple. It might even be a stretch to call it an arena at all. In the case of Cdecl, it’s good enough — don’t over-engineer things.
More Complex Arenas
A slightly more complex arena would be one where you build on top of the linked list where each node would be a “block” of objects. For example, the first block could hold a maximum of, say, 8 objects. If you run out of space, you can allocate more blocks. Each can either be the same size or double the previous size up to a maximum size:
typedef struct arena arena;
struct arena {
slist *blocks;
size_t block_size_min, block_size_max;
size_t obj_size;
size_t avail;
};
void arena_init( arena *a,
size_t block_size_min,
size_t block_size_max,
size_t obj_size ) {
*a = (arena){
.block_size_min = block_size_min,
.block_size_max = block_size_max,
.obj_size = obj_size
};
}
A block will store both its size and the memory for objects:
typedef struct arena_block arena_block;
struct arena_block {
size_t size;
alignas( max_align_t ) char data[];
};
This uses a flexible array member for the object storage. Given that, we can implement an allocator:
void* arena_alloc( arena *a ) {
if ( a->avail == 0 ) {
arena_block *block = arena_node2block( a->blocks );
size_t const block_size = block == NULL ?
a->block_size_min :
block_size <= a->block_size_max / 2 ?
block_size << 1 :
a->block_size_max;
block = malloc( sizeof(arena_block) + block_size * a->obj_size );
assert( block != NULL );
a->avail = block->size = block_size;
slist_push_front( &a->blocks, block->data );
}
return &a->blocks->data[ --a->avail * a->obj_size ];
}
If a->avail is 0, it means we have to allocate a new block:
- Ignoring
arena_node2blockfor now, ifblockisNULL, there are no blocks, soblock_sizewill beblock_size_min. - Otherwise, if doubling
block_sizewon’t exceedblock_size_max, then double it. - Otherwise, use
block_size_max.
Then allocate memory for a block and block_size objects and push data — not block — onto the head of the linked list of blocks. This makes it simpler to calculate the address of any object in a block as it done on the return line.
The size is “hidden” in memory before the object pointed to by slist’s data. This trick is commonly used in allocators. Indeed, it’s typically how malloc stores the size of a chunk of memory so that it later can be recovered.
But, given an slist* for a block, how can you get at the block’s size? There is where arena_node2block comes in:
static inline arena_block* arena_node2block( slist *node ) {
return node == NULL ? NULL :
(arena_block*)(node->data - offsetof(arena_block, data));
}
The expression node->data, being an array, “decays” into a pointer to its first element. In memory, size, as mentioned, is stored before that. You might think that it’s just sizeof(size_t) bytes before node->data — and it could be. The problem is that there may be padding between size and data. To subtract the correct number of bytes that includes the padding (if any), the standard macro offsetof can be used. In this case, it will return the offset of data into the structure that includes the padding (if any).
To implement a deallocator:
void arena_cleanup( arena *a, void (*free_fn)(void*) ) {
for ( slist *node = a->blocks, *next_node;
node != NULL; node = next_node ) {
arena_block *const block = arena_node2block( node );
if ( free_fn != NULL ) {
for ( size_t i = a->avail; i < block->size; ++i )
(*free_fn)( &block->data[ i * a->obj_size ] );
a->avail = 0;
}
free( block );
next_node = node->next;
free( node );
}
a->blocks = NULL;
}
The code is fairly straightforward. The only thing to remember is that the first block may not be full and you should call free_fn only only actual objects. Hence, we start iterating at a->avail rather than 0; then set a->avail to 0 for the remaining blocks that are all full.
Conclusion
Using arenas for object management can simplify memory management in the special case when an entire collection of objects are all deallocated together. Arenas can also be more performant since it required fewer calls to malloc and free. There are other possible arena implementations in addition to the two presented here. Keep arenas in mind for your next project.
Top comments (0)