How to implement a memory heap
Here is the classic allocator, and one of the best for non-multithreaded use:
http://gee.cs.oswego.edu/dl/html/malloc.html
You can learn a lot from reading the explanation of its design. The link to malloc.c
in the article is rotted; it can now be found at http://gee.cs.oswego.edu/pub/misc/malloc.c.
With that said, unless your program has really unusual allocation patterns, it's probably a very bad idea to write your own allocator or use a custom one. Especially if you're trying to replace the system malloc
, you risk all kinds of bugs and compatibility issues from different libraries (or standard library functions) getting linked to the "wrong version of malloc
".
If you find yourself needing specialized allocation for just a few specific tasks, that can be done without replacing malloc
. I would recommend looking up GNU obstack
and object pools for fixed-sized objects. These cover a majority of the cases where specialized allocation might have real practical usefulness.
Most C and C++ compilers already provide a heap memory-manager as part of the standard library, so you don't need to do anything at all in order to avoid hitting the OS with every request.
If you want to improve performance, there are a number of improved allocators around that you can simply link with and go. e.g. Hoard, which wheaties mentioned in a now-deleted answer (which actually was quite good -- wheaties, why'd you delete it?).
If you want to write your own heap manager as a learning exercise, here are the basic things it needs to do:
- Request a big block of memory from the OS
- Keep a linked list of the free blocks
- When an allocation request comes in:
- search the list for a block that's big enough for the requested size plus some book-keeping variables stored alongside.
- split off a big enough chunk of the block for the current request, put the rest back in the free list
- if no block is big enough, go back to the OS and ask for another big chunk
- When a deallocation request comes in
- read the header to find out the size
- add the newly freed block onto the free list
- optionally, see if the memory immediately following is also listed on the free list, and combine both adjacent blocks into one bigger one (called coalescing the heap)
- Yes, both stdlib heap and OS heap / virtual memory are pretty troublesome. OS calls are really slow, and stdlib is faster, but still has some "unnecessary" locks and checks, and adds a significant overhead to allocated blocks (ie some memory is used for management, in addition to what you allocate).
- In many cases its possible to avoid dynamic allocation completely, by using static structures instead. For example, sometimes its better (safer etc) to define a 64k static buffer for unicode filename, than define a pointer/std:string and dynamically allocate it.
- When the program has to allocate a lot of instances of the same structure, its much faster to allocate large memory blocks and then just store the instances there (sequentially or using a linked list of free nodes) - C++ has a "placement new" for that.
- In many cases, when working with varible-size objects, the set of possible sizes is actually very limited (eg. something like 4+2*(1..256)), so its possible to use a few pools like [3] without having to collect garbage, fill the gaps etc.
- Its common for a custom allocator for specific task to be much faster than one(s) from standard library, and even faster than speed-optimized, but too universal implementations.
- Modern CPUs/OSes support "large pages", which can significantly improve the memory access speed when you explicitly work with large blocks - see http://7-max.com/
You allocate a chunk of memory at the beginning of the program large enough to sustain its need. Then you have to override new and/or malloc, delete and/or free to return memory from/to this buffer.
When implementing this kind of solution, you need to write your own allocator(to source from the chunk) and you may end up using more than one allocator which is often why you allocate a memory pool in the first place.
Default memory allocator is a good all around allocator but is not the best for all allocation needs. For example, if you know you'll be allocating a lot of object for a particular size, you may define an allocator that allocates fixed size buffer and pre-allocate more than one to gain some efficiency.