Difference between malloc and calloc?

calloc() gives you a zero-initialized buffer, while malloc() leaves the memory uninitialized.

For large allocations, most calloc implementations under mainstream OSes will get known-zeroed pages from the OS (e.g. via POSIX mmap(MAP_ANONYMOUS) or Windows VirtualAlloc) so it doesn't need to write them in user-space. This is how normal malloc gets more pages from the OS as well; calloc just takes advantage of the OS's guarantee.

This means calloc memory can still be "clean" and lazily-allocated, and copy-on-write mapped to a system-wide shared physical page of zeros. (Assuming a system with virtual memory.) The effects are visible with performance experiments on Linux, for example.

Some compilers even can optimize malloc + memset(0) into calloc for you, but it's best to just use calloc in the source if you want zeroed memory. (Or if you were trying to pre-fault it to avoid page faults later, that optimization will defeat your attempt.)

If you aren't going to ever read memory before writing it, use malloc so it can (potentially) give you dirty memory from its internal free list instead of getting new pages from the OS. (Or instead of zeroing a block of memory on the free list for a small allocation).


Embedded implementations of calloc may leave it up to calloc itself to zero memory if there's no OS, or it's not a fancy multi-user OS that zeros pages to stop information leaks between processes.

On embedded Linux, malloc could mmap(MAP_UNINITIALIZED|MAP_ANONYMOUS), which is only enabled for some embedded kernels because it's insecure on a multi-user system.


The documentation makes the calloc look like malloc, which just does zero-initialize the memory; this is not the primary difference! The idea of calloc is to abstract copy-on-write semantics for memory allocation. When you allocate memory with calloc it all maps to same physical page which is initialized to zero. When any of the pages of the allocated memory is written into a physical page is allocated. This is often used to make HUGE hash tables, for example since the parts of hash which are empty aren't backed by any extra memory (pages); they happily point to the single zero-initialized page, which can be even shared between processes.

Any write to virtual address is mapped to a page, if that page is the zero-page, another physical page is allocated, the zero page is copied there and the control flow is returned to the client process. This works same way memory mapped files, virtual memory, etc. work.. it uses paging.

Here is one optimization story about the topic: http://blogs.fau.de/hager/2007/05/08/benchmarking-fun-with-calloc-and-zero-pages/


A less known difference is that in operating systems with optimistic memory allocation, like Linux, the pointer returned by malloc isn't backed by real memory until the program actually touches it.

calloc does indeed touch the memory (it writes zeroes on it) and thus you'll be sure the OS is backing the allocation with actual RAM (or swap). This is also why it is slower than malloc (not only does it have to zero it, the OS must also find a suitable memory area by possibly swapping out other processes)

See for instance this SO question for further discussion about the behavior of malloc


One often-overlooked advantage of calloc is that (conformant implementations of) it will help protect you against integer overflow vulnerabilities. Compare:

size_t count = get_int32(file);
struct foo *bar = malloc(count * sizeof *bar);

vs.

size_t count = get_int32(file);
struct foo *bar = calloc(count, sizeof *bar);

The former could result in a tiny allocation and subsequent buffer overflows, if count is greater than SIZE_MAX/sizeof *bar. The latter will automatically fail in this case since an object that large cannot be created.

Of course you may have to be on the lookout for non-conformant implementations which simply ignore the possibility of overflow... If this is a concern on platforms you target, you'll have to do a manual test for overflow anyway.

Tags:

C

Malloc

Calloc