Why does a MAP_GROWSDOWN mapping not grow?
I know the OP has already accepted one of the answers, but unfortunately it does not explain why MAP_GROWSDOWN seems to work sometimes. Since this Stack Overflow question is one of the first hits in search engines, let me add my answer for others.
The documentation of MAP_GROWSDOWN needs updating. In particular:
This growth can be repeated until the mapping grows to within a page of the high end of the next lower mapping, at which point touching the "guard" page will result in a SIGSEGV signal.
In reality, the kernel does not allow a MAP_GROWSDOWN mapping to grow closer than stack_guard_gap pages away from the preceding mapping. The default value is 256, but it can be overridden on the kernel command line. Since your code does not specify any desired address for the mapping, the kernel chooses one automatically, and that address is quite likely to end up within 256 pages of the end of an existing mapping.
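To check whether a given mapping is in that situation, something like the following rough sketch could be used. It scans /proc/self/maps for the nearest mapping that ends at or below an address and counts the pages in between. The helper names and the ASSUMED_GUARD_GAP_PAGES constant are mine, not kernel APIs, and the check is a simplified model: it assumes the default gap of 256 pages and ignores any special cases the kernel applies to particular neighbouring mappings.

```c
#include <inttypes.h>
#include <limits.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

/* Assumption: stack_guard_gap is at its default of 256 pages. */
#define ASSUMED_GUARD_GAP_PAGES 256L

/*
 * Hypothetical helper: number of whole pages between `addr` and the end of
 * the nearest mapping that ends at or below `addr`.
 * Returns -1 if /proc/self/maps cannot be opened and LONG_MAX if no
 * mapping ends at or below `addr`.
 */
static long pages_to_lower_mapping(const void *addr)
{
    FILE *f = fopen("/proc/self/maps", "r");
    if (!f)
        return -1;

    uint64_t target = (uint64_t)(uintptr_t)addr;
    uint64_t best_end = 0;
    int found = 0;
    char line[1024];

    while (fgets(line, sizeof line, f)) {
        uint64_t start, end;
        /* Each entry starts with "start-end" in hexadecimal. */
        if (sscanf(line, "%" SCNx64 "-%" SCNx64, &start, &end) != 2)
            continue;
        if (end <= target && end > best_end) {
            best_end = end;
            found = 1;
        }
    }
    fclose(f);

    if (!found)
        return LONG_MAX;
    return (long)((target - best_end) / (uint64_t)sysconf(_SC_PAGESIZE));
}

/*
 * Simplified model: growing a mapping that starts at `start` down by one
 * page must still leave ASSUMED_GUARD_GAP_PAGES free pages below it.
 * Returns 0 when the maps file could not be read.
 */
static int has_room_to_grow(const void *start)
{
    long pages = pages_to_lower_mapping(start);
    return pages > ASSUMED_GUARD_GAP_PAGES;
}
```

Calling has_room_to_grow() on the pointer returned by mmap, before touching anything below it, gives a quick hint as to whether the kernel placed the mapping too close to its lower neighbour.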
EDIT:
Additionally, on x86, kernels before v5.0 deny access to an address that is more than 64 KiB + 256 bytes below the stack pointer. See this kernel commit for details.
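As an illustration of that rule, here is a small model of the old check as I understand it. The 256 bytes are 32 machine words on x86-64. The function and macro names are made up for this sketch, and it only mirrors the arithmetic, not the kernel's actual fault-handling code.

```c
#include <stddef.h>
#include <stdint.h>

/* Slack allowed below the stack pointer by the pre-5.0 x86 check:
 * 64 KiB plus 32 machine words (256 bytes on x86-64). */
#define PRE50_SLACK ((size_t)65536 + 32 * sizeof(unsigned long))

/*
 * Model of the old rule: nonzero if a faulting user access at `addr`
 * would still be allowed to expand a grows-down mapping when the stack
 * pointer is `sp`; zero if it would be rejected outright.
 */
static int pre50_access_allowed(uintptr_t addr, uintptr_t sp)
{
    return addr + PRE50_SLACK >= sp;
}
```

A mapping placed far below the real stack fails this check for every address in its guard region, which is why a test program has to move the stack pointer temporarily before touching the page below the mapping on such kernels.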
This program works on x86 even with pre-5.0 kernels:
#include <sys/mman.h>
#include <stdint.h>
#include <stdio.h>

#define PAGE_SIZE 4096UL
#define GAP (512 * PAGE_SIZE)

/* Dump the current memory map of this process to stdout. */
static void print_maps(void)
{
    FILE *f = fopen("/proc/self/maps", "r");
    if (f) {
        char buf[1024];
        size_t sz;
        while ((sz = fread(buf, 1, sizeof buf, f)) > 0)
            fwrite(buf, 1, sz, stdout);
        fclose(f);
    }
}

int main(void)
{
    char *p;
    void *stack_ptr;

    /* Choose an address well below the default process stack. */
    asm volatile ("mov %%rsp,%[sp]"
                  : [sp] "=g" (stack_ptr));
    stack_ptr -= (intptr_t)stack_ptr & (PAGE_SIZE - 1);
    stack_ptr -= GAP;
    printf("Ask for a page at %p\n", stack_ptr);
    p = mmap(stack_ptr, PAGE_SIZE, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_STACK | MAP_ANONYMOUS | MAP_GROWSDOWN,
             -1, 0);
    if (p == MAP_FAILED) {
        perror("mmap");
        return 1;
    }
    printf("Mapped at %p\n", p);
    print_maps();
    getchar();

    /* One page is already mapped: stack pointer does not matter. */
    *p = 'A';
    printf("Set content of that page to \"%s\"\n", p);
    print_maps();
    getchar();

    /* Expand down by one page: point %rsp at the mapping while storing. */
    asm volatile (
        "mov %%rsp,%[sp]"       "\n\t"
        "mov %[ptr],%%rsp"      "\n\t"
        "movb $'B',-1(%%rsp)"   "\n\t"
        "mov %[sp],%%rsp"
        : [sp] "+&g" (stack_ptr)
        : [ptr] "g" (p)
        : "memory");
    printf("Set end of guard page to \"%s\"\n", p - 1);
    print_maps();
    getchar();

    return 0;
}
Replace:
volatile char *c_ptr_1 = mapped_ptr - 4096; //1 page below
With:
volatile char *c_ptr_1 = mapped_ptr;
Because:
The return address is one page lower than the memory area that is actually created in the process's virtual address space. Touching an address in the "guard" page below the mapping will cause the mapping to grow by a page.
Note that I tested the solution and it works as expected on kernel 4.15.0-45-generic.
First of all, you don't want MAP_GROWSDOWN, and it's not how the main thread stack works (see "Analyzing memory mapping of a process with pmap. [stack]"). Nothing uses it, and pretty much nothing should use it. The stuff in the man page saying it's "used for stacks" is wrong and should be fixed.
I suspect it might be buggy (because nothing uses it so usually nobody cares or even notices if it breaks.)
Your code works for me if I change the mmap call to map more than 1 page. Specifically, I tried 4096 * 100. I'm running Linux 5.0.1 (Arch Linux) on bare metal (Skylake).
/proc/PID/smaps does show a gd flag.
And then (when single-stepping the asm) the maps entry does actually change to a lower start address but the same end address, so it is literally growing downward when I start with a 400 kiB mapping. This gives a 400 kiB initial allocation above the address returned by mmap, which grows to 404 kiB when the program runs. (The size for a _GROWSDOWN mapping is not the growth limit or anything like that.)
https://bugs.centos.org/view.php?id=4767 may be related; something changed between kernel versions in CentOS 5.3 and 5.5. And/or it had something to do with working in a VM (5.3) vs. not growing and faulting on bare metal (5.5).
I simplified the C to use ptr[-4095] etc.:
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>

int main(void)
{
    volatile char *ptr = mmap(NULL, 4096*100,
                              PROT_READ | PROT_WRITE,
                              MAP_ANONYMOUS | MAP_PRIVATE | MAP_STACK | MAP_GROWSDOWN,
                              -1, 0);
    if (ptr == MAP_FAILED) {
        int error_code = errno;
        fprintf(stderr, "Cannot create MAP_GROWSDOWN mapping. "
                        "Error code = %d, details = %s\n", error_code, strerror(error_code));
        exit(EXIT_FAILURE);
    }
    ptr[0] = 'a';      // address returned by mmap
    ptr[-4095] = 'b';  // grow by 1 page
}
Compiling with gcc -Og gives asm that's nice-ish to single-step.
BTW, various rumours about the flag having been removed from glibc are obviously wrong. This source does compile, and it's clear that it's also supported by the kernel, not silently ignored. (Although the behaviour I see with size 4096 instead of 400 kiB is exactly consistent with the flag being silently ignored. However, the gd VmFlag is still there in smaps, so it's not ignored at that stage.)
I checked and there was room for it to grow without coming close to another mapping. So IDK why it didn't grow when the GD mapping was only 1 page. I tried a couple times and it segfaulted each time. With the larger initial mapping it never faulted.
Both times were with a store to the mmap return value (the first page of the mapping proper), then a store 4095 bytes below that.