Is ‘int main;’ a valid C/C++ program?
Is int main; a valid C/C++ program?
It is not entirely clear what a C/C++ program is.
Is int main; a valid C program?
Yes. A freestanding implementation is allowed to accept such a program. main doesn't have to have any special meaning in a freestanding environment.
It is not valid in a hosted environment.
Is int main; a valid C++ program?
Ditto.
Why does it crash?
The program doesn't have to make sense in your environment. In a freestanding environment, program startup and termination, and the meaning of main, are implementation-defined.
Why does the compiler warn me?
The compiler may warn you about whatever it pleases, as long as it doesn't reject conforming programs. On the other hand, a warning is all that is required to diagnose a non-conforming program. Since this translation unit cannot be part of a valid hosted program, a diagnostic message is justified.
Is gcc a freestanding environment, or is it a hosted environment?
Yes.
gcc documents the -ffreestanding compilation flag. Add it, and the warning goes away. You may want to use it when building e.g. kernels or firmware.
g++ doesn't document such a flag. Supplying it seems to have no effect on this program. It is probably safe to assume that the environment provided by g++ is hosted. The absence of a diagnostic in this case is a bug.
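Whether a given compilation is hosted or freestanding can be checked from inside the program with the standard __STDC_HOSTED__ macro, which gcc sets to 0 under -ffreestanding. A minimal sketch (the function name is_hosted is made up for illustration):

```c
/* Reports which kind of implementation this translation unit was compiled for.
   is_hosted is a hypothetical helper name, not part of any library. */
int is_hosted(void)
{
#if defined(__STDC_HOSTED__) && __STDC_HOSTED__ == 1
    return 1;   /* hosted: main has its usual special meaning */
#else
    return 0;   /* freestanding: startup and main are implementation-defined */
#endif
}
```

Built with plain gcc this should return 1; built with gcc -ffreestanding it should return 0.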
Since the question is double-tagged as C and C++, the reasoning for C++ and C would be different:
- C++ uses name mangling to help the linker distinguish between textually identical symbols of different types, e.g. a global variable xyz and a free-standing global function xyz(int). However, the name main is never mangled.
- C does not use mangling, so it is possible for a program to confuse the linker by providing a symbol of one kind in place of a different symbol, and have the program successfully link.
That is what's going on here: the linker expects to find the symbol main, and it does. It "wires" that symbol as if it were a function, because it does not know any better. The portion of the runtime library that passes control to main asks the linker for main, so the linker gives it the symbol main, letting the link phase complete. Of course this fails at runtime, because main is not a function.
Here is another illustration of the same issue:
file x.c:
#include <stdio.h>
int foo(); // <<== main() expects this
int main(){
    printf("%p\n", (void*)&foo);
    return 0;
}
file y.c:
int foo; // <<== external definition supplies a symbol of a wrong kind
compiling:
gcc x.c y.c
This compiles, and it would probably run, but it's undefined behavior, because the type of the symbol promised to the compiler is different from the actual symbol supplied to the linker.
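The usual way to let the compiler catch this kind of mismatch is to put the declaration in a header that both files include, so the definition is checked against it. A sketch, with the hypothetical header foo.h and the helper call_foo shown inline so the example fits in one block:

```c
/* --- foo.h (hypothetical shared header) --- */
int foo(void);              /* the one declaration every file sees */

/* --- y.c, which would #include "foo.h" --- */
/* Writing `int foo;` here would now be a compile-time error, because it
   conflicts with the declaration above. A matching definition is accepted. */
int foo(void)
{
    return 42;              /* arbitrary value for the demonstration */
}

/* --- x.c, which would also #include "foo.h" --- */
int call_foo(void)
{
    return foo();           /* type-checked against the shared declaration */
}
```

No such header exists for the startup code's view of main, which is why the compiler can only warn in that case.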
As far as the warning goes, I think it is reasonable: C lets you build libraries that have no main function, so the compiler frees up the name main for other uses if you need to define a variable main for some unknown reason.
main isn't a reserved word; it's just a predefined identifier (like cin, endl, npos...), so you could declare a variable called main, initialize it and then print out its value.
Of course:
- the warning is useful since this is quite error prone;
- you can have a source file without the main() function (libraries).
EDIT
Some references:
main is not a reserved word (C++11):
The function main shall not be used within a program. The linkage (3.5) of main is implementation-defined. A program that defines main as deleted or that declares main to be inline, static, or constexpr is ill-formed. The name main is not otherwise reserved. [ Example: member functions, classes and enumerations can be called main, as can entities in other namespaces. — end example ]
C++11 - [basic.start.main] 3.6.1.3
[2.11/3] [...] some identifiers are reserved for use by C++ implementations and standard libraries (17.6.4.3.2) and shall not be used otherwise; no diagnostic is required.
[17.6.4.3.2/1] Certain sets of names and function signatures are always reserved to the implementation:
- Each name that contains a double underscore __ or begins with an underscore followed by an uppercase letter (2.12) is reserved to the implementation for any use.
- Each name that begins with an underscore is reserved to the implementation for use as a name in the global namespace.
Reserved words in programming languages.
Reserved words may not be redefined by the programmer, but predefineds can often be overridden in some capacity. This is the case of main: there are scopes in which a declaration using that identifier redefines its meaning.
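A short C sketch of such scopes: main used as a block-scope variable and as a structure member. The names local_main_value and struct options are made up for illustration.

```c
/* A structure member may be called main; members have their own name space. */
struct options {
    int main;
};

/* Inside this function, main names a local int and hides any outer declaration. */
int local_main_value(void)
{
    int main = 42;                  /* ordinary block-scope variable */
    struct options opts = { 7 };    /* opts.main is initialized to 7 */
    return main + opts.main;        /* 42 + 7 */
}
```

Both uses are legal, although a compiler may still warn that main is usually a function.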
It is a warning as it is not technically disallowed. The startup code will use the symbol location of "main" and jump to it with the three standard arguments (argc, argv and envp). It does not, and at link time cannot check that it's actually a function, nor even that it has those arguments. This is also why int main(int argc, char **argv) works - the compiler doesn't know about the envp argument and it just happens not to be used, and it is caller-cleanup.
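What the startup code does can be imitated in plain C. This is only a sketch under assumptions: fake_start, entry_fn and sample_entry are hypothetical names, and real startup code is implementation-specific and usually written in assembly. The point is that the caller holds nothing but an address and a calling convention.

```c
#include <stddef.h>

/* Hypothetical type for "whatever the symbol main points at". */
typedef int (*entry_fn)(int, char **, char **);

/* Stand-in for a user's main; it ignores argv and envp and returns argc. */
static int sample_entry(int argc, char **argv, char **envp)
{
    (void)argv;
    (void)envp;
    return argc;
}

/* Imitation of startup code: build the three arguments and jump to the entry. */
int fake_start(entry_fn entry)
{
    static char name[] = "prog";
    static char *argv[] = { name, NULL };
    static char *envp[] = { NULL };
    return entry(1, argv, envp);    /* no check that entry is really code */
}
```

In this sketch the pointer type is correct, so the call is well-defined; the real startup code has no such guarantee about main.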
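As a joke, you could do something like
int main = 0xC3C3C3C3;
on an x86 machine and, ignoring warnings and similar stuff, it will not just compile but actually work too, provided the section holding main is executable: 0xC3 is the opcode of the near ret instruction, so the call from the startup code returns immediately.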
Somebody used a technique similar to this to write an executable (sort of) that runs on multiple architectures directly - http://phrack.org/issues/57/17.html#article . It was also used to win the IOCCC - http://www.ioccc.org/1984/mullender/mullender.c .