How is the sizeof operator implemented in C++?

http://en.wikipedia.org/wiki/Sizeof

Basically, to quote Bjarne Stroustrup's C++ FAQ:

Sizeof cannot be overloaded because built-in operations, such as incrementing a pointer into an array implicitly depends on it. Consider:

X a[10];
X* p = &a[3];
X* q = &a[3];
p++;    // p points to a[4]
    // thus the integer value of p must be
    // sizeof(X) larger than the integer value of q

Thus, sizeof(X) could not be given a new and different meaning by the programmer without violating basic language rules.


sizeof is an operator in C++, but not one that does any work at run time. The compiler evaluates it during compilation and inserts a constant equal to the size of the argument, so sizeof neither needs nor has any runtime support.

Edit: do you want to know how to determine the size of a class/structure by looking at its definition? The rules for this are part of the ABI, and compilers merely implement them. Basically, the rules consist of:

  1. size and alignment definitions for primitive types;
  2. structure, size and alignment of the various pointers;
  3. rules for packing fields in structures;
  4. rules about virtual table-related stuff (more esoteric).

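However, ABIs are platform- and often vendor-specific, so the size of A below depends on the target and on the packing in effect. Packing to 1 byte removes the padding after i, but it leaves j unaligned, which x86 tolerates and some architectures (IA64, for example) do not permit.

struct A
{
    char i;
    int  j;
};

// Only one of these holds for a given ABI and packing setting:
assert(sizeof(A) == 5);   // x86, MSVC #pragma pack(1)
assert(sizeof(A) == 8);   // x86, MSVC default: 3 padding bytes align the 4-byte j
assert(sizeof(A) == 16);  // an ABI where int is 8 bytes with 8-byte alignment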

No, you can't change it. What do you hope to learn from seeing an implementation of it?

What sizeof does can't be written in C++ using more basic operations. It's not a function, and it's not part of a library header the way printf or malloc are. It's built into the compiler.

Edit: If the compiler is itself written in C or C++, then you can think of the implementation being something like this:

size_t calculate_sizeof(expression_or_type)
{
   if (is_type(expression_or_type))
   {
       if (is_array_type(expression_or_type))
       {
           return array_size(expression_or_type) *
             calculate_sizeof(underlying_type_of_array(expression_or_type));
       }
       else
       {
           switch (expression_or_type)
           {
                case int_type:
                case unsigned_int_type:
                     return 4; //for example
                case char_type:
                case unsigned_char_type:
                case signed_char_type:
                     return 1;
                case pointer_type:
                     return 4; //for example

                //etc., for all the built-in types
                case class_or_struct_type:
                {
                     size_t base_size = compiler_overhead(expression_or_type);
                     for (/*loop over each class member*/)
                     {
                          base_size += calculate_sizeof(class_member) +
                              padding(class_member);
                     }
                     return round_up_to_multiple(base_size,
                              alignment_of_type(expression_or_type));
                }
                case union_type:
                {
                     size_t max_size = 0;
                     for (/*loop over each union member*/)
                     {
                          max_size = max(max_size, 
                             calculate_sizeof(class_member));
                     }
                     return round_up_to_multiple(max_size,
                            alignment_of_type(expression_or_type));
                }
           }
       }
   }
   else
   {
       return calculate_sizeof(type_of(expression_or_type));
   }
}

Note that this is very much pseudo-code. There are lots of things I haven't included, but this is the general idea. The compiler probably doesn't actually do this. It probably calculates the size of a type (including a class) once and stores it, instead of recalculating it every time you write sizeof(X). The compiler is also allowed to, for example, give pointers different sizes depending on what they point to.

Tags:

C++

Sizeof