How does typeid work and how do objects store class information?
Even when you do not use type information, an object of an empty class does not occupy zero bytes; it always has some size. If I remember correctly, the standard demands that.
I believe typeid is implemented similarly to a vtable pointer: the object has a "hidden" pointer to its type information.
How it is stored is implementation-defined. There are many completely different ways to do it.
However, for non-polymorphic types nothing needs to be stored. For non-polymorphic types typeid returns information about the static type of the expression, i.e. its compile-time type. The type is always known at compile time, so there's no need to associate any additional information with specific objects (just as you don't need to store the object size anywhere for sizeof to work). "An empty object" that you mention in your question would be an object of non-polymorphic type, so there's no need to store any type information in it. The non-zero size that the standard requires for such an object exists only so that distinct objects have distinct addresses; it has nothing to do with typeid. (Meanwhile, polymorphic objects are never really "empty": in typical implementations each one carries at least a hidden pointer.)
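A minimal sketch of the non-polymorphic case, assuming C++11 or later. The class names Plain and PlainDerived and the helper reports_declared_type are made up for illustration. It shows that typeid looks only at the declared type of the expression, so nothing has to live inside the object:

```cpp
#include <type_traits>
#include <typeinfo>

struct Plain {};                 // hypothetical empty, non-polymorphic class
struct PlainDerived : Plain {};  // derived, still non-polymorphic

static_assert(!std::is_polymorphic<Plain>::value, "Plain has no virtual functions");
static_assert(sizeof(Plain) >= 1, "complete objects never have size zero");

// typeid on a non-polymorphic glvalue does not inspect the object at all:
// the result is fixed at compile time by the declared type of the expression.
bool reports_declared_type(const Plain& p) {
    return typeid(p) == typeid(Plain);
}
```

Passing a PlainDerived object to reports_declared_type still yields true, because the reference is declared as Plain.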
For polymorphic types typeid does indeed return information about the dynamic type of the expression, i.e. about its run-time type. To implement this, something has to be stored inside the actual object at run time. As I said above, different compilers implement it differently. In MSVC++, for example, the VMT (virtual method table) pointer stored in each polymorphic object points to a data structure that contains the so-called RTTI (run-time type information) about the object, in addition to the actual VMT.
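For contrast, here is a small sketch of the polymorphic case. Shape, Circle and the helper functions are hypothetical names, and the code assumes RTTI is enabled (the default in mainstream compilers). The same kind of reference now reports the run-time type:

```cpp
#include <typeinfo>

struct Shape { virtual ~Shape() = default; };  // one virtual makes it polymorphic
struct Circle : Shape {};

// For a polymorphic glvalue, typeid consults the object itself
// (typically through its hidden vtable pointer) to find the dynamic type.
bool is_really_a_circle(const Shape& s) {
    return typeid(s) == typeid(Circle);
}

// A Circle passed through a Shape reference is still recognised as a Circle.
bool circle_through_base_ref() {
    Circle c;
    return is_really_a_circle(c);
}

// A plain Shape is not mistaken for a Circle.
bool plain_shape_is_circle() {
    Shape s;
    return is_really_a_circle(s);
}
```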
The fact that you mention zero-size objects in your question probably indicates that you have some misconceptions about what typeid can and cannot do. Remember, again, typeid is capable of determining the actual (i.e. dynamic) type of the object for polymorphic types only. For non-polymorphic types typeid cannot determine the actual type of the object and falls back to the static, compile-time type of the expression.
Imagine that every class that already has at least one virtual function also has the following virtual method, and that one std::type_info object is created for each type:
#include <typeinfo>

// "__" used to create reserved names in this pseudo-implementation
extern std::type_info __Example_info;  // the single type_info object for Example
struct Example {
    virtual std::type_info const& __typeid() const {
        return __Example_info;
    }
};
Then imagine that any use of typeid on an object, typeid(obj), becomes obj.__typeid(). Use on a dereferenced pointer, typeid(*pointer), similarly becomes pointer->__typeid(). (typeid(pointer) without the dereference describes the pointer type itself and is always resolved statically.) Except for null pointers, where typeid(*pointer) throws std::bad_typeid for a polymorphic type, the pointer case is identical to the non-pointer case after dereferencing, and I won't mention it further. When applied directly to a type, imagine that the compiler inserts a reference directly to the required object: typeid(Example) becomes __Example_info.
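The pointer cases can be checked with a short sketch. Node, Leaf and the three helpers are hypothetical names, and RTTI is assumed to be enabled:

```cpp
#include <typeinfo>

struct Node { virtual ~Node() = default; };  // polymorphic base
struct Leaf : Node {};

// typeid(p) describes the pointer type itself; the pointee is never examined.
bool pointer_type_is_static(Node* p) {
    return typeid(p) == typeid(Node*);
}

// typeid(*p) describes the dynamic type of the object p points to.
bool pointee_is_leaf(Node* p) {
    return typeid(*p) == typeid(Leaf);
}

// Dereferencing a null pointer to a polymorphic type inside typeid
// throws std::bad_typeid instead of invoking undefined behaviour.
bool throws_bad_typeid(Node* p) {
    try {
        static_cast<void>(typeid(*p).name());
    } catch (const std::bad_typeid&) {
        return true;
    }
    return false;
}
```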
If a class does not have RTTI (i.e. it has no virtuals; e.g. NoRTTI below), then imagine it with an identical __typeid method that is not virtual. This allows the same transformation into method calls as above, relying on virtual or non-virtual dispatch of those methods, as appropriate; it also allows some virtual method calls to be transformed into non-virtual dispatch, as can be performed for any virtual method.
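To make the mental model concrete, the imagined method can be written by hand. This is only an illustration of the idea, not how any particular compiler lays out its data; TypeTag is a hypothetical stand-in for std::type_info, and every class and function name here is invented:

```cpp
// Hypothetical stand-in for std::type_info: one object per type.
struct TypeTag { const char* name; };

// Polymorphic hierarchy: the tag accessor is virtual, so it follows the object.
struct Poly {
    static const TypeTag& static_tag() { static const TypeTag t{"Poly"}; return t; }
    virtual const TypeTag& tag() const { return static_tag(); }
    virtual ~Poly() = default;
};
struct PolyDerived : Poly {
    static const TypeTag& static_tag() { static const TypeTag t{"PolyDerived"}; return t; }
    const TypeTag& tag() const override { return static_tag(); }
};

// Non-polymorphic hierarchy: the accessor is non-virtual, so the call is
// bound to the declared type of the expression.
struct Flat {
    static const TypeTag& static_tag() { static const TypeTag t{"Flat"}; return t; }
    const TypeTag& tag() const { return static_tag(); }
};
struct FlatDerived : Flat {
    static const TypeTag& static_tag() { static const TypeTag t{"FlatDerived"}; return t; }
    const TypeTag& tag() const { return static_tag(); }  // hides Flat::tag
};

// True when the virtual call reaches the derived class's tag.
bool virtual_tag_is_dynamic(const Poly& p) {
    return &p.tag() == &PolyDerived::static_tag();
}

// True when the non-virtual call stays with the declared type's tag.
bool nonvirtual_tag_is_static(const Flat& f) {
    return &f.tag() == &Flat::static_tag();
}
```

Called with a PolyDerived and a FlatDerived object respectively, both helpers return true: the virtual accessor follows the object, while the non-virtual one follows the declared type of the reference.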
#include <typeinfo>

// A hierarchy can mix RTTI and no-RTTI, just as virtual methods can be
// introduced in a derived class even if the base doesn't contain any.
struct NoRTTI {};
struct A : NoRTTI { virtual ~A() {} };  // one virtual required for RTTI
struct B : A {};                        // ~B is virtual through inheritance

void typeid_with_rtti(A &a, B &b) {
    typeid(a); typeid(b);
    A local_a;  // no RTTI required: typeid(local_a);
    B local_b;  // no RTTI required: typeid(local_b);
    A &ref = local_b;
    // no RTTI required, if the compiler is smart enough: typeid(ref)
}
Here, typeid must use RTTI for both parameters (B could be a base class for a later type), but does not need RTTI for either local variable because the dynamic type (or "runtime type") is absolutely known. This matches, not coincidentally, how virtual calls can avoid virtual dispatch.
struct StillNoRTTI : NoRTTI {};

void typeid_without_rtti(NoRTTI &obj) {
    typeid(obj);
    StillNoRTTI derived; typeid(derived);
    NoRTTI &ref = derived; typeid(ref);
    // typeid on types never uses RTTI:
    typeid(A); typeid(B); typeid(NoRTTI); typeid(StillNoRTTI);
}
Here, typeid on either obj or ref yields the type information for NoRTTI! This is true even though obj may refer to an object of a derived class (it could really be an instance of A or B) and even though ref definitely refers to an object of a derived class. All of the other uses (the last line of the function) are also resolved statically.
Note that in these example functions, each typeid uses RTTI or not as the function name implies. (Hence the commented-out uses in typeid_with_rtti.)
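The mixed hierarchy can also be checked at run time. The sketch below reuses the class names from the examples, but gives A an inline destructor so that it links on its own; the two helper functions are hypothetical:

```cpp
#include <typeinfo>

struct NoRTTI {};                                // non-polymorphic base
struct A : NoRTTI { virtual ~A() = default; };   // polymorphic from here down
struct B : A {};

// Through a NoRTTI reference the static type wins, whatever the object is.
bool seen_as_NoRTTI(const NoRTTI& obj) {
    return typeid(obj) == typeid(NoRTTI);
}

// Through an A reference the dynamic type is looked up at run time.
bool seen_as_B(const A& obj) {
    return typeid(obj) == typeid(B);
}
```

The same B object is reported as NoRTTI by the first helper and as B by the second, depending only on the declared type of the reference it is viewed through.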