Why doesn't the compiler allow std::string inside a union?

Because having a class with a non-trivial (copy/)constructor in a union doesn't make sense. Suppose we have

union U {
  string x;
  vector<int> y;
};

U u;  // <--

If U were a struct, u.x and u.y would be initialized to an empty string and an empty vector respectively. But the members of a union share the same address. So if u.x is initialized, u.y contains invalid data, and vice versa. If neither is initialized, neither can be used. In any case, such members cannot be handled easily in a union, so C++98 forbids them (§9.5/1):

An object of a class with a non-trivial constructor (12.1), a non-trivial copy constructor (12.8), a non-trivial destructor (12.4), or a non-trivial copy assignment operator (13.5.3, 12.8) cannot be a member of a union, nor can an array of such objects.

In C++0x this rule has been relaxed (§9.5/2):

At most one non-static data member of a union may have a brace-or-equal-initializer. [Note: if any non-static data member of a union has a non-trivial default constructor (12.1), copy constructor (12.8), move constructor (12.8), copy assignment operator (12.8), move assignment operator (12.8), or destructor (12.4), the corresponding member function of the union must be user-provided or it will be implicitly deleted (8.4.3) for the union. — end note ]
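To make the note concrete, here is a minimal sketch (assuming a C++11 compiler) that checks the deleted functions with type traits. U mirrors the union from the top of this answer; V is a hypothetical variation I added only to show the effect of user-provided functions.

```cpp
#include <string>
#include <type_traits>
#include <vector>

// Same shape as the union U above: both members have non-trivial special
// member functions, so the corresponding functions of U are implicitly deleted.
union U {
  std::string x;
  std::vector<int> y;
};

static_assert(!std::is_default_constructible<U>::value, "U() is deleted");
static_assert(!std::is_copy_constructible<U>::value, "U(const U&) is deleted");
static_assert(!std::is_destructible<U>::value, "~U() is deleted");

// Hypothetical variation: user-provided functions make the type usable again,
// but they have to decide on their own which member is alive.
union V {
  std::string x;
  std::vector<int> y;
  V() : x() {}  // constructs x as the active member
  ~V() {}       // destroys nothing; the owner must destroy the active member first
};

static_assert(std::is_default_constructible<V>::value, "V() is user-provided");
static_assert(std::is_destructible<V>::value, "~V() is user-provided");
```

Defining U is allowed; it is only an attempt to create, copy, or destroy a U object that fails to compile. With V, whoever owns the object has to call v.x.~basic_string() (or destroy whichever member is active) before the V itself goes away.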

but it is still not possible for the compiler to generate correct constructors or destructors for the union. For example, how would you or the compiler write a copy constructor for the union above without extra information? To know which member of the union is active, you need a tagged union, and you need to handle construction and destruction manually, e.g.

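#include <new>      // placement new
#include <string>

enum { TU_INT, TU_FLOAT, TU_STRING };  // tag values (numbering is arbitrary)

struct TU {
   int type;
   union {            // anonymous union, so members are accessed directly;
     int i;           // a named member of an unnamed union type would have
     float f;         // a deleted constructor and destructor here
     std::string s;
   };

   TU(const TU& tu) : type(tu.type) {
     switch (tu.type) {
       case TU_STRING: new (&s) std::string(tu.s); break;  // construct in place
       case TU_INT:    i = tu.i;                   break;
       case TU_FLOAT:  f = tu.f;                   break;
     }
   }
   ~TU() {
     if (type == TU_STRING)
       s.~basic_string();  // destroy the string only when it is the active member
   }
   ...
};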

But, as @DeadMG has mentioned, this is already implemented as boost::variant or boost::any.
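As a rough illustration of what such a library type buys you, here is a small sketch. Note that it uses std::variant from C++17 rather than boost::variant, so that it needs no external library; the two are similar in spirit but not identical. The alias Value and the helper describe are names I made up for this example.

```cpp
#include <string>
#include <variant>

// Hypothetical alias for illustration; requires C++17.
using Value = std::variant<int, float, std::string>;

// Returns a readable name for the alternative currently held.
inline std::string describe(const Value& v) {
  if (std::holds_alternative<int>(v))   return "int";
  if (std::holds_alternative<float>(v)) return "float";
  return "string";
}
```

With this, Value a = std::string("hello"); Value b = a; copies the string correctly, and a later b = 42; destroys the string before storing the int. The tag bookkeeping and the placement new / destructor calls from the TU example are all done by the library.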


Think about it. How does the compiler know what type is in the union?

It doesn't. The fundamental operation of a union is essentially a bitwise cast. Operations on values contained within unions are only safe when each type can essentially be filled with garbage. std::string can't, because that would result in memory corruption. Use boost::variant or boost::any.


In C++98/03, a union can't have members with non-trivial constructors, destructors, or copy assignment operators, which also rules out classes with virtual member functions or virtual base classes.

So basically, you can only use built-in data types or PODs.

Note that this changes in C++0x (C++11), which introduces unrestricted unions:

union {
    int z;
    double w;
    string s;  // Illegal in C++98, legal in C++0x.
};
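Legal does not mean automatic, though. Below is a minimal sketch (assuming C++11 or later) of what using the string member involves. Storage is a hypothetical named version of the union above, and length_via_union is a helper invented for this example.

```cpp
#include <cstddef>
#include <new>
#include <string>

// Hypothetical named version of the union above (C++11 or later).
union Storage {
  int z;
  double w;
  std::string s;
  Storage() : z(0) {}  // start with z as the active member
  ~Storage() {}        // cannot know which member is active, so does nothing
};

// Makes s the active member, uses it, and ends its lifetime by hand.
inline std::size_t length_via_union(const char* text) {
  Storage st;                     // z is active
  new (&st.s) std::string(text);  // begin the lifetime of s with placement new
  std::size_t n = st.s.size();
  st.s.~basic_string();           // end it manually before st is destroyed
  return n;
}
```

If the explicit destructor call is forgotten, the string's buffer may leak; if it is called while s is not the active member, the behavior is undefined. That is why a tag (or a library variant type) is normally kept next to such a union.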

Tags:

C++