Difference between a Structure and a Union
With a union, you're only supposed to use one of the elements, because they're all stored at the same spot. This makes it useful when you want to store something that could be one of several types. A struct, on the other hand, has a separate memory location for each of its elements and they all can be used at once.
To give a concrete example of their use, I was working on a Scheme interpreter a little while ago and I was essentially overlaying the Scheme data types onto the C data types. This involved storing in a struct an enum indicating the type of value and a union to store that value.
union foo {
int a; // can't use both a and b at once
char b;
} foo;
struct bar {
int a; // can use both a and b simultaneously
char b;
} bar;
union foo x;
x.a = 3; // OK
x.b = 'c'; // NO! this affects the value of x.a!
struct bar y;
y.a = 3; // OK
y.b = 'c'; // OK
edit: If you're wondering what setting x.b to 'c' changes the value of x.a to, technically speaking the C standard doesn't pin it down: the result depends on the size and byte order of an int on your platform. On most modern machines a char is 1 byte and an int is 4 bytes, so giving x.b the value 'c' also overwrites the first byte in memory of x.a with that same value:
union foo x;
x.a = 3;
x.b = 'c';
printf("%i, %i\n", x.a, x.b);
prints
99, 99
Why are the two values the same? Because the other 3 bytes of the int 3 are all zero, so the whole int is also read as 99. If we put in a larger number for x.a, you'll see that this is not always the case:
union foo x;
x.a = 387439;
x.b = 'c';
printf("%i, %i\n", x.a, x.b);
prints
387427, 99
To get a closer look at the actual memory values, let's set and print out the values in hex:
union foo x;
x.a = 0xDEADBEEF;
x.b = 0x22;
printf("%x, %x\n", x.a, x.b);
prints
deadbe22, 22
You can clearly see where the 0x22 overwrote the 0xEF.
BUT
In C, the order of bytes in an int is not defined. This program overwrote the 0xEF with 0x22 on my Mac, but there are other platforms where it would overwrite the 0xDE instead, because the order of the bytes that make up the int is reversed. Therefore, when writing a program, you should never rely on the behavior of overwriting specific data in a union because it's not portable.
For more reading on the ordering of bytes, check out endianness.
Here's the short answer: a struct is a record structure: each element in the struct allocates new space. So, a struct like
struct foobarbazquux_t {
int foo;
long bar;
double baz;
long double quux;
};
allocates at least (sizeof(int)+sizeof(long)+sizeof(double)+sizeof(long double))
bytes in memory for each instance. ("At least" because architecture alignment constraints may force the compiler to pad the struct.)
On the other hand,
union foobarbazquux_u {
int foo;
long bar;
double baz;
long double quux;
};
allocates one chunk of memory and gives it four aliases. So sizeof(union foobarbazquux_u) ≥ max(sizeof(int), sizeof(long), sizeof(double), sizeof(long double)), again with the possibility of some extra padding for alignment.
Is there any good example to give the difference between a 'struct' and a 'union'?
An imaginary communications protocol
struct packetheader {
int sourceaddress;
int destaddress;
int messagetype;
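union { // only one of these is valid, depending on messagetype
char fourcc[4];
int requestnumber;
} request; // member name, so the field is reachable as .request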
};
In this imaginary protocol, it has been specified that, based on the "message type", the following location in the header will be either a request number or a four-character code, but not both. In short, unions allow the same storage location to represent more than one data type, where it is guaranteed that you will only want to store one of the types of data at any one time.
Unions are largely a low-level detail based in C's heritage as a system programming language, where "overlapping" storage locations are sometimes used in this way. You can sometimes use unions to save memory where you have a data structure where only one of several types will be saved at one time.
In general, the OS doesn't care or know about structs and unions -- they are both simply blocks of memory to it. A struct is a block of memory that stores several data objects, where those objects don't overlap. A union is a block of memory that stores several data objects, but has only storage for the largest of these, and thus can only store one of the data objects at any one time.