How do shared pointers work?

I generally agree with James McNellis's answer. However there's one more point that should be mentioned.

As you may know, shared_ptr<T> may also be used when the type T is not fully defined.

That is:

class AbraCadabra;

boost::shared_ptr<AbraCadabra> myPtr;
// ...

This will compile & work. Unlike many other implementations of smart pointers, which actually demand the encapsulated type to be fully defined in order to use them. This is related to the fact that the smart pointer is supposed to know to delete the encapsulated object when it's no more referenced, and in order to delete an object one must know what it is.

This is achieved by the following trick: shared_ptr actually consists of the following:

  1. An opaque pointer to the object
  2. Shared reference counters (what James McNellis described)
  3. A pointer to the allocated factory that knows how to destroy your object.

The above factory is a helper object with a single virtual function, which is supposed to delete your object in a correct way.

This factory is actually created when you assign a value to your shared pointer.

That is, the following code

AbraCadabra* pObj = /* get it from somewhere */;
myPtr.reset(pObj);

This is where this factory is allocated. Note: the reset function is actually a template function. It actually creates the factory for the specified type (type of the object passed as a parameter). This is where your type should be fully defined. That is, if it's still not defined - you'll get a compilation error.

Note also: if you actually create an object of a derived type (derived from AbraCadabra), and assign it to the shared_ptr - it will be deleted in a correct way even if your destructor is not virtual. The shared_ptr will always delete the object according to the type that is sees in reset function.

So that shared_ptr is a pretty sophisticated variant of a smart pointer. It gives an awesome flexibility. However you should know that this flexibility comes at a price of an extremely bad performance compared to other possible implementations of the smart pointer.

On the other hand - there're so-called "intrusive" smart pointers. They don't have all that flexibility, however in contrast they give the best performance.

Pros of shared_ptr compared to inrusive smart pointers:

  • Very flexible usage. Only have to define the encapsulated type when assigning it to the shared_ptr. This is very valuable for big projects, reduces dependencies greatly.
  • The encapsulated type doesn't have to have a virtual destructor, still polymorphic types will be deleted correctly.
  • Can be used with weak pointers.

Cons of shared_ptr compared to inrusive smart pointers:

  1. Very barbaric performance and waste of heap memory. On assignment allocates 2 more objects: reference counters, plus the factory (waste of memory, slow). This however happens only on reset. When one shared_ptr is assigned to another one - nothing more is allocated.
  2. The above may throw an exception. (out-of-memory condition). In contrast intrsusive smart pointers may never throw (apart from process exceptions related to invalid memory access, stack overflow and etc.)
  3. Deletion of your object is also slow: need to deallocate another two structs.
  4. When working with intrusive smart pointers you may freely mix smart pointers with raw ones. This is ok because the actual reference counting resides inside the object itself, which is single. In contrast - with shared_ptr you may not mix with raw pointers.
    AbraCadabra* pObj = /* get it from somewhere */;
    myPtr.reset(pObj);
    // ...
    pObj = myPtr.get();
    boost::shared_ptr<AbraCadabra> myPtr2(pObj); // oops

The above will crash.


Basically, shared_ptr has two pointers: a pointer to the shared object and a pointer to a struct containing two reference counts: one for "strong references," or references that have ownership, and one for "weak references," or references that don't have ownership.

When you copy a shared_ptr, the copy constructor increments the strong reference count. When you destroy a shared_ptr, the destructor decrements the strong reference count and tests whether the reference count is zero; if it is, the destructor deletes the shared object because no shared_ptrs point to it anymore.

The weak reference count is used to support weak_ptr; basically, any time a weak_ptr is created from the shared_ptr, the weak reference count is incremented, and any time one is destroyed the weak reference count is decremented. As long as either the strong reference count or the weak reference count is greater than zero, the reference count struct will not be destroyed.

Effectively, as long as the strong reference count is greater than zero, the shared object will not be deleted. As long as the strong reference count or the weak reference count is not zero, the reference count struct will not be deleted.


There are at least three well-known mechanisms.

External Counters

When the first shared pointer to an object is created, a separate reference count object is created and initialized to 1. When the pointer is copied, the reference count is increased; when a pointer is destroyed it is decreased. Pointer assignment increases one count and decreases another (in that order, or else self-assignment ptr=ptr will break). If the reference count hits zero, no more pointers exist and the object is deleted.

Internal counters

An internal counter requires that the object pointed to has a counter field. This is usually achieved by deriving from a specific base class. In exchange, this saves a heap allocation of the reference count, and it allows repeated creation of shared pointers from raw pointers (with external counters, you'd end up with two counts for one object)

Circular links

Instead of using a counter, you can keep all shared pointers to an object in a circular graph. The first pointer created points to itself. When you copy a pointer, you insert the copy in the circle. When you delete it, you remove it from the circle. But when the destroyed pointer pointed to itself, i.e. when it's the only pointer, you delete the pointed-to object.

The downside is that removing a node from a circular single-linked list is rather expensive as you have to iterate over all nodes to find the predecessor. This can be especially painful due to poor locality of reference.

Variations

The 2nd and 3rd idea can be combined: the base class can be part of that circular graph, instead of containing a count. Of course, this means that the object can be deleted only when it points to itself (cycle length 1, no remaining pointers to it). Again, the advantage is that you can create smart pointers from weak pointers, but the poor performance of deleting a pointer from the chain remains an issue.

The exact graph structure for idea 3 doesn't matter too much. You could also create a binary tree structure, with the pointed-to object at the root. Again, the hard operation is removing a shared pointer node from that graph. The benefit is that if you have many pointers on many threads, growing part of the graph is not a highly contended operation.