pattern to avoid dynamic_cast

According to your comments, what you have stumbled upon is known (dubiously) as the Expression Problem, as expressed by Philip Wadler:

The Expression Problem is a new name for an old problem. The goal is to define a datatype by cases, where one can add new cases to the datatype and new functions over the datatype, without recompiling existing code, and while retaining static type safety (e.g., no casts).

That is, extending both "vertically" (adding types to the hierarchy) and "horizontally" (adding functions to the base class, to be overridden) is hard on the programmer.

There was a long (as always) discussion about it on Reddit in which I proposed a solution in C++.

It is a bridge between OO (great at adding new types) and generic programming (great at adding new functions). The idea is to have a hierarchy of pure interfaces and a set of non-polymorphic types. Free functions are defined on the concrete types as needed, and the bridge with the pure interfaces is provided by a single template class for each interface (supplemented by a template function for automatic deduction).

I have found a single limitation to date: if a function returns a Base interface, the object may have been created as a plain Base wrapper, even though the wrapped type now supports more operations. This is typical of a modular design (the new functions were not available at the call site). I think it illustrates a clean design, though I understand one could want to "recast" it to a richer interface. Go can, with language support (basically, runtime introspection of the available methods). I don't want to code this in C++.

As I already explained on Reddit, I'll just reproduce and tweak the code I submitted there.

So, let's start with 2 types and a single operation.

struct Square { double side; };
double area(Square const s);

struct Circle { double radius; };
double area(Circle const c);

Now, let's make a Shape interface:

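#include <memory>

class Shape {
public:
   virtual ~Shape() {}

   virtual double area() const = 0;

protected:
   // Declaring the copy constructor suppresses the implicit default
   // constructor, so it has to be spelled out for derived classes.
   Shape() {}
   Shape(Shape const&) {}
   Shape& operator=(Shape const&) { return *this; }
};

typedef std::unique_ptr<Shape> ShapePtr;

template <typename T>
class ShapeT: public Shape {
public:
   explicit ShapeT(T const t): _shape(t) {}

   virtual double area() const {
      // The member area() hides the free functions; the using-declaration
      // restores them and lets argument-dependent lookup pick the overload.
      using ::area;
      return area(_shape);
   }

private:
   T _shape;
};

template <typename T>
ShapePtr newShape(T t) { return ShapePtr(new ShapeT<T>(t)); }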

Okay, C++ is verbose. Let's check the use immediately:

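#include <iostream>
#include <vector>

double totalArea(std::vector<ShapePtr> const& shapes) {
   double total = 0.0;
   for (ShapePtr const& s: shapes) { total += s->area(); }
   return total;
}

int main() {
  // A braced initializer list would have to copy the (move-only) unique_ptrs,
  // so the elements are pushed one by one.
  std::vector<ShapePtr> shapes;
  shapes.push_back(newShape(Square{5.0}));
  shapes.push_back(newShape(Circle{3.0}));

  std::cout << totalArea(shapes) << "\n";
}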

So, first exercise, let's add a shape (yep, that's all):

struct Rectangle { double length, height; };
double area(Rectangle const r);
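To check that the pieces fit together, here is a minimal compilable sketch of the design so far. The function bodies, the literal value of pi, and the helper `demoTotalArea` are my own assumptions for illustration, not part of the code posted on Reddit. Note that inside the adapter the member `area()` would otherwise hide the free functions, so the sketch uses a block-scope using-declaration to let argument-dependent lookup find them.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Concrete, non-polymorphic value types with free functions.
struct Square { double side; };
double area(Square const s) { return s.side * s.side; }

struct Circle { double radius; };
double area(Circle const c) { return 3.141592653589793 * c.radius * c.radius; }

// Pure interface.
class Shape {
public:
    virtual ~Shape() {}
    virtual double area() const = 0;
};

typedef std::unique_ptr<Shape> ShapePtr;

// Adapter: forwards the virtual call to the matching free function.
template <typename T>
class ShapeT : public Shape {
public:
    explicit ShapeT(T const t) : _shape(t) {}

    virtual double area() const {
        using ::area;         // un-hide the free functions, enable ADL
        return area(_shape);
    }

private:
    T _shape;
};

template <typename T>
ShapePtr newShape(T t) { return ShapePtr(new ShapeT<T>(t)); }

double totalArea(std::vector<ShapePtr> const& shapes) {
    double total = 0.0;
    for (ShapePtr const& s : shapes) { total += s->area(); }
    return total;
}

// A type added later: nothing above this line had to change.
struct Rectangle { double length, height; };
double area(Rectangle const r) { return r.length * r.height; }

// Hypothetical helper for this sketch: sums a square and a rectangle.
double demoTotalArea() {
    std::vector<ShapePtr> shapes;
    shapes.push_back(newShape(Square{2.0}));
    shapes.push_back(newShape(Rectangle{3.0, 4.0}));
    double const total = totalArea(shapes);
    assert(total > 0.0);
    return total;
}
```

`Rectangle` is picked up by `ShapeT` even though it is declared after the template, because the overload is resolved when `ShapeT<Rectangle>` is instantiated.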

Okay, so far so good, let's add a new function. We have two options.

The first is to modify Shape if it is in our power. This is source compatible, but not binary compatible.

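// 1. We need to extend Shape:
  virtual double perimeter() const = 0;

// 2. And its adapter: ShapeT
//    (the using-declaration un-hides the free functions, which must be
//    declared before ShapeT)
  virtual double perimeter() const { using ::perimeter; return perimeter(_shape); }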

// 3. And provide the method for each Shape (obviously)
double perimeter(Square const s);
double perimeter(Circle const c);
double perimeter(Rectangle const r);

It may seem that we fall into the Expression Problem here, but we don't. We needed to add the perimeter for each (already known) class because there is no way to automatically infer it; however it did not require editing each class either!

Therefore, the combination of an external interface and free functions lets us neatly (well, it is C++...) sidestep the issue.

sodraz noticed in comments that the addition of a function touched the original interface which may need to be frozen (provided by a 3rd party, or for binary compatibility issues).

The second option is therefore non-intrusive, at the cost of being slightly more verbose:

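class ExtendedShape: public Shape {
public:
  virtual double perimeter() const = 0;
protected:
  // As with Shape, the default constructor must be spelled out.
  ExtendedShape() {}
  ExtendedShape(ExtendedShape const& other): Shape(other) {}
  ExtendedShape& operator=(ExtendedShape const&) { return *this; }
};

typedef std::unique_ptr<ExtendedShape> ExtendedShapePtr;

template <typename T>
class ExtendedShapeT: public ExtendedShape {
public:
   explicit ExtendedShapeT(T const t): _data(t) {}

   // The using-declarations un-hide the free functions of the same name.
   virtual double area() const { using ::area; return area(_data); }
   virtual double perimeter() const { using ::perimeter; return perimeter(_data); }
private:
  T _data;
};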

template <typename T>
ExtendedShapePtr newExtendedShape(T t) { return ExtendedShapePtr(new ExtendedShapeT<T>(t)); }

Then define the perimeter function for every shape we would like to use with ExtendedShape.

The old code, compiled to work against Shape, still works. It does not need the new function anyway.

The new code can make use of the new functionality, and still interface painlessly with the old code. (*)

There is only one slight issue: if the old code returns a ShapePtr, we do not know whether the shape actually has a perimeter function (note: if the pointer is generated internally, it has not been generated with the newExtendedShape mechanism). This is the limitation of the design mentioned at the beginning. Oops :)

(*) Note: painlessly implies that you know who the owner is. A std::unique_ptr<Derived>& and a std::unique_ptr<Base>& are not compatible; however, a std::unique_ptr<Base> can be built from a std::unique_ptr<Derived>, and a Base* from a Derived*, so make sure your functions are clean ownership-wise and you're golden.
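The ownership note can be sketched as follows, assuming a single `Square` type with made-up function bodies. `areaOf` and `demoOwnership` are hypothetical names introduced only for this illustration, and the copy-control members are omitted for brevity. Borrowing works through a plain `Shape const&`, while transferring ownership to old code goes through a move from `ExtendedShapePtr` to `ShapePtr`.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

struct Square { double side; };
double area(Square const s) { return s.side * s.side; }
double perimeter(Square const s) { return 4.0 * s.side; }

class Shape {
public:
    virtual ~Shape() {}
    virtual double area() const = 0;
};
typedef std::unique_ptr<Shape> ShapePtr;

class ExtendedShape : public Shape {
public:
    virtual double perimeter() const = 0;
};
typedef std::unique_ptr<ExtendedShape> ExtendedShapePtr;

template <typename T>
class ExtendedShapeT : public ExtendedShape {
public:
    explicit ExtendedShapeT(T const t) : _data(t) {}
    virtual double area() const { using ::area; return area(_data); }
    virtual double perimeter() const { using ::perimeter; return perimeter(_data); }
private:
    T _data;
};

template <typename T>
ExtendedShapePtr newExtendedShape(T t) {
    return ExtendedShapePtr(new ExtendedShapeT<T>(t));
}

// "Old" code: written against Shape only.
double areaOf(Shape const& s) { return s.area(); }   // borrows, no ownership

double totalArea(std::vector<ShapePtr> const& shapes) {
    double total = 0.0;
    for (ShapePtr const& s : shapes) { total += s->area(); }
    return total;
}

// Hypothetical driver for this sketch.
double demoOwnership() {
    ExtendedShapePtr sq = newExtendedShape(Square{2.0});

    double const p = sq->perimeter();   // new code sees the new operation
    double const a = areaOf(*sq);       // ExtendedShape& binds to Shape const&

    std::vector<ShapePtr> shapes;
    shapes.push_back(std::move(sq));    // unique_ptr<Derived> -> unique_ptr<Base>
    assert(!sq);                        // ownership has moved into the vector

    // From here on the element is only known as a Shape: perimeter() is no
    // longer reachable statically, which is the limitation described above.
    return p + a + totalArea(shapes);
}
```

Passing an `ExtendedShapePtr&` where a `ShapePtr&` is expected would not compile, which is why the sketch either borrows through a reference to the object or moves the pointer.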

Someone intelligent (unfortunately I forgot who) once said about OOP in C++: The only reason for switch-ing over types (which is what all your suggestions propose) is fear of virtual functions. (That's para-paraphrasing.) Add virtual functions to your base class which derived classes can override, and you're set.
Now, I know there are cases where this is hard or unwieldy. For that we have the visitor pattern.

There are cases where one is better, and cases where the other is. Usually, the rule of thumb goes like this:

If you have a rather fixed set of operations, but keep adding types, use virtual functions.
Operations are hard to add to/remove from a big inheritance hierarchy, but new types are easy to add by simply having them override the appropriate virtual functions.
If you have a rather fixed set of types, but keep adding operations, use the visitor pattern.
Adding new types to a large set of visitors is a serious pain in the neck, but adding a new visitor to a fixed set of types is easy.

(If both change, you're doomed either way.)
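For the second rule of thumb, a minimal visitor sketch might look like the following. All class names here (`ShapeVisitor`, `AreaVisitor`, `PerimeterVisitor`) and the helper `demoVisitors` are hypothetical, and the shapes are polymorphic classes rather than the plain structs used earlier. Each new operation is a new visitor class, and no shape class has to be edited to add one.

```cpp
#include <cassert>
#include <memory>
#include <vector>

class Square;
class Rectangle;

// One visit() overload per type in the (fixed) set of shapes.
class ShapeVisitor {
public:
    virtual ~ShapeVisitor() {}
    virtual void visit(Square const&) = 0;
    virtual void visit(Rectangle const&) = 0;
};

class Shape {
public:
    virtual ~Shape() {}
    virtual void accept(ShapeVisitor& v) const = 0;
};

class Square : public Shape {
public:
    explicit Square(double s) : side(s) {}
    virtual void accept(ShapeVisitor& v) const { v.visit(*this); }
    double side;
};

class Rectangle : public Shape {
public:
    Rectangle(double l, double h) : length(l), height(h) {}
    virtual void accept(ShapeVisitor& v) const { v.visit(*this); }
    double length, height;
};

// First operation: accumulate areas.
class AreaVisitor : public ShapeVisitor {
public:
    AreaVisitor() : total(0.0) {}
    virtual void visit(Square const& s) { total += s.side * s.side; }
    virtual void visit(Rectangle const& r) { total += r.length * r.height; }
    double total;
};

// Second operation, added without touching Shape, Square or Rectangle.
class PerimeterVisitor : public ShapeVisitor {
public:
    PerimeterVisitor() : total(0.0) {}
    virtual void visit(Square const& s) { total += 4.0 * s.side; }
    virtual void visit(Rectangle const& r) { total += 2.0 * (r.length + r.height); }
    double total;
};

// Hypothetical driver: runs both visitors over the same shapes.
double demoVisitors() {
    std::vector<std::unique_ptr<Shape>> shapes;
    shapes.push_back(std::unique_ptr<Shape>(new Square(2.0)));
    shapes.push_back(std::unique_ptr<Shape>(new Rectangle(3.0, 4.0)));

    AreaVisitor areas;
    PerimeterVisitor perimeters;
    for (auto const& s : shapes) {
        s->accept(areas);
        s->accept(perimeters);
    }
    assert(areas.total > 0.0 && perimeters.total > 0.0);
    return areas.total + perimeters.total;
}
```

The flip side is visible in `ShapeVisitor`: adding a third shape means adding a `visit` overload there and implementing it in every existing visitor.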

Tags:

C++

Oop

C++11
