Choosing the right subclass to instantiate programmatically

One way of approaching it would be using a map and register some kind of factory function for each message type. This means that you get rid of the switch case and can add and remove messages dynamically.

The code would look something like:

// Create the map (most likely a member in a different class)
std::map<BaseMessage::Type, MessageCreator*> messageMap;
...

// Register some message types
// Note that you can add and remove messages at runtime here
messageMap[BaseMessage::MyMessageA] = new MessageCreatorT<BaseMessageA>();
messageMap[BaseMessage::MyMessageB] = new MessageCreatorT<BaseMessageB>();
...

// Handle a message
std::map<Type, MessageCreator*>::const_iterator it = messageMap.find(msgType);
if(it == messageMap.end()) {
    // Unknown message type
    beepHang();
}
// Now create the message
BaseMessage* msg = it->second.createMessage(data);

The MessageCreator class would look something like this:

class MessageCreator {
    public:
    virtual BaseMessage* createMessage(void* data, size_t len) const = 0;
};
template<class T> class MessageCreatorT : public MessageCreator {
    public:
    BaseMessage* createMessage(void* data, size_t len) const {
        T* newMessage = new T();
        newMessage.parse(data, len);
        return newMessage;
    }
};

It's a pretty basic question in fact (as you can imagine, you are definitely not the only one deserializing in C++).

What you are looking for is called Virtual Construction.

C++ does not define Virtual Construction, but it's easy to approximate it using the Prototype Design Pattern or using a Factory method.

I personnally prefer the Factory approach, for the reason that the Prototype one means having some kind of default instance that is replicated and THEN defined... the problem is that not all classes have a meaningful default, and for that matter, a meaningful Default Constructor.

The Factory approach is easy enough.

  • You need a common base class for the Messages, and another for the Parsers
  • Each Message has both a Tag and an associated Parser

Let's see some code:

// Framework
class Message
{
public:
  virtual ~Message();
};

class Parser
{
public:
  virtual ~Parser();
  virtual std::auto_ptr<Message> parse(std::istream& serialized) const;
};

// Factory of Messages
class MessageFactory
{
public:
  void register(std::string const& tag, Parser const& parser);
  std::auto_ptr<Message> build(std::string const& tag, std::istream& serialized) const;
private:
  std::map<std::string,Parser const*> m_parsers;
};

And with this framework (admittedly simple), some derived classes:

class MessageA: public Message
{
public:
  MessageA(int a, int b);
};

class ParserA: public Parser
{
public:
  typedef std::auto_ptr<MessageA> result_type;
  virtual result_type parse(std::istream& serialized) const
  {
    int a = 0, b = 0;
    char space = 0;
    std::istream >> a >> space >> b;
    // Need some error control there
    return result_type(new MessageA(a,b));
  }
};

And at last, the use:

int main(int argc, char* argv[])
{
  // Register the parsers
  MessageFactory factory;
  factory.register("A", ParserA());

  // take a file
  // which contains 'A 1 2\n'
  std::ifstream file = std::ifstream("file.txt");
  std::string tag;
  file >> tag;
  std::auto_ptr<Message> message = factory.parse(tag, file);

  // message now points to an instance of MessageA built by MessageA(1,2)
}

It works, I know for I use it (or a variation).

There are some things to consider:

  • You may be willing to make MessageFactory a singleton, this then allows it to be called at library load, and thus you can register your parsers by instantiating static variables. This is very handy if you don't want main to have to register every single parser type: locality > less dependencies.
  • The tags have to be shared. It is not unusual either for the tag to be served by a virtual method of the Message class (called tag).

Like:

class Message
{
public:
  virtual ~Message();
  virtual const std::string& tag() const = 0;
  virtual void serialize(std::ostream& out) const;
};
  • The logic for serialization has to be shared too, it is not unusual for an object to handle its own serialization/deserialization

Like:

class MessageA: public Message
{
public:
  static const std::string& Tag();
  virtual const std::string& tag() const;
  virtual void serialize(std::ostream& out) const;

  MessageA(std::istream& in);
};

template <class M>
class ParserTemplate: public Parser // not really a parser now...
{
public:
  virtual std::auto_ptr<M> parse(std::istream& in) const
  {
    return std::auto_ptr<M>(new M(in));
  }
};

What's great with templates is that it never stops to amaze me

class MessageFactory
{
public:
  template <class M>
  void register()
  {
    m_parsers[M::Tag()] = new ParserTemplate<M>();
  }
};

//skipping to registration
  factory.register<MessageA>();

Now isn't it pretty :) ?