How to write custom input stream in C++

boost (which you should have already if you're serious about C++), has a whole library dedicated to extending and customizing IO streams: boost.iostreams

In particular, it already has decompressing streams for a few popular formats (bzip2, gzlib, and zlib)

As you saw, extending streambuf may be an involving job, but the library makes it fairly easy to write your own filtering streambuf if you need one.


Don't, unless you want to die a terrible death of hideous design. IOstreams are the worst component of the Standard library - even worse than locales. The iterator model is much more useful, and you can convert from stream to iterator with istream_iterator.


The proper way to create a new stream in C++ is to derive from std::streambuf and to override the underflow() operation for reading and the overflow() and sync() operations for writing. For your purpose you'd create a filtering stream buffer which takes another stream buffer (and possibly a stream from which the stream buffer can be extracted using rdbuf()) as argument and implements its own operations in terms of this stream buffer.

The basic outline of a stream buffer would be something like this:

class compressbuf
    : public std::streambuf {
    std::streambuf* sbuf_;
    char*           buffer_;
    // context for the compression
public:
    compressbuf(std::streambuf* sbuf)
        : sbuf_(sbuf), buffer_(new char[1024]) {
        // initialize compression context
    }
    ~compressbuf() { delete[] this->buffer_; }
    int underflow() {
        if (this->gptr() == this->egptr()) {
            // decompress data into buffer_, obtaining its own input from
            // this->sbuf_; if necessary resize buffer
            // the next statement assumes "size" characters were produced (if
            // no more characters are available, size == 0.
            this->setg(this->buffer_, this->buffer_, this->buffer_ + size);
        }
        return this->gptr() == this->egptr()
             ? std::char_traits<char>::eof()
             : std::char_traits<char>::to_int_type(*this->gptr());
    }
};

How underflow() looks exactly depends on the compression library being used. Most libraries I have used keep an internal buffer which needs to be filled and which retains the bytes which are not yet consumed. Typically, it is fairly easy to hook the decompression into underflow().

Once the stream buffer is created, you can just initialize an std::istream object with the stream buffer:

std::ifstream fin("some.file");
compressbuf   sbuf(fin.rdbuf());
std::istream  in(&sbuf);

If you are going to use the stream buffer frequently, you might want to encapsulate the object construction into a class, e.g., icompressstream. Doing so is a bit tricky because the base class std::ios is a virtual base and is the actual location where the stream buffer is stored. To construct the stream buffer before passing a pointer to a std::ios thus requires jumping through a few hoops: It requires the use of a virtual base class. Here is how this could look roughly:

struct compressstream_base {
    compressbuf sbuf_;
    compressstream_base(std::streambuf* sbuf): sbuf_(sbuf) {}
};
class icompressstream
    : virtual compressstream_base
    , public std::istream {
public:
    icompressstream(std::streambuf* sbuf)
        : compressstream_base(sbuf)
        , std::ios(&this->sbuf_)
        , std::istream(&this->sbuf_) {
    }
};

(I just typed this code without a simple way to test that it is reasonably correct; please expect typos but the overall approach should work as described)

Tags:

C++

Iostream