Fast display of waveform in C/C++

After reading Peter Stock's answer, I've come up with the following scheme. I think it will allow display calculation about 500 times faster than the naive scheme and shouldn't add any noticeable cost to inserts or deletes. The memory overhead is less than 1%.

The sound data will be allocated in blocks of 131072 samples, so that inserts and deletes don't require the entire sound to be reallocated and copied. When the sound is first loaded, each block will be completely filled (except probably the last one). Inserts and deletes will lead to a kind of fragmentation. For simplicity, I will arrange for the start of each block to always contain valid sample data, and any gaps will be at the end of the block.

Each block has two look-up tables associated with it, one for max values and one for min. Each item in the look-up tables corresponds to 1024 samples.
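To make the layout concrete, here is one way the per-block bookkeeping could look. This is a rough sketch; `Block`, `kChunkSize`, and `rebuild_luts` are illustrative names of my own, not from any library:

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

constexpr std::size_t kBlockSize = 131072; // samples per block
constexpr std::size_t kChunkSize = 1024;   // samples per look-up entry

struct Block {
    std::vector<float> samples;   // up to kBlockSize samples; valid data at the front
    std::size_t valid = 0;        // number of valid samples (gaps are at the end)
    std::vector<float> max_lut;   // one max per kChunkSize samples
    std::vector<float> min_lut;   // one min per kChunkSize samples

    // Rebuild both look-up tables from the valid samples. A part-filled
    // final chunk just takes the min/max of the samples present.
    void rebuild_luts() {
        const std::size_t n_chunks = (valid + kChunkSize - 1) / kChunkSize;
        max_lut.assign(n_chunks, 0.0f);
        min_lut.assign(n_chunks, 0.0f);
        for (std::size_t c = 0; c < n_chunks; ++c) {
            const std::size_t begin = c * kChunkSize;
            const std::size_t end = std::min(begin + kChunkSize, valid);
            float mx = samples[begin];
            float mn = samples[begin];
            for (std::size_t i = begin + 1; i < end; ++i) {
                mx = std::max(mx, samples[i]);
                mn = std::min(mn, samples[i]);
            }
            max_lut[c] = mx;
            min_lut[c] = mn;
        }
    }
};
```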

The diagram below shows how to calculate the max value for one pixel width of the display. It shows a few blocks relevant to the calculation. It assumes there is no "fragmentation".

Display calculation for one pixel width (no fragmentation)
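The per-pixel query could be sketched like this (again illustrative names and signature, assuming a 1024-sample max table as above): raw samples are scanned for the partial chunks at either edge of the pixel's range, and the table answers for the whole chunks in between.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Max over samples [begin, end) of one block, using the chunk-level max
// table for fully covered chunks and raw samples at the partial edges.
float range_max(const std::vector<float>& samples,
                const std::vector<float>& max_lut,
                std::size_t begin, std::size_t end) {
    const std::size_t chunk = 1024;
    const std::size_t first_full = (begin + chunk - 1) / chunk; // first whole chunk
    const std::size_t last_full = end / chunk;                  // one past the last

    float result = samples[begin];
    if (first_full >= last_full) {
        // the range lies inside a single chunk: just scan raw samples
        for (std::size_t i = begin; i < end; ++i)
            result = std::max(result, samples[i]);
        return result;
    }
    for (std::size_t i = begin; i < first_full * chunk; ++i)    // leading edge
        result = std::max(result, samples[i]);
    for (std::size_t c = first_full; c < last_full; ++c)        // whole chunks
        result = std::max(result, max_lut[c]);
    for (std::size_t i = last_full * chunk; i < end; ++i)       // trailing edge
        result = std::max(result, samples[i]);
    return result;
}
```

A pixel spanning many chunks then costs at most ~2 × 1024 raw reads plus one table read per chunk, rather than a full scan.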

After an insert, the situation is slightly more complicated. Two blocks now have invalid regions at their ends, so some entries in the max look-up table correspond to a part-empty region of samples. The values for these entries are found by simply taking the max of the samples that are present.

Display calculation for one pixel width (with fragmentation)


When the zoom is at the point where you have multiple samples per pixel, it isn't worth accurately calculating the mean sample value for each pixel. The user can't align the GUI tooling precisely at that level of zoom, so there's no benefit. The user just needs a qualitative view.

I would just select one sample per screen pixel for the window area, skipping over the unnecessary samples.

Something like this completely untested code:

#include <vector>

std::vector<double> samples(1024 * 1024); // sample values in [-1.0, 1.0]

int window_x = 1024; // window width in pixels
int window_y = 768;  // window height in pixels

// visit every window pixel column
for (int x = 0; x < window_x; ++x)
{
    // select the sample corresponding to screen column x
    double s = samples[(static_cast<std::size_t>(x) * samples.size()) / window_x];

    // half-height of the line for this sample value
    int y = static_cast<int>((window_y / 2) * s);

    // draw a vertical line centred on the waveform axis
    // (gd stands in for whatever graphics context you are drawing with)
    gd.draw_line(x, (window_y / 2) - y, x, (window_y / 2) + y);
}

Obviously you also need to account for window scrolling etc...


Maybe you could use the mip-mapping technique from graphics, trading more memory for speed?

If you have 32 samples, maintain a cache of versions zoomed out x2, x4, x8, ... Storing this data takes roughly the same space again as the original data (16 + 8 + 4 + 2 + 1 = 31 entries).

A visual guide, with . representing a stored data point (min/max sample value) and _ the samples covered by the previous .:

1st level: ................
2nd level: ._._._._._._._._
3rd level: .___.___.___.___
4th level: ._______._______
5th level: ._______________

Then just query the appropriate level mip-map for the zoom level.
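Building the whole pyramid could be sketched like this (`build_mipmaps` and `MinMax` are illustrative names of my own). The first level holds one (min, max) pair per two raw samples; each further level merges pairs, halving the resolution, until a single pair covers everything:

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

using MinMax = std::pair<float, float>;

std::vector<std::vector<MinMax>> build_mipmaps(const std::vector<float>& samples) {
    std::vector<std::vector<MinMax>> levels;

    // 1st level: one (min, max) pair per two raw samples
    std::vector<MinMax> cur;
    for (std::size_t i = 0; i < samples.size(); i += 2) {
        const float a = samples[i];
        const float b = (i + 1 < samples.size()) ? samples[i + 1] : a;
        cur.push_back({std::min(a, b), std::max(a, b)});
    }

    // each further level merges adjacent pairs of the level below
    while (!cur.empty()) {
        levels.push_back(cur);
        if (cur.size() == 1) break;
        std::vector<MinMax> next;
        for (std::size_t i = 0; i < cur.size(); i += 2) {
            MinMax m = cur[i];
            if (i + 1 < cur.size()) {
                m.first = std::min(m.first, cur[i + 1].first);
                m.second = std::max(m.second, cur[i + 1].second);
            }
            next.push_back(m);
        }
        cur = std::move(next);
    }
    return levels;
}
```

For 32 samples this yields levels of 16, 8, 4, 2 and 1 entries, matching the five levels in the diagram above.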

Yes, you would have to re-create the mip-map cache (or part of it) when you insert/remove samples.

But maybe the memory usage makes this not appropriate for you?


Edit

If adding and removing are frequent operations that make re-calculating the cache undesirable (and you want accurate downsampling over intervals rather than just at single points), you could change the mip-mapping approach: store the data aligned to local min/max sample points rather than to a time-based grid.

Using --------|-------- to denote a local min/max over an interval, here's a pictorial representation:

                             --------|--------
   --------|--------
                   --------|--------
                                     --------|--
------|--------
                                     .
           .                        . .
.     .   . .   .          .       .   .     .
 . ... . .   . . .   . .. . .     .     .   . .
  .     .     .   . . .  .   .   .       . .   .
                   .          . .         .
                               .
--------|--------
           --------|--------
                                  --------|-----
                       --------|--------

Then adding and removing only requires a re-calculation of the immediate local areas at the start and end of the added/removed section.

You'll probably want to index the local min/max values so you don't need to do a lot of searching. It's a more complex scheme to implement - maybe not worth it for you?