co_await appears to be suboptimal?

The "coroutine" system defined by the Coroutine TS is designed to handle asynchronous functions which:

  1. Return a future-like object (an object which represents a delayed return value).
  2. Let that future-like object be associated with a continuation function.

async_foo doesn't fulfill these requirements. It doesn't return a future-like object; it "returns" a value via a continuation function. And this continuation is passed as a parameter, rather than being something you attach to the object the function returns.

By the time the co_await happens at all, the potentially asynchronous process that generated the future is expected to have already started. Or at least, the co_await machinery makes it possible for it to have started.

Your proposed version loses out on the await_ready feature, which is what allows co_await to handle potentially-asynchronous processes. Between the time the future is generated and the time await_ready is called, the process may have finished. If it has, there is no need to schedule the resumption of the coroutine; it should simply continue right here, on this thread.
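To make the role of await_ready concrete, here is a minimal sketch that emulates the order of calls co_await makes on an awaiter, using plain functions instead of a real coroutine. MaybeReady and emulate_co_await are hypothetical names invented for this illustration, and a std::function stands in for coroutine_handle<>; this is not the actual compiler-generated code.

```cpp
#include <functional>

// Hypothetical awaiter whose value may or may not be available already.
struct MaybeReady {
  bool ready;
  int value;
  int suspend_calls = 0;

  bool await_ready() const noexcept { return ready; }

  // A std::function stands in for coroutine_handle<> in this sketch.
  void await_suspend(std::function<void()> resume) {
    ++suspend_calls;
    resume();  // a real awaiter would store this and invoke it later
  }

  int await_resume() const { return value; }
};

// Emulates the call order of co_await: ask await_ready first, and only
// go through await_suspend when the value is not there yet.
template <class Awaiter, class Rest>
void emulate_co_await(Awaiter& aw, Rest rest) {
  if (aw.await_ready()) {
    rest(aw.await_resume());  // fast path: continue on this thread
  } else {
    aw.await_suspend([&aw, rest] { rest(aw.await_resume()); });
  }
}
```

When await_ready returns true, await_suspend is never called, so nothing has to be scheduled.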

If that minor stack inefficiency bothers you, then you would have to do things the way the Coroutine TS wants you to.

The general way to handle this is for coro_foo to call async_foo directly and return a future-like object with a .then-like mechanism. Your problem is that async_foo itself doesn't have a .then-like mechanism, so you have to create one.

That means coro_foo must pass async_foo a functor that stores a coroutine_handle<>, one that can be updated by the future's continuation mechanism. Of course, you'll also need synchronization primitives. If the handle has been initialized by the time the functor has been executed, then the functor calls it, resuming the coroutine. If the functor completes without resuming a coroutine, the functor will set a variable to let the await machinery know that the value is ready.

Since the handle and this variable are shared between the await machinery and the functor, you'll need to ensure synchronization between the two. That's a fairly complex thing, but it's whatever .then-style machinery requires.
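One possible shape for that synchronization, sketched without any coroutine machinery: a single atomic flag decides which side runs the continuation. Handshake, complete, and try_suspend are hypothetical names, the result is fixed to int for brevity, and a std::function again stands in for coroutine_handle<>.

```cpp
#include <atomic>
#include <functional>
#include <utility>

// Hypothetical state shared by the completion functor and the awaiting side.
struct Handshake {
  std::atomic<bool> flag{false};
  std::function<void()> continuation;  // stand-in for coroutine_handle<>
  int result = 0;

  // Called by the asynchronous operation once the value is available.
  void complete(int value) {
    result = value;
    // exchange returns the previous flag. true means the awaiting side has
    // already stored its continuation, so this side has to run it.
    if (flag.exchange(true)) continuation();
  }

  // Called by the awaiting side. Returns true if it has to wait for
  // complete() to run the continuation, false if the value is already there.
  bool try_suspend(std::function<void()> k) {
    continuation = std::move(k);
    return !flag.exchange(true);
  }
};
```

Whichever side reaches exchange second sees true and takes responsibility for continuing, so the continuation runs exactly once or not at all (when the value was ready first and the caller can simply carry on).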

Or you could just live with the minor inefficiency.


The current design has an important feature: co_await takes a general expression, not just a call expression.

This allows us to write code like this:

auto f = coro_1();
co_await coro_2();
co_await f;

We can run two or more asynchronous tasks in parallel, and then wait for both of them.

Consequently, the implementation of coro_1 should start its work in its call, and not in await_suspend.

This also means that there should be pre-allocated memory where coro_1 can put its result, and from which it can take the coroutine_handle.

We can use a non-copyable Awaitable and guaranteed copy elision.
async_foo would be called from the constructor of Awaitable:

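auto coro_foo(A& a, B& b, C& c, X& x) /* -> Y */ {
  struct Awaitable {
    Awaitable(A& a, B& b, C& c, X& x) : x_(&x) {
      // The callback captures this, so the Awaitable must never be copied or moved.
      async_foo(a, b, c, [this](X& x, Y& y){
        *x_ = std::move(x);
        y_ = &y;  // y must stay valid until await_resume() runs
        // exchange returns true if await_suspend already ran, i.e. the
        // coroutine is suspended and has to be resumed from here.
        if (done_.exchange(true)) {
          h_.resume();  // Coroutine resumes inside of resume()
        }
      });
    }
    Awaitable(const Awaitable&) = delete;
    Awaitable& operator=(const Awaitable&) = delete;
    bool await_ready() const noexcept {
      return done_;
    }
    bool await_suspend(coroutine_handle<> h) {
      h_ = h;
      // Returns false (do not suspend) if the callback has already finished.
      return !done_.exchange(true);
    }
    Y await_resume() {
      return std::move(*y_);
    }
    atomic<bool> done_{false};
    coroutine_handle<> h_;
    X* x_;
    Y* y_ = nullptr;
  };
  return Awaitable(a, b, c, x);
}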