Using async/await with DataReader? (without intermediate buffers!)
I want to make asynchronous I/O calls (using async/await), but:
- Without a dependency on Dataflow (like in this answer)
- Without intermediate buffers (not like this answer)
- The projector function should be passed as an argument (not like this answer)
You may want to check Stephen Toub's "Tasks, Monads, and LINQ" for some great ideas on how to process asynchronous data sequences.
It's not (yet) possible to combine yield and await, but I'm going to be a literalist here: the quoted requirements didn't list IEnumerable and LINQ. So, here's a possible solution shaped as two coroutines (almost untested).
Data producer routine (corresponds to IEnumerable with yield):
    // Requires: System, System.Data, System.Data.SqlClient, System.Threading.Tasks
    public async Task GetSomeDataAsync<T>(
        string sql, Func<IDataRecord, T> projector, ProducerConsumerHub<T> hub)
    {
        using (SqlConnection _conn = new SqlConnection(@"Data Source=..."))
        {
            using (SqlCommand _cmd = new SqlCommand(sql, _conn))
            {
                await _conn.OpenAsync();
                _cmd.CommandTimeout = 100000;
                using (var rdr = await _cmd.ExecuteReaderAsync())
                {
                    // hand each projected row to the consumer, one at a time
                    while (await rdr.ReadAsync())
                        await hub.ProduceAsync(projector(rdr));
                }
            }
        }
    }
Data consumer routine (corresponds to foreach or a LINQ expression):
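    public async Task ConsumeSomeDataAsync(string sql)
    {
        var hub = new ProducerConsumerHub<IDataRecord>();
        // The identity projector passes the live reader itself; a real
        // projector would normally copy the fields it needs from each row.
        var producerTask = GetSomeDataAsync(sql, rdr => rdr, hub);
        while (true)
        {
            var nextItemTask = hub.ConsumeAsync();
            // wake up on either the next item or the end of the sequence
            await Task.WhenAny(producerTask, nextItemTask);
            if (nextItemTask.IsCompleted)
            {
                // process the next data item
                Console.WriteLine(await nextItemTask);
            }
            if (producerTask.IsCompleted)
            {
                // process the end of sequence
                // (awaiting also rethrows any producer exception)
                await producerTask;
                break;
            }
        }
    }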
Coroutine execution helper (can also be implemented as a pair of custom awaiters):
    public class ProducerConsumerHub<T>
    {
        // completed by the consumer once it has taken the current item
        TaskCompletionSource<Empty> _consumer = new TaskCompletionSource<Empty>();
        // completed by the producer with the next item
        TaskCompletionSource<T> _producer = new TaskCompletionSource<T>();

        // TODO: make thread-safe
        public async Task ProduceAsync(T data)
        {
            _producer.SetResult(data);
            // wait until the consumer has taken the item, then re-arm
            await _consumer.Task;
            _consumer = new TaskCompletionSource<Empty>();
        }

        public async Task<T> ConsumeAsync()
        {
            var data = await _producer.Task;
            // re-arm for the next item, then release the producer
            _producer = new TaskCompletionSource<T>();
            _consumer.SetResult(Empty.Value);
            return data;
        }

        struct Empty { public static readonly Empty Value = default(Empty); }
    }
This is just an idea. It might be overkill for a simple task like this, and it could be improved in some areas (such as thread safety, race conditions, and handling the end of the sequence without touching producerTask). Yet it illustrates how asynchronous data retrieval and processing could be decoupled.
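As a rough illustration of two of the improvements listed above (thread safety and an explicit end-of-sequence signal), here is a minimal sketch that swaps the TaskCompletionSource pair for two SemaphoreSlim instances. It is not the hub shown earlier: the names HandoffHub, Complete, and HandoffDemo are hypothetical, and it assumes exactly one producer and one consumer.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical variant of the hub: one producer, one consumer, no queue.
public sealed class HandoffHub<T>
{
    private readonly SemaphoreSlim _itemReady = new SemaphoreSlim(0);
    private readonly SemaphoreSlim _itemTaken = new SemaphoreSlim(0);
    private T _item;
    private bool _completed;

    // Publishes one item and waits until the consumer has taken it.
    public async Task ProduceAsync(T item)
    {
        _item = item;
        _itemReady.Release();
        await _itemTaken.WaitAsync();
    }

    // Signals the end of the sequence; call once, after the last ProduceAsync.
    public void Complete()
    {
        _completed = true;
        _itemReady.Release();
    }

    // Returns (true, item) for each item, then (false, default) at the end.
    public async Task<(bool HasItem, T Item)> ConsumeAsync()
    {
        await _itemReady.WaitAsync();
        if (_completed)
        {
            _itemReady.Release(); // keep later calls from blocking
            return (false, default(T));
        }
        var item = _item;
        _item = default(T);
        _itemTaken.Release();
        return (true, item);
    }
}

public static class HandoffDemo
{
    public static async Task<List<int>> RunAsync()
    {
        var hub = new HandoffHub<int>();
        var producer = Task.Run(async () =>
        {
            try
            {
                for (var i = 1; i <= 3; i++)
                    await hub.ProduceAsync(i);
            }
            finally
            {
                hub.Complete(); // always unblock the consumer
            }
        });

        var seen = new List<int>();
        while (true)
        {
            var (hasItem, item) = await hub.ConsumeAsync();
            if (!hasItem) break;
            seen.Add(item);
        }
        await producer; // rethrows any producer exception
        return seen;
    }
}
```

With this shape the consumer loop no longer needs Task.WhenAny: it stops when ConsumeAsync reports that no item is left, and the producer task is awaited only to surface exceptions. The tuple return type requires C# 7.0 or later.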
This Medium article describes another solution, which is to use the Dasync/AsyncEnumerable library.
The library is open source, available on NuGet and GitHub, and provides a readable syntax for IAsyncEnumerable that can be used now, until C# 8.0 comes out and provides its own implementation and language support in the form of async ... yield return and await foreach.
(I have no connection with the library; I came across it as a potentially very useful solution to what I think is the same problem as yours, on a project I'm developing.)
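For comparison, assuming a compiler with C# 8.0 (or later) support is available, the built-in async ... yield return and await foreach mentioned above might be used roughly like this. This is only a sketch: AsyncRows, ReadAllAsync, and AsyncRowsDemo are hypothetical names, and an in-memory DataTable stands in for a real query so the example can run without a database.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Data.Common;
using System.Threading.Tasks;

public static class AsyncRows
{
    // Streams projected rows one at a time; no intermediate list is built.
    public static async IAsyncEnumerable<T> ReadAllAsync<T>(
        DbDataReader reader, Func<IDataRecord, T> projector)
    {
        while (await reader.ReadAsync())
            yield return projector(reader);
    }
}

public static class AsyncRowsDemo
{
    public static async Task<List<int>> RunAsync()
    {
        // In-memory stand-in for a real query result (demo assumption).
        var table = new DataTable();
        table.Columns.Add("Id", typeof(int));
        table.Rows.Add(1);
        table.Rows.Add(2);

        var ids = new List<int>();
        using (var reader = table.CreateDataReader())
        {
            // Each row is projected and consumed before the next one is read.
            await foreach (var id in AsyncRows.ReadAllAsync(reader, r => r.GetInt32(0)))
                ids.Add(id);
        }
        return ids;
    }
}
```

With a real database, the reader would come from ExecuteReaderAsync on a command, as in the producer routine of the first answer; the projector is still passed in as an argument.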