non-blocking IO vs async IO and implementation in Java
So what is actually "non-blocking async IO"?
To answer that, you must first understand that there's no such thing as blocking async I/O. The very concept of asynchronism dictates that there's no waiting, no blocking, no delay. When you see non-blocking asynchronous I/O, the non-blocking bit only serves to further qualify the async adjective in that term. So effectively, non-blocking async I/O might be a bit of a redundancy.
There are mainly two kinds of I/O: synchronous and asynchronous. Synchronous I/O blocks the current thread of execution until processing is complete, while asynchronous I/O doesn't block the current thread of execution, instead handing the operation off (ultimately to the OS kernel) for further processing. The submitting code is then notified when the submitted task is complete.
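To make the contrast concrete, here is a minimal sketch that reads the same small temporary file both ways. The class and method names (SyncVsAsyncRead, readSync, readAsync, demo) are made up for illustration, and the 64-byte buffer assumes the file is tiny.

```java
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousFileChannel;
import java.nio.channels.CompletionHandler;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicReference;

class SyncVsAsyncRead {

    // Synchronous: the calling thread waits inside readAllBytes until the data is there.
    static String readSync(Path file) throws Exception {
        return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
    }

    // Asynchronous: read() returns at once and the handler is invoked later,
    // on a thread from the channel's thread pool.
    static String readAsync(Path file) throws Exception {
        final AtomicReference<String> result = new AtomicReference<>();
        final CountDownLatch done = new CountDownLatch(1);
        try (AsynchronousFileChannel channel =
                 AsynchronousFileChannel.open(file, StandardOpenOption.READ)) {
            ByteBuffer buffer = ByteBuffer.allocate(64); // assumption: the file fits in 64 bytes
            channel.read(buffer, 0, buffer, new CompletionHandler<Integer, ByteBuffer>() {
                @Override
                public void completed(Integer bytesRead, ByteBuffer filled) {
                    filled.flip();
                    result.set(StandardCharsets.UTF_8.decode(filled).toString());
                    done.countDown();
                }

                @Override
                public void failed(Throwable cause, ByteBuffer unused) {
                    result.set("failed: " + cause);
                    done.countDown();
                }
            });
            // A real program would do other work here; this demo just waits for the handler.
            done.await();
        }
        return result.get();
    }

    // Writes a temporary file, reads it both ways, and joins the two results with '|'.
    static String demo() {
        try {
            Path file = Files.createTempFile("sync-vs-async", ".txt");
            try {
                Files.write(file, "hello".getBytes(StandardCharsets.UTF_8));
                return readSync(file) + "|" + readAsync(file);
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Both calls produce the same text; the difference is only in which thread waits for it.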
Asynchronous Channel Groups
The concept of async channels in Java is backed by asynchronous channel groups. An async channel group gathers a number of channels so that they can share resources, chiefly a thread pool whose threads dispatch the results of completed I/O operations to the completion handlers. Consumers of the async API that open a channel without naming a group get one bound to a default group (the JVM creates one). Ultimately, async channel groups are backed by (surprise) thread pools. Also, asynchronous channels are thread-safe.
The initial size of the thread pool that backs the default channel group is configured by the following JVM system property
java.nio.channels.DefaultThreadPool.initialSize
which, given an integer value, sets the initial number of threads in that pool. Otherwise, the default group is created and maintained transparently to the developer.
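If you would rather not rely on the default group, you can create one yourself and bind channels to it. The following is a small sketch under stated assumptions: the class name ExplicitChannelGroup, the pool size of 4, and the 5-second wait are arbitrary choices for illustration.

```java
import java.io.IOException;
import java.nio.channels.AsynchronousChannelGroup;
import java.nio.channels.AsynchronousSocketChannel;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class ExplicitChannelGroup {

    // Opens a channel bound to a caller-created group, then shuts the group down.
    // Returns true if the channel was open and the group reports that it is shut down.
    static boolean demo() {
        try {
            // A group backed by a fixed pool of 4 threads (4 is an arbitrary choice).
            AsynchronousChannelGroup group =
                AsynchronousChannelGroup.withFixedThreadPool(4, Executors.defaultThreadFactory());

            boolean wasOpen;
            // Completion handlers of this channel would run on the group's threads.
            try (AsynchronousSocketChannel channel = AsynchronousSocketChannel.open(group)) {
                wasOpen = channel.isOpen();
            }

            // The group owns non-daemon threads, so shut it down when finished.
            group.shutdown();
            group.awaitTermination(5, TimeUnit.SECONDS);
            return wasOpen && group.isShutdown();
        } catch (IOException | InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

To tune the default group instead, the property above can be passed on the command line, for example -Djava.nio.channels.DefaultThreadPool.initialSize=8 (the value 8 is only an example).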
And how can all of them be implemented in Java?
Well, I'm glad you asked. Here's an example of an AsynchronousSocketChannel (used to open a non-blocking client Socket to a listening server). This sample is an excerpt from Apress Pro Java NIO.2, commented by me:
//Create an Asynchronous channel. No connection has actually been established yet
AsynchronousSocketChannel asynchronousSocketChannel = AsynchronousSocketChannel.open();
/** Connect to an actual server on the given port and address.
connect() returns a Future, the basis of the Future-style
asynchronous operations in Java. Its type is Future<Void>
because a successful socket connection produces no value.
Calling get() on it waits here until the connection is established.
*/
Void connect = asynchronousSocketChannel.connect(new InetSocketAddress("127.0.0.1", 5000)).get();
//Allocate data structures to use to communicate over the wire
ByteBuffer helloBuffer = ByteBuffer.wrap("Hello !".getBytes());
//Send the message
Future<Integer> successfullyWritten = asynchronousSocketChannel.write(helloBuffer);
//Do some stuff here. The point here is that asynchronousSocketChannel.write()
//returns almost immediately, not waiting to actually finish writing
//the hello to the channel before returning control to the currently executing thread
doSomethingElse();
//now you can come back and check if it was all written (or not)
System.out.println("Bytes written "+successfullyWritten.get());
EDIT: I should mention that support for Async NIO came in JDK 1.7
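The excerpt above needs a server listening on port 5000. If you just want to try it out, here is a self-contained sketch that starts its own server on the loopback interface and sends the same message to it. The class name LoopbackHello and the use of an ephemeral port (port 0) are my own choices, not part of the book excerpt.

```java
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousServerSocketChannel;
import java.nio.channels.AsynchronousSocketChannel;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.Future;

class LoopbackHello {

    // Sends "Hello !" from a client channel to a server channel in the same JVM.
    // Returns "<bytes written>:<text received by the server>".
    static String demo() {
        byte[] message = "Hello !".getBytes(StandardCharsets.UTF_8);
        try (AsynchronousServerSocketChannel server = AsynchronousServerSocketChannel.open()
                 .bind(new InetSocketAddress("127.0.0.1", 0)); // port 0: let the OS pick one
             AsynchronousSocketChannel client = AsynchronousSocketChannel.open()) {

            // accept() returns immediately; the Future completes once a client connects.
            Future<AsynchronousSocketChannel> pendingAccept = server.accept();
            client.connect(server.getLocalAddress()).get();

            // write() also returns immediately with a Future for the byte count.
            Future<Integer> written = client.write(ByteBuffer.wrap(message));

            try (AsynchronousSocketChannel peer = pendingAccept.get()) {
                ByteBuffer received = ByteBuffer.allocate(64);
                // Keep reading until the whole message has arrived (or the stream ends).
                while (received.position() < message.length && peer.read(received).get() >= 0) {
                    // nothing else to do in this demo
                }
                received.flip();
                return written.get() + ":" + StandardCharsets.UTF_8.decode(received);
            }
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Everything here goes through Future.get(), so the demo thread does wait at each step; the point is only that each call hands back a Future instead of blocking inside the I/O call itself.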
Non blocking IO is when the call to perform IO returns immediately, and does not block your thread.
The only way to know if the IO is done is to poll its status or block. Think of it as a Future. You start an IO operation, and it returns you a Future. You can call isDone() on it to check if it's done; if it is, do what you want with it, otherwise keep doing other stuff until the next time you want to check if it's done. Or, if you're out of things to do, you can call get on it, which will block until it's done.
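As a rough illustration of that polling style, the sketch below starts a read on a temporary file and checks isDone() in a loop while "doing other stuff" (here just a counter). PollingRead and its methods are hypothetical names, and the file content is made up.

```java
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousFileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.Future;

class PollingRead {

    // Starts a read, polls the returned Future, and collects the result once it is done.
    static String readWithPolling(Path file) throws Exception {
        try (AsynchronousFileChannel channel =
                 AsynchronousFileChannel.open(file, StandardOpenOption.READ)) {
            ByteBuffer buffer = ByteBuffer.allocate(64); // assumption: the file fits in 64 bytes
            Future<Integer> pending = channel.read(buffer, 0); // returns immediately

            long otherWork = 0;
            while (!pending.isDone()) {
                otherWork++;     // stand-in for useful work between status checks
                Thread.yield();
            }

            pending.get();       // already done, so this does not block
            buffer.flip();
            return StandardCharsets.UTF_8.decode(buffer).toString();
        }
    }

    // Creates a temporary file, reads it back with polling, and cleans up.
    static String demo() {
        try {
            Path file = Files.createTempFile("polling-read", ".txt");
            try {
                Files.write(file, "polled".getBytes(StandardCharsets.UTF_8));
                return readWithPolling(file);
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```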
Async IO is when the call to perform IO notifies you it is done through an event, not through its return value.
This can be blocking or non-blocking.
Blocking Async IO
What is meant by blocking async IO is that the call to perform IO is a normal blocking call, but the thing you called wrapped that call inside a thread which will block until the IO is done and then delegate the handling of the result of the IO to your callback. That is, there is still a thread lower down the stack which is blocked on the IO, but your thread isn't.
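Here is a minimal sketch of that idea, assuming a plain ExecutorService as the "thing you called". The helper readAsync and the class BlockingAsyncRead are hypothetical; a piped stream stands in for a slow source so that the pool thread really does block until data shows up.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;

class BlockingAsyncRead {

    // Hypothetical helper: performs an ordinary blocking read on a pool thread
    // and passes the result to the callback. The caller's thread never blocks.
    static void readAsync(InputStream in, ExecutorService pool, Consumer<String> callback) {
        pool.execute(() -> {
            try {
                byte[] data = new byte[64];
                int n = in.read(data); // the pool thread is blocked here until data arrives
                callback.accept(n < 0 ? "" : new String(data, 0, n, StandardCharsets.UTF_8));
            } catch (IOException e) {
                callback.accept("failed: " + e);
            }
        });
    }

    static String demo() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try (PipedOutputStream out = new PipedOutputStream();
             PipedInputStream in = new PipedInputStream(out)) {
            AtomicReference<String> received = new AtomicReference<>();
            CountDownLatch done = new CountDownLatch(1);

            // Returns immediately; the read has been delegated to the pool thread.
            readAsync(in, pool, text -> {
                received.set(text);
                done.countDown();
            });

            // The calling thread is free, so it can produce the data itself.
            out.write("late data".getBytes(StandardCharsets.UTF_8));
            out.flush(); // wakes up the reader waiting on the pipe

            done.await(5, TimeUnit.SECONDS);
            return received.get();
        } catch (IOException | InterruptedException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

The cost of this style is visible in the code: every in-flight read occupies one pool thread for as long as the read takes.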
Non-blocking Async IO
This is actually the more common one, and it means that the non-blocking IO does not need to be polled for its status, as with standard non-blocking IO; instead it will call your callback when it's done. As opposed to blocking async IO, this one has no threads blocked anywhere down the stack, thus it's faster and uses fewer resources, as the asynchronous behavior is managed without blocking threads.
You can think of it as a CompletableFuture. It requires that your program has some form of async event framework, which can be multi-threaded or not. So it's possible the callback is executed in another thread, or that it is scheduled for execution on an existing thread once the current task is done.
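To show what the CompletableFuture analogy might look like in practice, here is a sketch that bridges the callback-style read of an AsynchronousSocketChannel to a CompletableFuture. The adapter readAsync and the class CompletableSocketRead are hypothetical, the loopback server exists only to make the example runnable, and whether any thread blocks underneath depends on the JVM and platform implementation.

```java
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousServerSocketChannel;
import java.nio.channels.AsynchronousSocketChannel;
import java.nio.channels.CompletionHandler;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

class CompletableSocketRead {

    // Hypothetical adapter: completes the returned future from the channel's
    // completion handler, so nobody has to poll for the result.
    static CompletableFuture<String> readAsync(AsynchronousSocketChannel channel) {
        CompletableFuture<String> promise = new CompletableFuture<>();
        ByteBuffer buffer = ByteBuffer.allocate(64);
        channel.read(buffer, null, new CompletionHandler<Integer, Void>() {
            @Override
            public void completed(Integer bytesRead, Void unused) {
                buffer.flip();
                promise.complete(StandardCharsets.UTF_8.decode(buffer).toString());
            }

            @Override
            public void failed(Throwable cause, Void unused) {
                promise.completeExceptionally(cause);
            }
        });
        return promise;
    }

    static String demo() {
        try (AsynchronousServerSocketChannel server = AsynchronousServerSocketChannel.open()
                 .bind(new InetSocketAddress("127.0.0.1", 0)); // port 0: let the OS pick one
             AsynchronousSocketChannel client = AsynchronousSocketChannel.open()) {

            Future<AsynchronousSocketChannel> pendingAccept = server.accept();
            client.connect(server.getLocalAddress()).get();

            try (AsynchronousSocketChannel peer = pendingAccept.get()) {
                // Register what should happen with the data before any data exists.
                CompletableFuture<String> upper = readAsync(peer).thenApply(String::toUpperCase);

                client.write(ByteBuffer.wrap("callback".getBytes(StandardCharsets.UTF_8))).get();

                // Waiting here is only so the demo can return the value.
                return upper.get(5, TimeUnit.SECONDS);
            }
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```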
I explain the distinction more thoroughly here.
I see this is an old question, but I think something was missed here, that @nickdu attempted to point out but wasn't quite clear.
There are four types of IO pertinent to this discussion:
Blocking IO
Non-Blocking IO
Asynchronous IO
Asynchronous Non-Blocking IO
The confusion arises, I think, because of ambiguous definitions. So let me attempt to clarify that.
First, let's talk about IO. IO operations can be either blocking or non-blocking, and the difference is most apparent when the IO is slow. This has nothing to do with threads; it has to do with the interface to the operating system. When I ask the OS for an IO operation I have the choice of waiting for all the data to be ready (blocking), or getting what is available right now and moving on (non-blocking). The default is blocking IO. It is much easier to write code using blocking IO as the path is much clearer. However, your code has to stop and wait for IO to complete. Non-Blocking IO requires interfacing with the IO libraries at a lower level, using select and read/write instead of the higher level libraries that provide convenient operations. Non-Blocking IO also implies that you have something you need to work on while the OS works on doing the IO. This might be multiple IO operations or computation on the IO that has completed.
Blocking IO - The application waits for the OS to gather all the bytes to complete the operation or reach the end before continuing. This is the default. To be more clear for the very technical, the system call that initiates the IO will install a signal handler waiting for a processor interrupt that will occur when the IO operation makes progress. Then the system call will begin a sleep which suspends operation of the current process for a period of time, or until the processor interrupt occurs.
Non-Blocking IO - The application tells the OS it only wants what bytes are available right now, and moves on while the OS concurrently gathers more bytes. The code uses select to determine what IO operations have bytes available. In this case the system call will again install a signal handler, but rather than sleep, it will associate the signal handler with the file handle, and immediately return. The process will become responsible for periodically checking the file handle for the interrupt flag having been set. This is usually done with a select call.
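In Java, the select-based style described above is exposed through java.nio's Selector. The sketch below uses an in-process Pipe so that it needs no network; the class name SelectorRead and the 5-second select timeout are assumptions made for the example.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.charset.StandardCharsets;

class SelectorRead {

    // Returns "<bytes from the early read>:<ready key count>:<text read after select>".
    static String demo() {
        try (Selector selector = Selector.open()) {
            Pipe pipe = Pipe.open();
            try (Pipe.SourceChannel source = pipe.source();
                 Pipe.SinkChannel sink = pipe.sink()) {

                // Switch the read end to non-blocking mode and register interest in reads.
                source.configureBlocking(false);
                source.register(selector, SelectionKey.OP_READ);

                ByteBuffer buffer = ByteBuffer.allocate(64);

                // Nothing has been written yet: a non-blocking read returns 0 right away
                // instead of waiting.
                int early = source.read(buffer);

                sink.write(ByteBuffer.wrap("ready".getBytes(StandardCharsets.UTF_8)));

                // Ask which registered channels can make progress (waits up to 5 seconds).
                int readyKeys = selector.select(5000);
                for (SelectionKey key : selector.selectedKeys()) {
                    if (key.isReadable()) {
                        source.read(buffer);
                    }
                }
                selector.selectedKeys().clear(); // keys must be cleared by the application

                buffer.flip();
                return early + ":" + readyKeys + ":" + StandardCharsets.UTF_8.decode(buffer);
            }
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With many channels registered on one selector, a single thread can service all of them, which is where this style earns its complexity.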
Now Asynchronous is where the confusion begins. The general concept of asynchronous only implies that the process continues while the background operation is performed; the mechanism by which this occurs is not specified. The term is ambiguous, as both non-blocking IO and threaded blocking IO can be considered to be asynchronous. Both allow concurrent operations; however, the resource requirements are different, and the code is substantially different. Because you have asked the question "What is Non-Blocking Asynchronous IO", I am going to use a stricter definition for asynchronous: a threaded system performing IO, which may or may not be non-blocking.
The general definition
Asynchronous IO - Programmatic IO which allows multiple concurrent IO operations to occur. IO operations are happening simultaneously, so that code is not waiting for data that is not ready.
The stricter definition
Asynchronous IO - Programmatic IO which uses threading or multiprocessing to allow concurrent IO operations to occur.
Now with those clearer definitions we have the following four types of IO paradigms.
Blocking IO - Standard single threaded IO in which the application waits for all IO operations to complete before moving on. Easy to code, no concurrency and so slow for applications that require multiple IO operations. The process or thread will sleep while waiting for the IO interrupt to occur.
Asynchronous IO - Threaded IO in which the application uses threads of execution to perform Blocking IO operations concurrently. Requires thread safe code, but is generally easier to read and write than the alternative. Gains the overhead of multiple threads, but has clear execution paths. May require the use of synchronized methods and containers.
Non-Blocking IO - Single threaded IO in which the application uses select to determine which IO operations are ready to advance, allowing the execution of other code or other IO operations while the OS processes concurrent IO. The process does not sleep while waiting for the IO interrupt, but takes on the responsibility to check for the IO flag on the file handle. Much more complicated code due to the need to check the IO flag with select, though it does not require thread-safe code or synchronized methods and containers. Low execution overhead at the expense of code complexity. Execution paths are convoluted.
Asynchronous Non-Blocking IO - A hybrid approach to IO aimed at reducing complexity by using threads, while maintaining scalability by using non-blocking IO operations where possible. This would be the most complex type of IO requiring synchronized methods and containers, as well as convoluted execution paths. This is not the type of IO that one should consider coding lightly, and is most often only used when using a library that will mask the complexity, something like Futures and Promises.
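As an illustration of the second paradigm in that list (Asynchronous IO in the stricter, threaded sense), here is a minimal sketch that runs several ordinary blocking reads concurrently on a thread pool. ThreadedBlockingReads and its methods are hypothetical names and the file contents are made up.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ThreadedBlockingReads {

    // Reads every file with plain blocking IO, one pool thread per file.
    // Results come back in the same order as the input list.
    static List<String> readAll(List<Path> files) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, files.size()));
        try {
            List<Callable<String>> tasks = new ArrayList<>();
            for (Path file : files) {
                // Each task blocks its own thread while the read is in progress.
                tasks.add(() -> new String(Files.readAllBytes(file), StandardCharsets.UTF_8));
            }
            List<String> contents = new ArrayList<>();
            for (Future<String> finished : pool.invokeAll(tasks)) {
                contents.add(finished.get());
            }
            return contents;
        } finally {
            pool.shutdown();
        }
    }

    // Creates two temporary files, reads them concurrently, and joins the contents.
    static String demo() {
        try {
            Path first = Files.createTempFile("threaded-read", ".txt");
            Path second = Files.createTempFile("threaded-read", ".txt");
            try {
                Files.write(first, "first".getBytes(StandardCharsets.UTF_8));
                Files.write(second, "second".getBytes(StandardCharsets.UTF_8));
                return String.join(",", readAll(Arrays.asList(first, second)));
            } finally {
                Files.deleteIfExists(first);
                Files.deleteIfExists(second);
            }
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The execution path inside each task stays linear and easy to read, which is the trade-off described above: clear code in exchange for the overhead of one thread per concurrent operation.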