ByteBuffer.allocate() vs. ByteBuffer.allocateDirect()

Ron Hitchens, in his excellent book Java NIO, offers what I think is a good answer to your question:

Operating systems perform I/O operations on memory areas. These memory areas, as far as the operating system is concerned, are contiguous sequences of bytes. It's no surprise then that only byte buffers are eligible to participate in I/O operations. Also recall that the operating system will directly access the address space of the process, in this case the JVM process, to transfer the data. This means that memory areas that are targets of I/O operations must be contiguous sequences of bytes. In the JVM, an array of bytes may not be stored contiguously in memory, or the Garbage Collector could move it at any time. Arrays are objects in Java, and the way data is stored inside that object could vary from one JVM implementation to another.

For this reason, the notion of a direct buffer was introduced. Direct buffers are intended for interaction with channels and native I/O routines. They make a best effort to store the byte elements in a memory area that a channel can use for direct, or raw, access by using native code to tell the operating system to drain or fill the memory area directly.

Direct byte buffers are usually the best choice for I/O operations. By design, they support the most efficient I/O mechanism available to the JVM. Nondirect byte buffers can be passed to channels, but doing so may incur a performance penalty. It's usually not possible for a nondirect buffer to be the target of a native I/O operation. If you pass a nondirect ByteBuffer object to a channel for write, the channel may implicitly do the following on each call:

  1. Create a temporary direct ByteBuffer object.
  2. Copy the content of the nondirect buffer to the temporary buffer.
  3. Perform the low-level I/O operation using the temporary buffer.
  4. The temporary buffer object goes out of scope and is eventually garbage collected.
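
To make the four steps concrete, here is a minimal sketch that performs them by hand. It is only an illustration of the idea: the class and method names (HeapWriteSketch, writeViaTemporaryDirect) are made up for this answer, and what a real channel implementation does internally may well differ.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.WritableByteChannel;
import java.nio.charset.StandardCharsets;

class HeapWriteSketch {
    // Hypothetical helper: imitates, in user code, the steps a channel
    // may take when it is handed a nondirect buffer.
    static int writeViaTemporaryDirect(WritableByteChannel channel, ByteBuffer heap)
            throws IOException {
        // 1. Create a temporary direct buffer large enough for the pending bytes.
        ByteBuffer temp = ByteBuffer.allocateDirect(heap.remaining());
        // 2. Copy the content; duplicate() leaves the caller's position untouched for now.
        temp.put(heap.duplicate());
        temp.flip();
        // 3. Perform the write using the temporary buffer.
        int written = channel.write(temp);
        // Advance the caller's buffer by the number of bytes actually written.
        heap.position(heap.position() + written);
        // 4. temp becomes unreachable here and is reclaimed later.
        return written;
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        WritableByteChannel channel = Channels.newChannel(sink);
        ByteBuffer heap = ByteBuffer.wrap("hello".getBytes(StandardCharsets.US_ASCII));
        int written = writeViaTemporaryDirect(channel, heap);
        System.out.println(written + " bytes written, source is direct: " + heap.isDirect());
    }
}
```

Doing this on every call is the copying and object churn the quoted text goes on to warn about.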

This can potentially result in buffer copying and object churn on every I/O, which are exactly the sorts of things we'd like to avoid. However, depending on the implementation, things may not be this bad. The runtime will likely cache and reuse direct buffers or perform other clever tricks to boost throughput. If you're simply creating a buffer for one-time use, the difference is not significant. On the other hand, if you will be using the buffer repeatedly in a high-performance scenario, you're better off allocating direct buffers and reusing them.
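
As a rough sketch of the "allocate once and reuse" advice, the loop below copies between two blocking channels with a single direct buffer. The class name ReusedDirectBufferCopy and the buffer size are my own choices for illustration, not something taken from the book.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.WritableByteChannel;

class ReusedDirectBufferCopy {
    // Copies everything from in to out and returns the number of bytes copied.
    // Assumes blocking channels; a non-blocking channel could return 0 from read().
    static long copy(ReadableByteChannel in, WritableByteChannel out, int bufferSize)
            throws IOException {
        // Allocated once, outside the loop, so the allocation cost is paid a single time.
        ByteBuffer buf = ByteBuffer.allocateDirect(bufferSize);
        long total = 0;
        while (in.read(buf) != -1) {
            buf.flip();                       // switch from filling to draining
            while (buf.hasRemaining()) {
                total += out.write(buf);      // a write may be partial, so loop
            }
            buf.clear();                      // ready to be filled again
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[10_000];
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        long copied = copy(Channels.newChannel(new ByteArrayInputStream(data)),
                           Channels.newChannel(sink), 4096);
        System.out.println("copied " + copied + " bytes");
    }
}
```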

Direct buffers are optimal for I/O, but they may be more expensive to create than nondirect byte buffers. The memory used by direct buffers is allocated by calling through to native, operating system-specific code, bypassing the standard JVM heap. Setting up and tearing down direct buffers could be significantly more expensive than heap-resident buffers, depending on the host operating system and JVM implementation. The memory-storage areas of direct buffers are not subject to garbage collection because they are outside the standard JVM heap.
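
For reference, this is how the two kinds of buffer are obtained and told apart in code. It is a small sketch (the BufferKinds class is only for illustration); note that the API leaves it unspecified whether a direct buffer has a backing array, so only the heap case is relied on below.

```java
import java.nio.ByteBuffer;

class BufferKinds {
    // Summarizes where a buffer lives and whether it exposes a backing array.
    static String describe(ByteBuffer buf) {
        return (buf.isDirect() ? "direct" : "heap")
                + ", capacity=" + buf.capacity()
                + ", hasArray=" + buf.hasArray();
    }

    public static void main(String[] args) {
        ByteBuffer heap = ByteBuffer.allocate(1024);         // backed by a byte[] on the JVM heap
        ByteBuffer direct = ByteBuffer.allocateDirect(1024); // storage allocated outside the heap

        System.out.println(describe(heap));
        System.out.println(describe(direct));

        // Both kinds are used through the same get/put API.
        heap.putInt(0, 42);
        direct.putInt(0, 42);
        System.out.println(heap.getInt(0) == direct.getInt(0));
    }
}
```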

The performance tradeoffs of using direct versus nondirect buffers can vary widely by JVM, operating system, and code design. By allocating memory outside the heap, you may subject your application to additional forces of which the JVM is unaware. When bringing additional moving parts into play, make sure that you're achieving the desired effect. I recommend the old software maxim: first make it work, then make it fast. Don't worry too much about optimization up front; concentrate first on correctness. The JVM implementation may be able to perform buffer caching or other optimizations that will give you the performance you need without a lot of unnecessary effort on your part.


There is no reason to expect direct buffers to be faster for access inside the JVM. Their advantage comes when you pass them to native code, such as the code behind channels of all kinds.


since DirectByteBuffers are a direct memory mapping at OS level

They aren't. They are just normal application process memory, but not subject to relocation during Java GC which simplifies things inside the JNI layer considerably. What you describe applies to MappedByteBuffer.
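
To show the distinction, here is a minimal sketch that obtains a MappedByteBuffer from FileChannel.map() next to an ordinary direct buffer. The temporary file and the names MappedVersusDirect and mapReadOnly are assumptions made up for the example.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class MappedVersusDirect {
    // Maps the whole file read-only. The mapping stays valid after the channel is closed.
    static MappedByteBuffer mapReadOnly(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            return ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("mapped-sketch", ".bin");
        file.toFile().deleteOnExit();
        Files.write(file, "hello".getBytes(StandardCharsets.US_ASCII));

        // Backed by the file's contents through an OS-level memory mapping.
        MappedByteBuffer mapped = mapReadOnly(file);
        // Plain process memory outside the Java heap; no file is involved.
        ByteBuffer direct = ByteBuffer.allocateDirect(5);

        System.out.println("first mapped byte: " + (char) mapped.get(0));
        System.out.println("mapped is direct: " + mapped.isDirect()
                + ", allocateDirect is direct: " + direct.isDirect());
    }
}
```

Both report isDirect() as true, but only the mapped buffer is tied to a file mapping.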

that it would perform quicker with get/put calls

The conclusion doesn't follow from the premiss; the premiss is false; and the conclusion is also false. They are faster once you get inside the JNI layer, and if you are reading and writing from the same DirectByteBuffer they are much faster, because the data never has to cross the JNI boundary at all.


Best to do your own measurements. The quick answer seems to be that sending from an allocateDirect() buffer takes 25% to 75% less time than sending from an allocate() buffer (tested by copying a file to /dev/null), depending on size, but that the allocation itself can be significantly slower (even by a factor of 100).
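
If you want to try this yourself, a rough timing sketch could look like the following. It is not a rigorous benchmark (no proper warm-up control, and the OS file cache will affect results; a harness such as JMH would be more trustworthy), and the file size, buffer size, round count and the CopyTiming class are arbitrary choices of mine. It copies to a temporary file rather than /dev/null so that it runs on any platform.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class CopyTiming {
    // Copies src to dst through the given buffer and returns the bytes copied.
    static long copy(Path src, Path dst, ByteBuffer buf) throws IOException {
        buf.clear();
        long total = 0;
        try (FileChannel in = FileChannel.open(src, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(dst, StandardOpenOption.WRITE,
                     StandardOpenOption.CREATE, StandardOpenOption.TRUNCATE_EXISTING)) {
            while (in.read(buf) != -1) {
                buf.flip();
                while (buf.hasRemaining()) {
                    total += out.write(buf);
                }
                buf.clear();
            }
        }
        return total;
    }

    // Times several copies that all reuse the same buffer.
    static long timeNanos(Path src, Path dst, ByteBuffer buf, int rounds) throws IOException {
        long start = System.nanoTime();
        for (int i = 0; i < rounds; i++) {
            copy(src, dst, buf);
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) throws IOException {
        Path src = Files.createTempFile("copy-src", ".bin");
        Path dst = Files.createTempFile("copy-dst", ".bin");
        src.toFile().deleteOnExit();
        dst.toFile().deleteOnExit();
        Files.write(src, new byte[8 * 1024 * 1024]);

        ByteBuffer heap = ByteBuffer.allocate(64 * 1024);
        ByteBuffer direct = ByteBuffer.allocateDirect(64 * 1024);

        // One untimed pass each, so class loading and the file cache are not charged to either.
        copy(src, dst, heap);
        copy(src, dst, direct);

        System.out.println("heap   ns: " + timeNanos(src, dst, heap, 10));
        System.out.println("direct ns: " + timeNanos(src, dst, direct, 10));
    }
}
```

Expect the numbers to vary with buffer size, JVM and operating system, which is the point of measuring.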

Sources:

  • Why the odd performance curve differential between ByteBuffer.allocate() and ByteBuffer.allocateDirect()

  • ByteBuffer.allocateDirect ridiculously slow

  • When to use Array, Buffer or direct Buffer