Is there any way to create a primitive array without initialization?

Java 9 actually starts to expose this via jdk.internal.misc.Unsafe.allocateUninitializedArray method. It would actually require JDK.Unsupported module declaration.


I've done some investigation. There is no legal way to create uninitialized array in Java. Even JNI NewXxxArray creates initialized arrays. So it is impossible to know exactly the cost of array zeroing. Nevertheless I've done some measurements:

1) 1000 byte arrays creation with different array size

        long t0 = System.currentTimeMillis();
        for(int i = 0; i < 1000; i++) {
//          byte[] a1 = new byte[1];
            byte[] a1 = new byte[1000000];
        }
        System.out.println(System.currentTimeMillis() - t0);

on my PC it gives < 1ms for byte[1] and ~500 ms for byte[1000000]. Sounds impressive to me.

2) We don't have a fast (native) method in JDK for filling arrays, Arrays.fill is too slow, so let's see at least how much 1000 copying of 1,000,000 size array takes with native System.arraycopy

    byte[] a1 = new byte[1000000];
    byte[] a2 = new byte[1000000];
    for(int i = 0; i < 1000; i++) {
        System.arraycopy(a1, 0, a2, 0, 1000000);
    }

It is 700 ms.

It gives me reasons to believe that a) creating long arrays is expensive b) it seems to be expensive because of useless initialization.

3) Let's take sun.misc.Unsafe http://www.javasourcecode.org/html/open-source/jdk/jdk-6u23/sun/misc/Unsafe.html. It is protected from external usage but not too much

    Field f = Unsafe.class.getDeclaredField("theUnsafe");
    f.setAccessible(true);
    Unsafe unsafe = (Unsafe)f.get(null);

Here is the cost of memory allocation test

    for(int i = 0; i < 1000; i++) {
        long m = u.allocateMemory(1000000);
    }

It takes < 1 ms, if you remember, for new byte[1000000] it took 500ms.

4) Unsafe has no direct methods to work with arrays. It needs to know class fields, but reflection shows no fields in an array. There is not much info about arrays internals, I guess it is JVM / platform specific. Nevertheless, it is, like any other Java Object, header + fields. On my PC/JVM it looks like

header - 8 bytes
int length - 4 bytes
long bufferAddress - 8 bytes

Now, using Unsafe, I will create byte[10], allocate a 10 byte memory buffer and use it as my array's elements:

    byte[] a = new byte[10];
    System.out.println(Arrays.toString(a));
    long mem = unsafe.allocateMemory(10);
    unsafe.putLong(a, 12, mem);
    System.out.println(Arrays.toString(a));

it prints

[0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[8, 15, -114, 24, 0, 0, 0, 0, 0, 0]

You can see thay array's data are not initialized.

Now I'll change our array length (though it still points to 10 bytes memory)

    unsafe.putInt(a, 8, 1000000);
    System.out.println(a.length);

it shows 1000000. It was just to prove that the idea works.

Now performance test. I will create an empty byte array a1, allocate a buffer of 1000000 bytes, assign this buffer to a1 an set a1.length = 10000000

    long t0 = System.currentTimeMillis();
    for(int i = 0; i < 1000; i++) {
        byte[] a1 = new byte[0];
        long mem1 = unsafe.allocateMemory(1000000);
        unsafe.putLong(a1, 12, mem);
        unsafe.putInt(a1, 8, 1000000);
    }
    System.out.println(System.currentTimeMillis() - t0);

it takes 10ms.

5) There are malloc and alloc in C++, malloc just allocates memory block , calloc also initializes it with zeroes.

cpp

...
JNIEXPORT void JNICALL Java_Test_malloc(JNIEnv *env, jobject obj, jint n) {
     malloc(n);
} 

java

private native static void malloc(int n);

for (int i = 0; i < 500; i++) {
    malloc(1000000);
}

results malloc - 78 ms; calloc - 468 ms

Conclusions

  1. It seems that Java array creation is slow because of useless element zeroing.
  2. We cannot change it, but Oracle can. No need to change anything in JLS, just add native methods to java.lang.reflect.Array like

    public static native xxx[] newUninitialziedXxxArray(int size);

for all primitive numeric types (byte - double) and char type. It could be used all over the JDK, like in java.util.Arrays

    public static int[] copyOf(int[] original, int newLength) {
        int[] copy = Array.newUninitializedIntArray(newLength);
        System.arraycopy(original, 0, copy, 0, Math.min(original.length, newLength));
        ...

or java.lang.String

   public String concat(String str) {
        ...   
        char[] buf = Array.newUninitializedCharArray(count + otherLen);
        getChars(0, count, buf, 0);
        ...

I'm going to move this to an answer because it probably should be.

An "Array" in java is not what you think it is. It's not just a pointer to a chunk of contiguous memory on the stack or heap.

An Array in Java is an Object just like everything else (except primitives) and is on the heap. When you call new int[100000] you're creating a new object just like every other object, and it gets initialized, etc.

The JLS provides all the specific info about this:

http://docs.oracle.com/javase/specs/jls/se5.0/html/arrays.html

So, no. You can't avoid "initializing" an array. That's just not how Java works. There's simply no such thing as uninitialized heap memory; many people call that a "feature" as it prevents you from accessing uninitialized memory.

Tags:

Java

Arrays