In java, is it more efficient to use byte or short instead of int and float instead of double?
Am I wrong in assuming it should be faster and more efficient? I'd hate to go through and change everything in a massive program to find out I wasted my time.
Short answer
Yes, you are wrong. In most cases, it makes little difference in terms of space used.
It is not worth trying to optimize this ... unless you have clear evidence that optimization is needed. And if you do need to optimize memory usage of object fields in particular, you will probably need to take other (more effective) measures.
Longer answer
The Java Virtual Machine models stacks and object fields using offsets that are (in effect) multiples of a 32 bit primitive cell size. So when you declare a local variable or object field as (say) a byte
, the variable / field will be stored in a 32 bit cell, just like an int
.
There are two exceptions to this:
long
anddouble
values require 2 primitive 32-bit cells- arrays of primitive types are represent in packed form, so that (for example) an array of bytes hold 4 bytes per 32bit word.
So it might be worth optimizing use of long
and double
... and large arrays of primitives. But in general no.
In theory, a JIT might be able to optimize this, but in practice I've never heard of a JIT that does. One impediment is that the JIT typically cannot run until after there instances of the class being compiled have been created. If the JIT optimized the memory layout, you could have two (or more) "flavors" of object of the same class ... and that would present huge difficulties.
Revisitation
Looking at the benchmark results in @meriton's answer, it appears that using short
and byte
instead of int
incurs a performance penalty for multiplication. Indeed, if you consider the operations in isolation, the penalty is significant. (You shouldn't consider them in isolation ... but that's another topic.)
I think the explanation is that JIT is probably doing the multiplications using 32bit multiply instructions in each case. But in the byte
and short
case, it executes extra instructions to convert the intermediate 32 bit value to a byte
or short
in each loop iteration. (In theory, that conversion could be done once at the end of the loop ... but I doubt that the optimizer would be able to figure that out.)
Anyway, this does point to another problem with switching to short
and byte
as an optimization. It could make performance worse ... in an algorithm that is arithmetic and compute intensive.
That depends on the implementation of the JVM, as well as the underlying hardware. Most modern hardware will not fetch single bytes from memory (or even from the first level cache), i.e. using the smaller primitive types generally does not reduce memory bandwidth consumption. Likewise, modern CPU have a word size of 64 bits. They can perform operations on less bits, but that works by discarding the extra bits, which isn't faster either.
The only benefit is that smaller primitive types can result in a more compact memory layout, most notably when using arrays. This saves memory, which can improve locality of reference (thus reducing the number of cache misses) and reduce garbage collection overhead.
Generally speaking however, using the smaller primitive types is not faster.
To demonstrate that, behold the following benchmark:
package tools.bench;
import java.math.BigDecimal;
public abstract class Benchmark {
final String name;
public Benchmark(String name) {
this.name = name;
}
abstract int run(int iterations) throws Throwable;
private BigDecimal time() {
try {
int nextI = 1;
int i;
long duration;
do {
i = nextI;
long start = System.nanoTime();
run(i);
duration = System.nanoTime() - start;
nextI = (i << 1) | 1;
} while (duration < 100000000 && nextI > 0);
return new BigDecimal((duration) * 1000 / i).movePointLeft(3);
} catch (Throwable e) {
throw new RuntimeException(e);
}
}
@Override
public String toString() {
return name + "\t" + time() + " ns";
}
public static void main(String[] args) throws Exception {
Benchmark[] benchmarks = {
new Benchmark("int multiplication") {
@Override int run(int iterations) throws Throwable {
int x = 1;
for (int i = 0; i < iterations; i++) {
x *= 3;
}
return x;
}
},
new Benchmark("short multiplication") {
@Override int run(int iterations) throws Throwable {
short x = 0;
for (int i = 0; i < iterations; i++) {
x *= 3;
}
return x;
}
},
new Benchmark("byte multiplication") {
@Override int run(int iterations) throws Throwable {
byte x = 0;
for (int i = 0; i < iterations; i++) {
x *= 3;
}
return x;
}
},
new Benchmark("int[] traversal") {
@Override int run(int iterations) throws Throwable {
int[] x = new int[iterations];
for (int i = 0; i < iterations; i++) {
x[i] = i;
}
return x[x[0]];
}
},
new Benchmark("short[] traversal") {
@Override int run(int iterations) throws Throwable {
short[] x = new short[iterations];
for (int i = 0; i < iterations; i++) {
x[i] = (short) i;
}
return x[x[0]];
}
},
new Benchmark("byte[] traversal") {
@Override int run(int iterations) throws Throwable {
byte[] x = new byte[iterations];
for (int i = 0; i < iterations; i++) {
x[i] = (byte) i;
}
return x[x[0]];
}
},
};
for (Benchmark bm : benchmarks) {
System.out.println(bm);
}
}
}
which prints on my somewhat old notebook (adding spaces to adjust columns):
int multiplication 1.530 ns
short multiplication 2.105 ns
byte multiplication 2.483 ns
int[] traversal 5.347 ns
short[] traversal 4.760 ns
byte[] traversal 2.064 ns
As you can see, the performance differences are quite minor. Optimizing algorithms is far more important than the choice of primitive type.
Using byte
instead of int
can increase performance if you are using them in a huge amount. Here is an experiment:
import java.lang.management.*;
public class SpeedTest {
/** Get CPU time in nanoseconds. */
public static long getCpuTime() {
ThreadMXBean bean = ManagementFactory.getThreadMXBean();
return bean.isCurrentThreadCpuTimeSupported() ? bean
.getCurrentThreadCpuTime() : 0L;
}
public static void main(String[] args) {
long durationTotal = 0;
int numberOfTests=0;
for (int j = 1; j < 51; j++) {
long beforeTask = getCpuTime();
// MEASURES THIS AREA------------------------------------------
long x = 20000000;// 20 millions
for (long i = 0; i < x; i++) {
TestClass s = new TestClass();
}
// MEASURES THIS AREA------------------------------------------
long duration = getCpuTime() - beforeTask;
System.out.println("TEST " + j + ": duration = " + duration + "ns = "
+ (int) duration / 1000000);
durationTotal += duration;
numberOfTests++;
}
double average = durationTotal/numberOfTests;
System.out.println("-----------------------------------");
System.out.println("Average Duration = " + average + " ns = "
+ (int)average / 1000000 +" ms (Approximately)");
}
}
This class tests the speed of creating a new TestClass
. Each tests does it 20 million times and there are 50 tests.
Here is the TestClass:
public class TestClass {
int a1= 5;
int a2= 5;
int a3= 5;
int a4= 5;
int a5= 5;
int a6= 5;
int a7= 5;
int a8= 5;
int a9= 5;
int a10= 5;
int a11= 5;
int a12=5;
int a13= 5;
int a14= 5;
}
I've run the SpeedTest
class and in the end got this:
Average Duration = 8.9625E8 ns = 896 ms (Approximately)
Now I'm changing the ints into bytes in the TestClass and running it again. Here is the result:
Average Duration = 6.94375E8 ns = 694 ms (Approximately)
I believe this experiment shows that if you are instancing a huge amount of variables, using byte instead of int can increase efficiency