Way to get number of digits in an int?
Two comments on your benchmark: Java is a complex environment, what with just-in-time compiling and garbage collection and so forth, so to get a fair comparison, whenever I run a benchmark, I always: (a) enclose the two tests in a loop that runs them in sequence 5 or 10 times. Quite often the runtime on the second pass through the loop is quite different from the first. And (b) After each "approach", I do a System.gc() to try to trigger a garbage collection. Otherwise, the first approach might generate a bunch of objects, but not quite enough to force a garbage collection, then the second approach creates a few objects, the heap is exhausted, and garbage collection runs. Then the second approach is "charged" for picking up the garbage left by the first approach. Very unfair!
That said, neither of the above made a significant difference in this example.
With or without those modifications, I got very different results than you did. When I ran this, yes, the toString approach gave run times of 6400 to 6600 millis, while the log approach topok 20,000 to 20,400 millis. Instead of being slightly faster, the log approach was 3 times slower for me.
Note that the two approaches involve very different costs, so this isn't totally shocking: The toString approach will create a lot of temporary objects that have to be cleaned up, while the log approach takes more intense computation. So maybe the difference is that on a machine with less memory, toString requires more garbage collection rounds, while on a machine with a slower processor, the extra computation of log would be more painful.
I also tried a third approach. I wrote this little function:
static int numlength(int n)
{
if (n == 0) return 1;
int l;
n=Math.abs(n);
for (l=0;n>0;++l)
n/=10;
return l;
}
That ran in 1600 to 1900 millis -- less than 1/3 of the toString approach, and 1/10 the log approach on my machine.
If you had a broad range of numbers, you could speed it up further by starting out dividing by 1,000 or 1,000,000 to reduce the number of times through the loop. I haven't played with that.
The fastest approach: divide and conquer.
Assuming your range is 0 to MAX_INT, then you have 1 to 10 digits. You can approach this interval using divide and conquer, with up to 4 comparisons per each input. First, you divide [1..10] into [1..5] and [6..10] with one comparison, and then each length 5 interval you divide using one comparison into one length 3 and one length 2 interval. The length 2 interval requires one more comparison (total 3 comparisons), the length 3 interval can be divided into length 1 interval (solution) and a length 2 interval. So, you need 3 or 4 comparisons.
No divisions, no floating point operations, no expensive logarithms, only integer comparisons.
Code (long but fast):
if (n < 100000) {
// 5 or less
if (n < 100){
// 1 or 2
if (n < 10)
return 1;
else
return 2;
} else {
// 3 or 4 or 5
if (n < 1000)
return 3;
else {
// 4 or 5
if (n < 10000)
return 4;
else
return 5;
}
}
} else {
// 6 or more
if (n < 10000000) {
// 6 or 7
if (n < 1000000)
return 6;
else
return 7;
} else {
// 8 to 10
if (n < 100000000)
return 8;
else {
// 9 or 10
if (n < 1000000000)
return 9;
else
return 10;
}
}
}
Benchmark (after JVM warm-up) - see code below to see how the benchmark was run:
- baseline method (with String.length): 2145ms
- log10 method: 711ms = 3.02 times as fast as baseline
- repeated divide: 2797ms = 0.77 times as fast as baseline
- divide-and-conquer: 74ms = 28.99
times as fast as baseline
Full code:
public static void main(String[] args) throws Exception {
// validate methods:
for (int i = 0; i < 1000; i++)
if (method1(i) != method2(i))
System.out.println(i);
for (int i = 0; i < 1000; i++)
if (method1(i) != method3(i))
System.out.println(i + " " + method1(i) + " " + method3(i));
for (int i = 333; i < 2000000000; i += 1000)
if (method1(i) != method3(i))
System.out.println(i + " " + method1(i) + " " + method3(i));
for (int i = 0; i < 1000; i++)
if (method1(i) != method4(i))
System.out.println(i + " " + method1(i) + " " + method4(i));
for (int i = 333; i < 2000000000; i += 1000)
if (method1(i) != method4(i))
System.out.println(i + " " + method1(i) + " " + method4(i));
// work-up the JVM - make sure everything will be run in hot-spot mode
allMethod1();
allMethod2();
allMethod3();
allMethod4();
// run benchmark
Chronometer c;
c = new Chronometer(true);
allMethod1();
c.stop();
long baseline = c.getValue();
System.out.println(c);
c = new Chronometer(true);
allMethod2();
c.stop();
System.out.println(c + " = " + StringTools.formatDouble((double)baseline / c.getValue() , "0.00") + " times as fast as baseline");
c = new Chronometer(true);
allMethod3();
c.stop();
System.out.println(c + " = " + StringTools.formatDouble((double)baseline / c.getValue() , "0.00") + " times as fast as baseline");
c = new Chronometer(true);
allMethod4();
c.stop();
System.out.println(c + " = " + StringTools.formatDouble((double)baseline / c.getValue() , "0.00") + " times as fast as baseline");
}
private static int method1(int n) {
return Integer.toString(n).length();
}
private static int method2(int n) {
if (n == 0)
return 1;
return (int)(Math.log10(n) + 1);
}
private static int method3(int n) {
if (n == 0)
return 1;
int l;
for (l = 0 ; n > 0 ;++l)
n /= 10;
return l;
}
private static int method4(int n) {
if (n < 100000) {
// 5 or less
if (n < 100) {
// 1 or 2
if (n < 10)
return 1;
else
return 2;
} else {
// 3 or 4 or 5
if (n < 1000)
return 3;
else {
// 4 or 5
if (n < 10000)
return 4;
else
return 5;
}
}
} else {
// 6 or more
if (n < 10000000) {
// 6 or 7
if (n < 1000000)
return 6;
else
return 7;
} else {
// 8 to 10
if (n < 100000000)
return 8;
else {
// 9 or 10
if (n < 1000000000)
return 9;
else
return 10;
}
}
}
}
private static int allMethod1() {
int x = 0;
for (int i = 0; i < 1000; i++)
x = method1(i);
for (int i = 1000; i < 100000; i += 10)
x = method1(i);
for (int i = 100000; i < 1000000; i += 100)
x = method1(i);
for (int i = 1000000; i < 2000000000; i += 200)
x = method1(i);
return x;
}
private static int allMethod2() {
int x = 0;
for (int i = 0; i < 1000; i++)
x = method2(i);
for (int i = 1000; i < 100000; i += 10)
x = method2(i);
for (int i = 100000; i < 1000000; i += 100)
x = method2(i);
for (int i = 1000000; i < 2000000000; i += 200)
x = method2(i);
return x;
}
private static int allMethod3() {
int x = 0;
for (int i = 0; i < 1000; i++)
x = method3(i);
for (int i = 1000; i < 100000; i += 10)
x = method3(i);
for (int i = 100000; i < 1000000; i += 100)
x = method3(i);
for (int i = 1000000; i < 2000000000; i += 200)
x = method3(i);
return x;
}
private static int allMethod4() {
int x = 0;
for (int i = 0; i < 1000; i++)
x = method4(i);
for (int i = 1000; i < 100000; i += 10)
x = method4(i);
for (int i = 100000; i < 1000000; i += 100)
x = method4(i);
for (int i = 1000000; i < 2000000000; i += 200)
x = method4(i);
return x;
}
Again, benchmark:
- baseline method (with String.length): 2145ms
- log10 method: 711ms = 3.02 times as fast as baseline
- repeated divide: 2797ms = 0.77 times as fast as baseline
- divide-and-conquer: 74ms = 28.99 times as fast as baseline
Edit
After I wrote the benchmark, I took a sneak peak into Integer.toString from Java 6, and I found that it uses:
final static int [] sizeTable = { 9, 99, 999, 9999, 99999, 999999, 9999999,
99999999, 999999999, Integer.MAX_VALUE };
// Requires positive x
static int stringSize(int x) {
for (int i=0; ; i++)
if (x <= sizeTable[i])
return i+1;
}
I benchmarked it against my divide-and-conquer solution:
- divide-and-conquer: 104ms
- Java 6 solution - iterate and compare: 406ms
Mine is about 4x as fast as the Java 6 solution.
Your String-based solution is perfectly OK, there is nothing "un-neat" about it. You have to realize that mathematically, numbers don't have a length, nor do they have digits. Length and digits are both properties of a physical representation of a number in a specific base, i.e. a String.
A logarithm-based solution does (some of) the same things the String-based one does internally, and probably does so (insignificantly) faster because it only produces the length and ignores the digits. But I wouldn't actually consider it clearer in intent - and that's the most important factor.
The logarithm is your friend:
int n = 1000;
int length = (int)(Math.log10(n)+1);
NB: only valid for n > 0.