Integer.valueOf Arabic number works fine but Float.valueOf the same number gives NumberFormatException
It seems that Float.parseFloat()
does not support Eastern Arabic numerals. As a workaround, you can use the NumberFormat
class:
import java.text.NumberFormat;
import java.util.Locale;

Locale EASTERN_ARABIC_NUMBERS_LOCALE = new Locale.Builder()
        .setLanguage("ar")
        .setExtension('u', "nu-arab")
        .build();

// note: NumberFormat.parse(...) throws the checked java.text.ParseException
float f = NumberFormat.getInstance(EASTERN_ARABIC_NUMBERS_LOCALE)
        .parse("۱٫۵")
        .floatValue();
System.out.println(f);
OUTPUT:
1.5
Answer
In Float.valueOf("۱")
there is no check for other languages or character sets; it only accepts the ASCII digits 0-9.
Integer.valueOf, by contrast,
uses Character.digit() to get the numeric value of each digit in the string.
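The contrast can be reproduced in a few lines (a minimal sketch; the class name is mine):

```java
public class ArabicDigitDemo {
    public static void main(String[] args) {
        // Integer.parseInt accepts any Unicode decimal digit via Character.digit()
        System.out.println(Integer.parseInt("۱")); // prints 1

        // Float.parseFloat accepts only the ASCII digits 0-9
        try {
            Float.parseFloat("۱");
        } catch (NumberFormatException e) {
            System.out.println("NumberFormatException for Float.parseFloat(\"۱\")");
        }
    }
}
```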
Research/Explanation
I debugged the statement Float.valueOf("۱")
with the IntelliJ debugger. If you step into FloatingDecimal.java, it turns out this code determines which characters count as digits:
digitLoop:
while (i < len) {
    c = in.charAt(i);
    if (c >= '1' && c <= '9') {
        digits[nDigits++] = c;
        nTrailZero = 0;
    } else if (c == '0') {
        digits[nDigits++] = c;
        nTrailZero++;
    } else if (c == '.') {
        if (decSeen) {
            // already saw one ., this is the 2nd.
            throw new NumberFormatException("multiple points");
        }
        decPt = i;
        if (signSeen) {
            decPt -= 1;
        }
        decSeen = true;
    } else {
        break digitLoop;
    }
    i++;
}
As you can see, there is no check for other character sets; only the ASCII digits 0-9 (plus the decimal point) are accepted.
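The difference between the two kinds of checks can be shown directly (a small sketch; variable names are mine):

```java
public class AsciiRangeVsCharacterDigit {
    public static void main(String[] args) {
        char ascii = '1';   // U+0031, ASCII DIGIT ONE
        char arabic = '۱';  // U+06F1, EXTENDED ARABIC-INDIC DIGIT ONE

        // FloatingDecimal's test is a plain code-point range comparison
        System.out.println(ascii >= '0' && ascii <= '9');   // true
        System.out.println(arabic >= '0' && arabic <= '9'); // false -> break digitLoop

        // Character.digit() understands both
        System.out.println(Character.digit(ascii, 10));  // 1
        System.out.println(Character.digit(arabic, 10)); // 1
    }
}
```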
While stepping through Integer.valueOf
execution,
public static int parseInt(String s, int radix)
executes with s = "۱"
and radix = 10.
The parseInt method then calls Character.digit('۱', 10)
to get the digit value, which is 1.
See Character.digit()
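Character.digit() resolves the numeric value of decimal digits from any Unicode script, and returns -1 for characters that are not valid digits in the given radix. A short illustration (the digit characters are my choice):

```java
public class CharacterDigitExamples {
    public static void main(String[] args) {
        System.out.println(Character.digit('1', 10)); // ASCII                 -> 1
        System.out.println(Character.digit('۱', 10)); // Extended Arabic-Indic -> 1
        System.out.println(Character.digit('١', 10)); // Arabic-Indic          -> 1
        System.out.println(Character.digit('१', 10)); // Devanagari            -> 1
        System.out.println(Character.digit('x', 10)); // not a decimal digit   -> -1
    }
}
```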
The specification of Float.valueOf(String)
says:
Leading and trailing whitespace characters in s are ignored. Whitespace is removed as if by the String.trim() method; that is, both ASCII space and control characters are removed. The rest of s should constitute a FloatValue as described by the lexical syntax rules:
FloatValue:
    Sign_opt NaN
    Sign_opt Infinity
    Sign_opt FloatingPointLiteral
    Sign_opt HexFloatingPointLiteral
    SignedInteger
    ...
The closest lexical rule to what you have is SignedInteger
, which consists of an optional sign, and then Digits
, which can only be 0-9
.
Digits:
    Digit
    Digit [DigitsAndUnderscores] Digit
Digit:
    0
    NonZeroDigit
NonZeroDigit: (one of)
    1 2 3 4 5 6 7 8 9
On the other hand, Integer.valueOf(String)
refers to Integer.parseInt(String)
, which simply says:
The characters in the string must all be decimal digits, except that the first character may be an ASCII minus sign
"Decimal digits" is broader than 0-9; anything in the Unicode DECIMAL_DIGIT_NUMBER
category can be used, for example "१२३" (shameless plug).
More precisely, whether a character counts as a digit is determined by whether Character.digit(char, int) returns a nonnegative value.
So, this is behaving as specified; whether you consider this to be a correct specification is another matter.
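To round this off, here is a sketch (class name is mine) confirming that digits from the DECIMAL_DIGIT_NUMBER category pass Integer.parseInt but not Float.parseFloat:

```java
public class DecimalDigitCategoryDemo {
    public static void main(String[] args) {
        // Both characters belong to the Unicode DECIMAL_DIGIT_NUMBER category
        System.out.println(Character.getType('۱') == Character.DECIMAL_DIGIT_NUMBER); // true
        System.out.println(Character.getType('१') == Character.DECIMAL_DIGIT_NUMBER); // true

        // Integer.parseInt accepts such digits...
        System.out.println(Integer.parseInt("१२३")); // 123

        // ...while Float.parseFloat, per its spec, does not
        try {
            Float.parseFloat("१२३");
        } catch (NumberFormatException e) {
            System.out.println("rejected by Float.parseFloat");
        }
    }
}
```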