Parse a C++14 integer literal
x86 (32-bit) machine code, 59 57 bytes
This function takes esi as a pointer to a null-terminated string and returns the value in edx. (The listing below is GAS input in AT&T syntax.)
.globl parse_cxx14_int
.text
parse_cxx14_int:
push $10
pop %ecx # store 10 as base
xor %eax,%eax # initialize high bits of digit reader
cdq # also initialize result accumulator edx to 0
lodsb # fetch first character
cmp $'0', %al
jne .Lparseloop2
lodsb
and $~32, %al # uppercase letters (and as side effect,
# digits are translated to N+16)
jz .Lend # "0" string
cmp $'B', %al # after '0' have either digit, apostrophe,
# 'b'/'B' or 'x'/'X'
je .Lbin
jg .Lhex
dec %ecx
dec %ecx # update base to 8
jmp .Lprocessdigit # process octal digit that we just read (or
# skip ' if that is what we just read)
.Lbin:
sub $14, %ecx # with below will update base to 2
.Lhex:
add $6, %ecx # update base to 16
.Lparseloop:
lodsb # fetch next character
.Lparseloop2:
and $~32, %al # uppercase letters (and as side effect,
# digits are translated to N+16)
jz .Lend
.Lprocessdigit:
cmp $7, %al # skip ' (ASCII 39 which would have been
# translated to 7 above)
je .Lparseloop
test $64, %al # distinguish letters and numbers
jz .Lnum
sub $39, %al # with below will subtract 55 so e.g. 'A'==65
# will become 10
.Lnum:
sub $16, %al # translate digits to numerical value
imul %ecx, %edx
# movzbl %al, %eax
add %eax, %edx # accum = accum * base + newdigit
jmp .Lparseloop
.Lend:
ret
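For readers who would rather not trace the jumps, the control flow of the listing above can be modelled in C++. This is only a sketch and not part of the original answer: the name parse_cxx14_int_ref is made up for illustration, and the model assumes valid ASCII input (the assembly uses a signed compare in jg, so bytes above 0x7F would behave differently).

```cpp
#include <cstdint>

// Hypothetical reference model of the assembly routine above.
// Assumption: the input is a valid, ASCII-only C++14 integer literal.
std::uint32_t parse_cxx14_int_ref(const char* s) {
    std::uint32_t base = 10;  // plays the role of ecx
    std::uint32_t acc = 0;    // plays the role of edx
    unsigned char c = static_cast<unsigned char>(*s++);  // first lodsb

    if (c == '0') {
        // Clearing bit 5 uppercases letters, maps digit N to N+16,
        // maps the apostrophe (39) to 7 and leaves the terminator at 0.
        c = static_cast<unsigned char>(*s++ & 0xDF);
        if (c == 0) return 0;              // the string was just "0"
        if (c == 'B') {                    // "0b" / "0B"
            base = 2;
            c = static_cast<unsigned char>(*s++ & 0xDF);
        } else if (c > 'B') {              // "0x" / "0X"
            base = 16;
            c = static_cast<unsigned char>(*s++ & 0xDF);
        } else {
            base = 8;                      // c is already an octal digit or '
        }
    } else {
        c = static_cast<unsigned char>(c & 0xDF);
    }

    while (c != 0) {
        if (c != 7) {                      // 7 is the masked apostrophe: skip it
            // Letters have bit 6 set ('A' == 65 becomes 10);
            // masked digits are N+16.
            std::uint32_t digit = (c & 0x40) ? c - 55u : c - 16u;
            acc = acc * base + digit;
        }
        c = static_cast<unsigned char>(*s++ & 0xDF);
    }
    return acc;
}
```

Calling parse_cxx14_int_ref("0xFF'ff") walks the same path as the .Lhex branch followed by the digit loop, which makes it handy for cross-checking the assembly against a few literals.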
And a disassembly with byte counts - in Intel format this time, in case you prefer that one.
Disassembly of section .text:
00000000 <parse_cxx14_int>:
0: 6a 0a push 0xa
2: 59 pop ecx
3: 31 c0 xor eax,eax
5: 99 cdq
6: ac lods al,BYTE PTR ds:[esi]
7: 3c 30 cmp al,0x30
9: 75 16 jne 21 <parse_cxx14_int+0x21>
b: ac lods al,BYTE PTR ds:[esi]
c: 24 df and al,0xdf
e: 74 28 je 38 <parse_cxx14_int+0x38>
10: 3c 42 cmp al,0x42
12: 74 06 je 1a <parse_cxx14_int+0x1a>
14: 7f 07 jg 1d <parse_cxx14_int+0x1d>
16: 49 dec ecx
17: 49 dec ecx
18: eb 0b jmp 25 <parse_cxx14_int+0x25>
1a: 83 e9 0e sub ecx,0xe
1d: 83 c1 06 add ecx,0x6
20: ac lods al,BYTE PTR ds:[esi]
21: 24 df and al,0xdf
23: 74 13 je 38 <parse_cxx14_int+0x38>
25: 3c 07 cmp al,0x7
27: 74 f7 je 20 <parse_cxx14_int+0x20>
29: a8 40 test al,0x40
2b: 74 02 je 2f <parse_cxx14_int+0x2f>
2d: 2c 27 sub al,0x27
2f: 2c 10 sub al,0x10
31: 0f af d1 imul edx,ecx
34: 01 c2 add edx,eax
36: eb e8 jmp 20 <parse_cxx14_int+0x20>
38: c3 ret
And in case you want to try it, here is the C++ test driver code that I linked with it (including the calling convention specification in GCC asm syntax):
#include <cstdio>
#include <string>
#include <iostream>
inline int parse_cxx14_int_wrap(const char *s) {
int result;
const char* end;
__asm__("call parse_cxx14_int" :
"=d"(result), "=S"(end) :
"1"(s) :
"eax", "ecx", "cc", "memory"); // "memory": the routine reads the string through esi
return result;
}
int main(int argc, char* argv[]) {
std::string s;
while (std::getline(std::cin, s))
std::printf("%-16s -> %d\n", s.c_str(), parse_cxx14_int_wrap(s.c_str()));
return 0;
}
-1 byte due to comment by Peter Cordes
-1 byte from updating to use two decrements to change 10 to 8
JavaScript (Babel Node), 26 bytes
lol x2
_=>eval(_.split`'`.join``)
Try it online!
C++ (gcc), 141 138 134 120 bytes
This is a function that takes an array of characters (specified as a pair of pointers to the start and end - using the pair of iterators idiom) and returns the number. Note that the function mutates the input array.
(This does rely on the behavior of gcc/libstdc++ that #include<cstdlib> also places the functions in global scope. For strictly standard compliant code, replace with #include<stdlib.h> for a cost of one more character.)
Brief description: The code first uses std::remove to filter out ' characters (ASCII 39). strtol with a base of 0 already handles the decimal, octal, and hexadecimal cases, so the only other case to check for is a leading 0b or 0B. If one is present, the base for strtol is set to 2 and parsing starts after the leading 2 characters.
#import<algorithm>
#import<cstdlib>
int f(char*s,char*e){e=s[*std::remove(s,e,39)=1]&31^2?s:s+2;return strtol(e,0,e-s);}
Try it online.
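An ungolfed counterpart may make the description easier to follow. This is a sketch under stated assumptions rather than the answer's actual code: parse_ungolfed is a made-up name, it returns long (what strtol returns) instead of int, it assumes e points at a writable byte such as the string's terminator, and unlike the golfed version it also checks that the first character is 0 before treating the literal as binary.

```cpp
#include <algorithm>
#include <cstdlib>

// Hypothetical ungolfed counterpart of f(); like the original, it mutates
// the buffer [s, e). Assumption: *e is writable (e.g. the terminating '\0').
long parse_ungolfed(char* s, char* e) {
    // Step 1: drop the digit separators; std::remove returns the new logical end.
    char* new_end = std::remove(s, e, '\'');
    *new_end = '\0';  // re-terminate the shortened string

    // Step 2: strtol with base 0 understands decimal, octal (leading 0) and
    // hexadecimal (leading 0x/0X), so only the binary prefix needs special care.
    bool binary = s[0] == '0' && (s[1] == 'b' || s[1] == 'B');
    return binary ? std::strtol(s + 2, nullptr, 2)
                  : std::strtol(s, nullptr, 0);
}
```

The golfed version folds the prefix test into s[1]&31^2 (both 'b' and 'B' have low five bits equal to 2) and reuses the pointer difference e-s, which is either 0 or 2, as the base argument.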
Saved 3 bytes due to suggestion by ceilingcat and some more golfing that followed.
Saved 4 bytes due to suggestions by gastropner.
-2 bytes by Lucas
-12 bytes by l4m2