How do I print an ASCII character by different code points in Bash?
Hex:
printf '\x4a'
Dec:
printf "\\$(printf %o 74)"
Alternative for hex :-)
xxd -r <<<'0 4a'
In general, the shell understands hexadecimal, octal and decimal numbers in variables, provided the variables have been declared as integers:
$ declare -i v1 v2 v3 v4 v5 v6 v7
$ v1=0112
$ v2=74
$ v3=0x4a
$ v4=8#112
$ v5=10#74
$ v6=16#4a
$ v7=18#gg
$ echo "$v1 $v2 $v3 $v4 $v5 $v6 $v7"
74 74 74 74 74 74 304
Or they are the result of an "Arithmetic Expansion":
$ : $(( v1=0112, v2=74, v3=0x4a, v4=8#112, v5=10#74, v6=16#4a, v7=18#gg ))
$ echo "$v1 $v2 $v3 $v4 $v5 $v6 $v7"
74 74 74 74 74 74 304
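As a small sketch of the same idea, a helper can normalize any of these notations to decimal by letting arithmetic expansion do the parsing. The name to_dec is hypothetical, not something from bash itself, and because the argument is evaluated as an arithmetic expression it should only be given trusted input.

```shell
# to_dec is a hypothetical helper name, not a bash builtin.
# It lets bash arithmetic expansion parse octal (0112), hex (0x4a)
# and base#n (16#4a) constants, then prints the value in decimal.
to_dec() {
    printf '%d\n' "$(( $1 ))"
}

to_dec 0x4a
to_dec 8#112
to_dec 18#gg
```

The three calls should print 74, 74 and 304, matching the table above.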
So you only need one way to print the character that corresponds to a variable's value.
Here are two possible ways:
$ var=$((0x65))
$ printf '%b\n' "\\$(printf '0%o' "$var")"
e
$ declare -i var
$ var=0x65; printf '%b\n' "\U$(printf '%08x' "$var")"
e
Two printf calls are needed: the inner one converts the value into an octal or hexadecimal string, and the outer one actually prints the character.
The second way (with \U) will print any Unicode code point, provided your console is set up correctly.
For example:
$ var=0x2603; printf '%b\n' "\U$(printf '%08x' "$var")"
☃
A snowman.
The character whose UTF-8 representation is f0 9f 90 ae is 0x1F42E.
Search for cow face site:fileformat.info to find it:
$ var=0x1F42E; printf '%b\n' "\U$(printf '%08x' "$var")"
🐮
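The \U approach can be wrapped in a small function. This is only a sketch: the name chr is hypothetical, and it assumes a bash recent enough to support \U in printf '%b', plus a UTF-8 locale for anything outside ASCII.

```shell
# chr is a hypothetical helper name, not a bash builtin.
# It accepts any arithmetic constant (74, 0x4a, 16#4a, ...) and prints
# the matching character through the \U escape of printf '%b'.
# Assumption: the running bash supports \U; non-ASCII output also
# needs a UTF-8 locale.
chr() {
    local cp=$(( $1 ))
    printf '%b' "\U$(printf '%08x' "$cp")"
}

chr 0x65; echo
chr 16#4a; echo
```

The two calls should print e and J. Calling chr 0x2603 should give the snowman when the locale and terminal allow it.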
Note: the Unicode way has a problem in bash versions before 4.3 (fixed in 4.3 and later): characters at code points 128 to 255 (decimal) may be printed incorrectly.
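One possible way around that limitation, offered as a sketch rather than as the answer's own method, is to skip \U and build the UTF-8 bytes by hand, emitting them as octal escapes. The function name utf8_chr is hypothetical, and the code does not reject surrogates or values above 0x10FFFF.

```shell
# utf8_chr is a hypothetical helper name. It encodes a code point into
# UTF-8 bytes manually and prints them as \0nnn octal escapes, so it does
# not depend on \U support or on the current locale.
# Assumption: the argument is a valid code point in 0..0x10FFFF.
utf8_chr() {
    local cp=$(( $1 )) esc
    if (( cp < 0x80 )); then
        # 1 byte: 0xxxxxxx
        esc=$(printf '\\0%o' "$cp")
    elif (( cp < 0x800 )); then
        # 2 bytes: 110xxxxx 10xxxxxx
        esc=$(printf '\\0%o\\0%o' \
            $(( 0xC0 | (cp >> 6) )) $(( 0x80 | (cp & 0x3F) )))
    elif (( cp < 0x10000 )); then
        # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        esc=$(printf '\\0%o\\0%o\\0%o' \
            $(( 0xE0 | (cp >> 12) )) $(( 0x80 | ((cp >> 6) & 0x3F) )) \
            $(( 0x80 | (cp & 0x3F) )))
    else
        # 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        esc=$(printf '\\0%o\\0%o\\0%o\\0%o' \
            $(( 0xF0 | (cp >> 18) )) $(( 0x80 | ((cp >> 12) & 0x3F) )) \
            $(( 0x80 | ((cp >> 6) & 0x3F) )) $(( 0x80 | (cp & 0x3F) )))
    fi
    printf '%b' "$esc"
}

utf8_chr 0x1F42E | od -An -tx1
```

The od call should show the bytes f0 9f 90 ae, the UTF-8 representation quoted above.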
References
Fourth paragraph inside PARAMETERS in man bash:
If the variable has its integer attribute set, then value is evaluated as an arithmetic expression even if the $((...)) expansion is not used (see Arithmetic Expansion below).
Inside "ARITHMETIC EVALUATION" in man bash:
Constants with a leading 0 are interpreted as octal numbers. A leading 0x or 0X denotes hexadecimal. Otherwise, numbers take the form [base#]n, where the optional base is a decimal number between 2 and 64 representing the arithmetic base, and n is a number in that base. If base# is omitted, then base 10 is used. The digits greater than 9 are represented by the lowercase letters, the uppercase letters, @, and _, in that order. If base is less than or equal to 36, lowercase and uppercase letters may be used interchangeably to represent numbers between 10 and 35.
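To make the digit ordering in that passage concrete, here is a short illustration of base#n constants in bash arithmetic. It uses nothing beyond arithmetic expansion.

```shell
# For bases above 36 the digits are case-sensitive:
# a-z = 10..35, A-Z = 36..61, @ = 62, _ = 63.
echo "$(( 64#a )) $(( 64#A )) $(( 64#@ )) $(( 64#_ ))"

# For bases up to 36, upper and lower case are interchangeable.
echo "$(( 16#4a )) $(( 16#4A )) $(( 36#z )) $(( 36#Z ))"
```

The first line should print 10 36 62 63 and the second 74 74 35 35.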
With zsh:
$ printf '\x4a\n' # Hex
J
$ printf "\\$(([##8]74))\n" # Dec
J
To get a character (in the current charset) from the Unicode code point:
$ printf '\U1F42E\n' # Hex
🐮
$ printf "\\U$(([##16]128046))\n" # Dec
🐮
The # parameter expansion flag combines both.
If $var contains an arithmetic expression (such as 0x4a, 74, 0x60 + 10) whose evaluation results in a number n, then ${(#)var} expands to the character whose byte value is n if the locale uses a single-byte character set (such as ISO8859-15, KOI8-R...), or the character of Unicode code point n in multi-byte character locales (or if n > 255)¹.
$ x=0x4a d=74 u=0x1F42E
$ printf '%s\n' ${(#)x} ${(#)d} ${(#u)u}
J
J
🐮
$ (){printf '%s\n' ${(#)@}} $x $d $u
J
J
🐮
For completeness, to get from J or 🐮 back to 0x4a or 0x1F42E, there's the standard:
$ c1='J' c2='🐮'
$ printf '%#x\n' "'$c1" "'$c2"
0x4a
0x1f42e
And also:
$ echo $((#c1)) $((#c2)) $((##J)) $((##🐮))
74 128046 74 128046
Or:
$ echo $(([#16] #c1)) $(([#16] #c2)) $(([##16] #c1)) $(([##16] #c2))
16#4A 16#1F42E 4A 1F42E
$ set -o cbases
$ echo $(([#16] #c1)) $(([#16] #c2))
0x4A 0x1F42E
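The standard printf conversion shown earlier is not specific to zsh. As a sketch, it can be wrapped in a small function that also works in bash. The name ord is hypothetical, and for non-ASCII characters the result depends on the locale being a multi-byte one such as UTF-8.

```shell
# ord is a hypothetical helper name, not a shell builtin.
# A leading quote in a numeric printf argument makes printf use the
# code of the character that follows it.
ord() {
    printf '%d\n' "'$1"
}

ord J
printf '%#x\n' "'J"
```

The two commands should print 74 and 0x4a.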
¹ If the locale has no such character, zsh falls back to outputting the byte n & 0xff.