How to increment letters like numbers in PHP?
To increment or decrement within the 7-bit, 128-character ASCII range, the safest way is:
$CHAR = "l";
echo chr(ord($CHAR)+1)." ".chr(ord($CHAR)-1);
/* m k */
So it is normal to get a backtick when decrementing a, since the backtick comes right before a in the ASCII table.
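To make the neighbors in the ASCII table visible, here is a minimal sketch. The helper name `asciiNeighbors` is made up for illustration, and it assumes a single-byte character whose code lies between 1 and 126.

```php
<?php
// Hypothetical helper: returns the ASCII characters immediately
// before and after $char (assumes a single byte with code 1..126).
function asciiNeighbors(string $char): array
{
    $code = ord($char);
    return [chr($code - 1), chr($code + 1)];
}

[$before, $after] = asciiNeighbors('a');
echo $before . ' ' . $after . PHP_EOL; // ` b
```

The backtick has code 96 and a has code 97, which is why the decrement lands on it.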
Print the whole ASCII range:
for ($i = 0; $i < 128; $i++) {
    echo chr($i);
}
/* !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~ */
More information about ANSI 7-bit ASCII: man ascii
To increment or decrement in the 8-bit extended ASCII range (256 characters):
This is where the result starts to depend on the host machine's charset, although these charsets are all available on modern machines. From PHP, the safest option is to use the php-mbstring extension: https://www.php.net/manual/en/function.mb-chr.php
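A minimal sketch of how the mbstring functions could be used here, assuming the strings are UTF-8 and the mbstring extension is loaded. The helper name `shiftChar` is hypothetical; `mb_ord()` and `mb_chr()` are available from PHP 7.2.

```php
<?php
// Hypothetical helper: shifts a single UTF-8 character by $offset code points.
// Assumes this file is saved as UTF-8 and mbstring is enabled.
function shiftChar(string $char, int $offset = 1): string
{
    $codePoint = mb_ord($char, 'UTF-8');
    if ($codePoint === false) {
        throw new InvalidArgumentException('Not a valid UTF-8 character.');
    }
    $shifted = mb_chr($codePoint + $offset, 'UTF-8');
    if ($shifted === false) {
        throw new OutOfRangeException('Resulting code point is not valid.');
    }
    return $shifted;
}

echo shiftChar('é') . PHP_EOL;     // ê (U+00E9 + 1 = U+00EA)
echo shiftChar('é', -1) . PHP_EOL; // è (U+00E8)
```

Unlike `ord()` and `chr()`, which work on single bytes, these functions work on whole code points, so multi-byte characters are not split.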
Extended ASCII (EASCII or high ASCII) character encodings are eight-bit or larger encodings that include the standard seven-bit ASCII characters, plus additional characters. https://en.wikipedia.org/wiki/Extended_ASCII
More information, for example: man iso_8859-9
ISO 8859-1 West European languages (Latin-1)
ISO 8859-2 Central and East European languages (Latin-2)
ISO 8859-3 Southeast European and miscellaneous languages (Latin-3)
ISO 8859-4 Scandinavian/Baltic languages (Latin-4)
ISO 8859-5 Latin/Cyrillic
ISO 8859-6 Latin/Arabic
ISO 8859-7 Latin/Greek
ISO 8859-8 Latin/Hebrew
ISO 8859-9 Latin-1 modification for Turkish (Latin-5)
ISO 8859-10 Lappish/Nordic/Eskimo languages (Latin-6)
ISO 8859-11 Latin/Thai
ISO 8859-13 Baltic Rim languages (Latin-7)
ISO 8859-14 Celtic (Latin-8)
ISO 8859-15 West European languages (Latin-9)
ISO 8859-16 Romanian (Latin-10)
For example, the € symbol can be found in ISO 8859-7 (columns: octal, decimal, hex, character, description):
244 164 A4 € EURO SIGN
To increment or decrement in the 16-bit UTF-16 Unicode range:
Here is a way to generate a large part of the Unicode charset by generating HTML entities and converting them to UTF-8:
for ($x = 0; $x < 262144; $x++) {
    echo html_entity_decode("&#".$x.";",ENT_NOQUOTES,"UTF-8");
}
Same idea as before, but the range goes up to 16^4 * 4 = 262144!
echo html_entity_decode('&#33;',ENT_NOQUOTES,'UTF-8');
/* ! */
echo html_entity_decode('&#34;',ENT_NOQUOTES,'UTF-8');
/* " */
To retrieve the Unicode € symbol using the base-10 (decimal) representation of the character:
echo html_entity_decode('&#8364;',ENT_NOQUOTES,'UTF-8');
/* € */
The same symbol, using the base16 hexadecimal representation:
echo html_entity_decode('&#'.hexdec("20AC").';',ENT_NOQUOTES,'UTF-8');
/* € */
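For comparison, here is a short sketch of two other ways to get the same character, neither of which is used in the answer above: PHP's `"\u{...}"` string escape (PHP 7.0+) and `mb_chr()` (PHP 7.2+, mbstring extension assumed to be loaded). The variable names are only illustrative.

```php
<?php
// Three ways to obtain the euro sign (U+20AC, decimal 8364) as a UTF-8 string.
$fromEntity = html_entity_decode('&#8364;', ENT_NOQUOTES, 'UTF-8');
$fromEscape = "\u{20AC}";              // Unicode escape in a double-quoted string
$fromMbChr  = mb_chr(0x20AC, 'UTF-8'); // code point to character via mbstring

var_dump($fromEntity === $fromEscape, $fromEscape === $fromMbChr);
echo bin2hex($fromEntity) . PHP_EOL; // e282ac, the UTF-8 bytes of U+20AC
```

With `mb_chr()` the code point stays an integer, so incrementing or decrementing it is plain arithmetic before the conversion.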
The first 32 code points (0 to 31) are reserved for special control characters. They print as garbage, but they do have a meaning.
Character/string increment works in PHP (though decrement doesn't)
$x = 'AAZ';
$x++;
echo $x; // 'ABA'
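Since `--` does not step letters backwards, a decrement for a single letter can be sketched with `ord()` and `chr()`. The helper name `previousLetter` is hypothetical, and the sketch assumes a single ASCII letter; it refuses to go below a or A rather than guessing what should come before them.

```php
<?php
// Hypothetical helper: returns the letter before $letter.
// Assumes a single ASCII letter; refuses to step below 'a' or 'A'.
function previousLetter(string $letter): string
{
    if (preg_match('/\A[A-Za-z]\z/', $letter) !== 1) {
        throw new InvalidArgumentException('Expected a single ASCII letter.');
    }
    if ($letter === 'a' || $letter === 'A') {
        throw new OutOfRangeException("No letter precedes '$letter'.");
    }
    return chr(ord($letter) - 1);
}

echo previousLetter('C') . PHP_EOL; // B
echo previousLetter('z') . PHP_EOL; // y
```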
You can do it with the ++ operator.
$i = 'aaz';
$i++;
print $i;
aba
However, this implementation has some strange behavior:
for($i = 'a'; $i < 'z'; $i++) print "$i ";
This will print out the letters from a to y.
for($i = 'a'; $i <= 'z'; $i++) print "$i ";
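This will print out the letters from a to z, then continue with aa and end with yz. The comparison is a string comparison, so aa <= z is still true, and the loop only stops once $i reaches za.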
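If the goal is simply to walk from a to z, one alternative (not part of the answer above) is `range()`, which accepts single-character bounds. A minimal sketch:

```php
<?php
// range() returns every letter between the two bounds, both ends included,
// so neither a string comparison nor ++ is involved.
$letters = range('a', 'z');

foreach ($letters as $letter) {
    print "$letter ";
}
print PHP_EOL;

echo count($letters) . PHP_EOL; // 26
```

This stops at z because the list is built up front, instead of relying on how `<=` compares strings.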
As proposed in PHP RFC: Strict operators directive (currently Under Discussion):
Using the increment function on a string will throw a TypeError when strict_operators is enabled.
Whether or not the RFC gets merged, PHP will sooner or later move in the direction of stricter operators. Therefore, you should not be incrementing strings.
a-z/A-Z ranges
If you know your letters will stay in the range a-z/A-Z (never going past z/Z), you can convert the letter to its ASCII code, increment it, and convert it back to a letter.
Use ord() and chr():
$letter = 'A';
$letterAscii = ord($letter);
$letterAscii++;
$letter = chr($letterAscii); // 'B'
- ord() converts the letter into its ASCII numeric representation
- that numeric representation is incremented
- using chr(), the number gets converted back to the letter
As discovered in the comments, be careful: this iterates over the ASCII table, so from Z (ASCII 90) it does not go to AA, but to [ (ASCII 91).
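If wrapping around is acceptable, so that z is followed by a instead of [, the step can be done modulo 26. This is only a sketch with a made-up helper name, `nextLetterWrapping`, and it deliberately behaves differently from `++`, which would turn z into aa.

```php
<?php
// Hypothetical helper: next letter with wrap-around ('z' -> 'a', 'Z' -> 'A').
// Assumes a single ASCII letter; anything else is rejected.
function nextLetterWrapping(string $letter): string
{
    if (preg_match('/\A[A-Za-z]\z/', $letter) !== 1) {
        throw new InvalidArgumentException('Expected a single ASCII letter.');
    }
    // Pick the start of the matching alphabet: 65 for upper case, 97 for lower case.
    $base = ord($letter) <= ord('Z') ? ord('A') : ord('a');
    // Position in the alphabet (0..25), advanced by one and wrapped at 26.
    return chr($base + (ord($letter) - $base + 1) % 26);
}

echo nextLetterWrapping('A') . PHP_EOL; // B
echo nextLetterWrapping('z') . PHP_EOL; // a
echo nextLetterWrapping('Z') . PHP_EOL; // A
```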
Going beyond z/Z
If you dare to go further and want z to become aa, this is what I came up with:
final class NextLetter
{
    // Each pair is [code of the first letter, code just past the last letter].
    private const ASCII_UPPER_CASE_BOUNDARIES = [65, 91];
    private const ASCII_LOWER_CASE_BOUNDARIES = [97, 123];

    public static function get(string $previous) : string
    {
        $letters = str_split($previous);
        $output = '';
        $increase = true; // carry flag: the current letter still has to be incremented

        // Walk the string from its last letter to its first.
        while (! empty($letters)) {
            $letter = array_pop($letters);
            if ($increase) {
                $letterAscii = ord($letter);
                $letterAscii++;
                if ($letterAscii === self::ASCII_UPPER_CASE_BOUNDARIES[1]) {
                    // Went past 'Z': wrap to 'A' and carry to the next letter.
                    $letterAscii = self::ASCII_UPPER_CASE_BOUNDARIES[0];
                    $increase = true;
                } elseif ($letterAscii === self::ASCII_LOWER_CASE_BOUNDARIES[1]) {
                    // Went past 'z': wrap to 'a' and carry to the next letter.
                    $letterAscii = self::ASCII_LOWER_CASE_BOUNDARIES[0];
                    $increase = true;
                } else {
                    $increase = false;
                }
                $letter = chr($letterAscii);
                // Carry out of the first letter: the string grows by one letter.
                if ($increase && empty($letters)) {
                    $letter .= $letter;
                }
            }
            $output = $letter . $output;
        }
        return $output;
    }
}
I'm also giving you tests with 100% coverage in case you intend to work with it further. They check the result against PHP's original string increment (++):
/**
 * @dataProvider letterProvider
 */
public function testIncrementLetter(string $givenLetter) : void
{
    $expectedValue = $givenLetter;
    self::assertSame(++$expectedValue, NextLetter::get($givenLetter));
}

/**
 * @return iterable<array<string>>
 */
public function letterProvider() : iterable
{
    yield ['A'];
    yield ['a'];
    yield ['z'];
    yield ['Z'];
    yield ['aaz'];
    yield ['aaZ'];
    yield ['abz'];
    yield ['abZ'];
}