How can I reverse a string that contains combining characters in Perl?

You can use the \X special escape (match a non-combining character and all of the following combining characters) with split to make a list of graphemes (with empty strings between them), reverse the list of graphemes, then join them back together:

#!/usr/bin/perl

use strict;
use warnings;

my $original = "re\x{0301}sume\x{0301}";
my $wrong    = reverse $original;
my $right    = join '', reverse split /(\X)/, $original;
print "original: $original\n",
      "wrong:    $wrong\n",
      "right:    $right\n";

The best answer is to use Unicode::GCString, as Sinan points out


I modified Chas's example a bit:

  • Set the encoding on STDOUT to avoid "wide character in print" warnings;
  • Use a positive lookahead assertion (and no separator retention mode) in split (doesn't work after 5.10, apparently, so I removed it)

It's basically the same thing with a couple of tweaks.

use strict;
use warnings;

binmode STDOUT, ":utf8";

my $original = "re\x{0301}sume\x{0301}";
my $wrong    = reverse $original;
my $right    = join '', reverse split /(\X)/, $original;

print <<HERE;
original: [$original]
   wrong: [$wrong]
   right: [$right]
HERE