regex unicode character in vim
This solution might not address the problem as originally stated, but it does solve a closely related one, so I think it makes sense to place it here.
I don't know in which version of Vim it was implemented, but I was working on 7.4 when I tried it.
When in Insert mode, the key sequence to enter a Unicode character is ctrl-v u xxxx, where xxxx is the hexadecimal code point. For instance, entering the euro sign would be ctrl-v u 20ac.
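If you don't know the code point to type after ctrl-v u, you can look it up outside Vim. Here is a minimal Python sketch; the helper name code_point_hex is made up for illustration and is not part of Vim. It pads to four hex digits, which is what the lowercase u form expects. Characters above U+FFFF will not fit in four digits.

```python
def code_point_hex(ch: str) -> str:
    """Return the code point of one character as lowercase hex, zero-padded to at least 4 digits."""
    if len(ch) != 1:
        raise ValueError("expected exactly one character")
    return format(ord(ch), "04x")


if __name__ == "__main__":
    # The digits to type after ctrl-v u for the euro sign.
    print(code_point_hex("€"))  # 20ac
```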
I tried it in Command-line mode as well and it worked. That is, to replace all instances of "20 euro" in my document with "20 €", I'd do:
:%s/20 euro/20 <ctrl-v u 20ac>/gc
In the above, <ctrl-v u 20ac> is not literal; it's the sequence of keys that will output the € character.
From :help regexp (lightly edited): to match Unicode characters with a regular expression in Vim, you need to use some specific syntax:
\%u match specified multibyte character (eg \%u20ac)
That is, to search for the unicode character with hex code 20AC, enter this into your search pattern:
\%u20ac
The full table of character search patterns includes some additional options:
\%d match specified decimal character (eg \%d123)
\%x match specified hex character (eg \%x2a)
\%o match specified octal character (eg \%o040)
\%u match specified multibyte character (eg \%u20ac)
\%U match specified large multibyte character (eg \%U12345678)
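To see how the forms in this table relate for a given character, here is a rough Python sketch that builds the applicable patterns. The function name vim_char_patterns and its dictionary keys are hypothetical, chosen only for this example. It assumes that the \%x and \%o forms only cover code points up to 255 and that \%u takes at most four hex digits, which is my reading of the Vim help; check :help /\%d on your version.

```python
def vim_char_patterns(ch: str) -> dict:
    """Build Vim search atoms that match the single character ch.

    Assumption: \\%o and \\%x are limited to code points <= 0xFF,
    and \\%u to code points <= 0xFFFF (as described in Vim's help).
    """
    if len(ch) != 1:
        raise ValueError("expected exactly one character")
    cp = ord(ch)
    patterns = {"decimal": rf"\%d{cp}"}
    if cp <= 0xFF:
        patterns["octal"] = rf"\%o{cp:o}"
        patterns["hex"] = rf"\%x{cp:02x}"
    if cp <= 0xFFFF:
        patterns["u"] = rf"\%u{cp:04x}"
    # The eight-digit form can represent any code point.
    patterns["U"] = rf"\%U{cp:08x}"
    return patterns


if __name__ == "__main__":
    for name, pattern in vim_char_patterns("€").items():
        print(name, pattern)
```

For the euro sign this yields the decimal, \%u, and \%U forms only, since its code point is too large for \%x and \%o under the assumption above.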