regex unicode character in vim

This may not answer the question exactly as asked, but it solves a closely related problem, so I think it belongs here.

I don't know which version of Vim introduced this, but it worked for me on 7.4.

In Insert mode, the key sequence to enter a Unicode character is: ctrl-v u xxxx, where xxxx is the code point as four hex digits. For instance, to enter the euro sign you would type ctrl-v u 20ac.

I tried it in Command-line mode (while typing a : command) as well, and it worked. That is, to replace all instances of "20 euro" in my document with "20 €", I'd do:

:%s/20 euro/20 <ctrl-v u 20ac>/gc

In the above, <ctrl-v u 20ac> is not literal text; it is the sequence of keys that inserts the character.
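
If typing the key sequence is inconvenient (in a script, for example), the character can also be built from its code point. This is only a sketch, not something I tested on every version, and it assumes 'encoding' is set to utf-8:

```vim
" Same substitution, without typing the euro sign by hand.
" Assumes 'encoding' is utf-8, so nr2char(0x20ac) returns the euro sign.
:%s/20 euro/\='20 ' . nr2char(0x20ac)/gc

" Alternative: a double-quoted string expands \u20ac before :s runs.
:execute "%s/20 euro/20 \u20ac/gc"
```

The first form uses a \= expression as the replacement; the second relies on \uXXXX being expanded inside double-quoted strings.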


From :help regexp (lightly edited): to match a Unicode character by its code point in a Vim regular expression, use this syntax:

\%u match specified multibyte character (eg \%u20ac)

That is, to search for the Unicode character with hex code point 20AC, enter this as your search pattern:

\%u20ac
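
The same atom works in any command that takes a pattern. A minimal sketch, assuming a buffer that contains euro signs (if it contains none, Vim reports that the pattern was not found):

```vim
" Count occurrences of U+20AC without changing the buffer (n flag).
:%s/\%u20ac//gn

" Replace each euro sign with the text "EUR", confirming each match.
:%s/\%u20ac/EUR/gc
```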

The full table of character search patterns includes some additional options:

\%d match specified decimal character (eg \%d123)
\%x match specified hex character (eg \%x2a)
\%o match specified octal character (eg \%o040)
\%u match specified multibyte character (eg \%u20ac)
\%U match specified large multibyte character (eg \%U12345678)
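
As an illustration of how these forms relate, here are a few hedged examples. U+20AC is 8364 in decimal; \%x and \%o only reach small values (two hex digits, octal up to 377), so they cannot express it. Pressing ga in Normal mode on a character shows its decimal, hex and octal value, which helps when picking a form.

```vim
" Three spellings of the same character, U+20AC (decimal 8364).
" Each command only counts matches; the n flag leaves the buffer unchanged.
:%s/\%d8364//gn
:%s/\%u20ac//gn
:%s/\%U000020ac//gn

" Code points above U+FFFF need \%U, for example U+1F600.
:%s/\%U1f600//gn

" Inside a [] collection the % is dropped (\d, \o, \x, \u, \U).
" This counts every character outside the ASCII range.
:%s/[^\x00-\x7f]//gn
```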

Tags: Vim, Unicode, Regex