Romanization of Unicode text

You can use Unidecode Sharp, a C# port of the Python Unidecode package, which is itself a port of the Perl module Text::Unidecode. PHP and Ruby implementations are also available.


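using System;
using BinaryAnalysis.UnidecodeSharp;

string _Greek = "Αλφαβητικός";
string _Japan = "しんばし";
string _Russian = "яйца Фаберже";

// Unidecode() is the library's extension method on string; it returns an ASCII approximation.
Console.WriteLine(_Greek.Unidecode());
Console.WriteLine(_Japan.Unidecode());
Console.WriteLine(_Russian.Unidecode());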

I hope this helps.

The problem is a lot more complex than you think.

Greek, Cyrillic, Indic scripts, Georgian -> trivial, you could program that in an hour
Thai, Japanese Kana -> doable with a bit more effort
Japanese Kanji, Chinese -> these are not alphabets or syllabaries, so you're not in fact transliterating; you're looking up the pronunciation of each character in a hopefully large dictionary (EDICT and CCDICT should work). You'll often get it wrong unless you also consider the context, especially in Japanese
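Korean -> technically an alphabet, but text is stored as precomposed syllable blocks. Each block can be split back into its letters arithmetically (the Unicode Standard defines the decomposition), so no large database is needed for the letters themselves; a pronunciation-accurate romanization still has to apply Korean sound-change rules on top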
Arabic, Hebrew -> these languages don't write down short vowels, so a lot of times your transliteration will be something unreadable like "bytlhm" (Bethlehem). I'm not aware of any large databases that map Arabic or Hebrew words to their pronunciation.
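
One note on the Korean entry: precomposed Hangul syllables (U+AC00 to U+D7A3) can be taken apart with plain integer arithmetic, so a rough letter-by-letter romanization needs only three small tables. Below is a minimal sketch of that idea, not a finished romanizer. The class name HangulSketch and the method Romanize are made up for illustration, and the letter values are the jamo short names from the Unicode Standard, lowercased, which is an assumption of convenience rather than an official romanization scheme. It ignores Korean sound-change rules, so the output is an approximation.

```csharp
using System;
using System.Text;

// Hypothetical helper for illustration; not part of Unidecode Sharp.
public static class HangulSketch
{
    private const int SBase = 0xAC00;            // first precomposed syllable
    private const int VCount = 21;               // number of vowels
    private const int TCount = 28;               // number of trailing consonants, including "none"
    private const int NCount = VCount * TCount;  // 588 syllables per leading consonant
    private const int SCount = 19 * NCount;      // 11172 syllables in total

    // Jamo short names from the Unicode Standard, lowercased (assumed mapping, not a standard romanization).
    private static readonly string[] Lead =
    {
        "g", "gg", "n", "d", "dd", "r", "m", "b", "bb", "s",
        "ss", "", "j", "jj", "c", "k", "t", "p", "h"
    };

    private static readonly string[] Vowel =
    {
        "a", "ae", "ya", "yae", "eo", "e", "yeo", "ye", "o", "wa", "wae",
        "oe", "yo", "u", "weo", "we", "wi", "yu", "eu", "yi", "i"
    };

    private static readonly string[] Tail =
    {
        "", "g", "gg", "gs", "n", "nj", "nh", "d", "l", "lg", "lm", "lb", "ls", "lt",
        "lp", "lh", "m", "b", "bs", "s", "ss", "ng", "j", "c", "k", "t", "p", "h"
    };

    public static string Romanize(string text)
    {
        var sb = new StringBuilder();
        foreach (char ch in text)
        {
            int sIndex = ch - SBase;
            if (sIndex < 0 || sIndex >= SCount)
            {
                sb.Append(ch);                            // not a precomposed syllable: pass through unchanged
                continue;
            }
            sb.Append(Lead[sIndex / NCount]);             // leading consonant
            sb.Append(Vowel[(sIndex % NCount) / TCount]); // vowel
            sb.Append(Tail[sIndex % TCount]);             // trailing consonant, possibly empty
        }
        return sb.ToString();
    }
}
```

With these tables, HangulSketch.Romanize("한글") returns "hangeul" and HangulSketch.Romanize("서울") returns "seoul". Text that is already decomposed into separate jamo passes through unchanged, so it would need to be normalized to the composed form first.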