How I can print the wchar_t values to console?
Can I suggest std::wcout
?
So, something like this:
std::cout << "ASCII and ANSI" << std::endl;
std::wcout << L"INSERT MULTIBYTE WCHAR* HERE" << std::endl;
You might find more information in a related question here.
Windows has the very confusing information. You should learn C/C++ concept from Unix/Linux before programming in Windows.
wchar_t stores character in UTF-16 which is a fixed 16-bit memory size called wide character but wprintf() or wcout() will never print non-english wide characters correctly because no console will output in UTF-16. Windows will output in current locale while unix/linux will output in UTF-8, all are multi-byte. So you have to convert wide characters to multi-byte before printing. The unix command wcstombs() doesn't work on Windows, use WideCharToMultiByte() instead.
First you need to convert file to UTF-8 using notepad or other editor. Then install font in command prompt console so that it can read/write in your language and change code page in console to UTF-8 to display correctly by typing in the command prompt "chcp 65001" while cygwin is already default to UTF-8. Here is what I did in Thai.
#include <windows.h>
#include <stdio.h>
int main()
{
wchar_t* in=L"ทดสอบ"; // thai language
char* out=(char *)malloc(15);
WideCharToMultiByte(874, 0, in, 15, out, 15, NULL, NULL);
printf(out); // result is correctly in Thai although not neat
}
Note that 874=(Thai) code page in the operating system, 15=size of string
My suggestion is to avoid printing non-english wide characters to console unless necessary because it is not easy.
You cannot portably print wide strings using standard C++ facilities.
Instead you can use the open-source {fmt} library to portably print Unicode text. For example (https://godbolt.org/z/nccb6j):
#include <fmt/core.h>
int main() {
const char en[] = "Hello";
const char ru[] = "Привет";
fmt::print("{}\n{}\n", ru, en);
}
prints
Привет
Hello
This requires compiling with the /utf-8
compiler option in MSVC.
For comparison, writing to wcout
on Linux:
wchar_t en[] = L"Hello";
wchar_t ru[] = L"Привет";
std::wcout << ru << std::endl << en;
may transliterate the Russian text into Latin (https://godbolt.org/z/za5zP8):
Privet
Hello
This particular issue can be fixed by switching to a locale that uses UTF-8 but a similar problem exists on Windows that cannot be fixed just with standard facilities.
Disclaimer: I'm the author of {fmt}.
Edit: This doesn’t work if you are trying to write text that cannot be represented in your default locale. :-(
Use std::wcout
instead of std::cout
.
wcout << ru << endl << en;