MarshalAs(UnmanagedType.LPStr) - how does this convert utf-8 strings to char*
[MarshalAs(UnmanagedType.LPStr)] - how does this convert utf-8 strings to char* ?
It doesn't. There is no such thing as a "utf-8 string" in managed code, strings are always encoded in utf-16. The marshaling from and to an LPStr is done with the default system code page. Which makes it fairly remarkable that you see Korean glyphs in the debugger, unless you use code page 949.
If interop with utf-8 is a hard requirement then you need to use a byte[] in the pinvoke declaration. And convert back and forth yourself with System.Text.Encoding.UTF8. Use its GetString() method to convert the byte[] to a string, its GetBytes() method to convert a string to byte[]. Avoid all this if possible by using wchar_t[] in the native code.
While the other answers are correct, there has been a major development in .NET 4.7. Now there is an option that does exactly what UTF-8 needs: UnmanagedType.LPUTF8Str
. I tried it and it works like a Swiss chronometre, doing exactly what it sounds like.
In fact, I even used MarshalAs(UnmanagedType.LPUTF8Str)
in one parameter and MarshalAs(UnmanagedType.LPStr)
in another. Also works. Here is my method (takes in string parameters and returns a string via a parameter):
[DllImport("mylib.dll", ExactSpelling = true, CallingConvention = CallingConvention.StdCall)]
public static extern void ProcessContent([MarshalAs(UnmanagedType.LPUTF8Str)]string content,
[MarshalAs(UnmanagedType.LPUTF8Str), Out]StringBuilder outputBuffer,[MarshalAs(UnmanagedType.LPStr)]string settings);
Thanks, Microsoft! Another nuisance is gone.
If you need to marshal UTF-8 string
do it manually.
Define function with IntPtr
instead of string:
somefunction(IntPtr text)
Then convert text to zero-terminated UTF8 array of bytes and write them to IntPtr
:
byte[] retArray = Encoding.UTF8.GetBytes(text);
byte[] retArrayZ = new byte[retArray.Length + 1];
Array.Copy(retArray, retArrayZ, retArray.Length);
IntPtr retPtr = AllocHGlobal(retArrayZ.Length);
Marshal.Copy(retArrayZ, 0, retPtr, retArrayZ.Length);
somefunction(retPtr);