Convert NSData bytes to NSString?
How about
NSString *content = [[[NSString alloc] initWithData:myData
encoding:NSUTF8StringEncoding] autorelease];
That's an important point that I think should be re-emphasized. It turns out that
NSString *content = [NSString stringWithUTF8String:[responseData bytes]];
is not the same as,
NSString *content = [[NSString alloc] initWithBytes:[responseData bytes]
length:[responseData length] encoding: NSUTF8StringEncoding];
The first expects a NUL-terminated byte string; the second doesn't. If the byte string isn't correctly terminated, the first example reads past the end of the data, so content may come back nil or with trailing garbage.
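The same split between terminator-based and length-based reads exists in plain C. Here is a minimal sketch (plain C, which also compiles inside an Objective-C file) of a length-based read. format_counted_bytes is a hypothetical helper name, not part of Foundation. It relies on the standard printf rule that a precision on %s caps how many bytes are read, so the input needs no terminator.

```c
#include <stdio.h>
#include <string.h>

/* Formats `length` bytes from `bytes` into `out` as a C string.
   The "%.*s" precision caps how many bytes are read, so the
   input does not need a terminating 0 byte.
   Returns what snprintf returns: the number of characters
   that the full result needs, excluding the terminator. */
int format_counted_bytes(char *out, size_t out_size,
                         const unsigned char *bytes, size_t length)
{
    return snprintf(out, out_size, "%.*s", (int)length, (const char *)bytes);
}
```

The output buffer is always terminated by snprintf (as long as out_size is at least 1); only the input is allowed to be unterminated.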
NSData *torrent = [BEncoding objectFromEncodedData:rawdata];
When I NSLog torrent I get the following:
{ ⋮ }
That would be an NSDictionary, then, not an NSData.
unsigned char aBuffer[[name length]];
[name getBytes:aBuffer length:[name length]];
NSLog(@"File name: %s", aBuffer);
..which retrieves the data, but seems to have additional unicode rubbish after it:
File name: ubuntu-8.10-desktop-i386.iso)
No, it retrieved the filename just fine; you simply printed it incorrectly. %s takes a C string, which is null-terminated; the bytes of a data object are not null-terminated (they are just bytes, not necessarily characters in any encoding, and 0, which is null as a character, is a perfectly valid byte). You would have to allocate one more character, and set the last one in the array to 0:
size_t length = [name length];
unsigned char aBuffer[length + 1]; // one extra byte for the terminator
[name getBytes:aBuffer length:length]; // copy only the bytes the data actually holds
aBuffer[length] = 0;
NSLog(@"File name: %s", aBuffer);
But null-terminating the data in an NSData object is wrong (except when you really do need a C string). I'll get to the right way in a moment.
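For the cases where a C string really is needed, the "one extra byte, then terminate" step can be written as a small plain C helper. This is only a sketch: copy_as_c_string is a hypothetical name, and the heap allocation (rather than a stack array) is an assumption made so the result can outlive the calling scope.

```c
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated, 0-terminated copy of `length` raw bytes,
   or NULL if allocation fails. The caller must free() the result. */
char *copy_as_c_string(const void *bytes, size_t length)
{
    char *copy = malloc(length + 1);   /* one extra byte for the terminator */
    if (copy == NULL) {
        return NULL;
    }
    memcpy(copy, bytes, length);       /* copy exactly `length` bytes, no more */
    copy[length] = '\0';               /* terminate after the last real byte */
    return copy;
}
```

With an NSData, the arguments would presumably be [name bytes] and [name length]. Note that a 0 byte inside the data still cuts the C string short.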
I have also tried […]..
NSString *secondtry = [NSString stringWithCharacters:[name bytes] length:[name length] / sizeof(unichar)];
..but this seems to return random Chinese characters:
扵湵畴㠭ㄮⴰ敤歳潴⵰㍩㘸椮潳
That's because your bytes are UTF-8, which encodes one character in (usually) one byte. unichar is, and stringWithCharacters:length: accepts, UTF-16. In that encoding, one character is (usually) two bytes. (Hence the division by sizeof(unichar): it divides the number of bytes by 2 to get the number of characters.)
So you said “here's some UTF-16 data”, and it went and made characters from every two bytes; each pair of bytes was supposed to be two characters, not one, so you got garbage (which turned out to be mostly CJK ideographs).
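To see where the ideographs come from, here is a rough plain C sketch of that misreading. It assumes a little-endian machine (low byte first), and the function names are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Combines two consecutive bytes into one 16-bit code unit, low byte
   first, which is how a little-endian machine reads a 16-bit value. */
uint16_t utf16le_unit(unsigned char low, unsigned char high)
{
    return (uint16_t)(low | (high << 8));
}

/* Reinterprets `length` bytes as 16-bit units and stores them in `out`,
   which must have room for length / 2 units.
   Returns the unit count; a trailing odd byte is ignored. */
size_t bytes_as_utf16le(const unsigned char *bytes, size_t length, uint16_t *out)
{
    size_t count = length / 2;
    for (size_t i = 0; i < count; i++) {
        out[i] = utf16le_unit(bytes[2 * i], bytes[2 * i + 1]);
    }
    return count;
}
```

Feeding it the six bytes of "ubuntu" gives the units 0x6275, 0x6E75 and 0x7574. All three fall in the CJK Unified Ideographs block, and they appear to correspond to the first three characters of the garbage above.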
You answered your own question pretty well, except that stringWithUTF8String: is simpler than stringWithCString:encoding: for UTF-8-encoded strings.
However, when you have the length (as you do when you have an NSData), it is even easier, and more proper, to use initWithBytes:length:encoding:. It's easier because it does not require null-terminated data; it simply uses the length you already have. (Don't forget to release or autorelease it.)