faster strlen?

Actually, glibc's implementation of strlen is an interesting example of the vectorization approach. It is peculiar in that it doesn't use vector instructions, but instead finds a way to use only ordinary instructions on 32- or 64-bit words read from the buffer.
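
The core of that technique is a bit pattern that tells you whether any byte in a word is zero. Below is a minimal sketch of the word-at-a-time idea, assuming 64-bit words; it mirrors the technique rather than glibc's actual code (the names are mine), and it glosses over aliasing and endianness details that a real implementation has to care about.

#include <stddef.h>
#include <stdint.h>

/* Nonzero iff the 64-bit word w contains at least one zero byte. */
#define HAS_ZERO_BYTE(w) \
        (((w) - 0x0101010101010101ULL) & ~(w) & 0x8080808080808080ULL)

size_t
strlen_wordwise(const char *str)
{
        const char *s = str;
        const uint64_t *w;

        /* Step byte by byte until s is word aligned, so the aligned
           word loads below never touch an unmapped page. */
        while ((uintptr_t)s % sizeof(uint64_t) != 0) {
                if (*s == '\0')
                        return (s - str);
                ++s;
        }

        /* Scan eight bytes per iteration until a word contains '\0'. */
        for (w = (const uint64_t *)s; !HAS_ZERO_BYTE(*w); ++w)
                ;

        /* A '\0' is somewhere in this word; locate it exactly. */
        for (s = (const char *)w; *s; ++s)
                ;
        return (s - str);
}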


Obviously, if your string has a known minimum length, you can begin your search at that position.
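
For example (a trivial sketch; strlen_with_minimum and MIN_LEN are hypothetical names, and the caller must guarantee the first MIN_LEN bytes really are non-'\0'):

#include <stddef.h>
#include <string.h>

#define MIN_LEN 200     /* known lower bound on the string's length */

size_t
strlen_with_minimum(const char *s)
{
        /* Skip the bytes known to be non-'\0' and scan only the tail. */
        return MIN_LEN + strlen(s + MIN_LEN);
}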

Beyond that, there's not really anything you can do; if you try to do something clever and find a '\0' byte, you still need to check every byte between the start of the string and that point to make sure there was no earlier '\0'.

That's not to say that strlen can't be optimized. It can be pipelined, and it can be made to process word-size or vector chunks with each comparison. On most architectures, some combination of these and other approaches will yield a substantial constant-factor speedup over a naive byte-comparison loop. Of course, on most mature platforms, the system strlen is already implemented using these techniques.
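
As a concrete illustration of the vector-chunk idea, here is a hedged sketch using x86 SSE2 intrinsics; strlen_sse2 and its structure are my own, not any particular library's code, and it relies on the GCC/Clang builtin __builtin_ctz. It aligns the pointer first so the 16-byte loads never cross a page boundary.

#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stddef.h>
#include <stdint.h>

size_t
strlen_sse2(const char *str)
{
        const char *s = str;
        const __m128i zero = _mm_setzero_si128();

        /* Handle leading bytes until s is 16-byte aligned; aligned
           16-byte loads stay within one page, so reading a little
           past the terminator is safe. */
        while (((uintptr_t)s & 15) != 0) {
                if (*s == '\0')
                        return (s - str);
                ++s;
        }

        for (;;) {
                __m128i chunk = _mm_load_si128((const __m128i *)s);
                __m128i eq = _mm_cmpeq_epi8(chunk, zero); /* 0xFF where byte == 0 */
                int mask = _mm_movemask_epi8(eq);         /* one bit per byte */

                if (mask != 0) /* lowest set bit marks the first '\0' */
                        return (size_t)(s - str) + (size_t)__builtin_ctz(mask);
                s += 16;
        }
}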


Jack,

strlen works by looking for the terminating '\0'. Here's an implementation taken from OpenBSD:

size_t
strlen(const char *str)
{
        const char *s;

        for (s = str; *s; ++s)
                ;
        return (s - str);
}

Now, consider that you know the length is about 200 characters, as you said. Say you start at index 200 and scan up and down for a '\0'. You find one at 204; what does that mean? That the string is 204 characters long? No! The string could end before that with an earlier '\0', and all you did was read past its real end.


Sure. Keep track of the length while you're writing to the string.
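
For instance, if the string is assembled with formatted writes, the lengths are already reported to you. A minimal sketch (the buffer size and field contents are made up, and truncation handling is omitted for brevity):

#include <stdio.h>
#include <stddef.h>

int
main(void)
{
        char buf[256];
        size_t len = 0;

        /* snprintf returns the number of characters written (when the
           output fits), so the running length is known without ever
           calling strlen. */
        len += (size_t)snprintf(buf + len, sizeof(buf) - len, "id=%d", 42);
        len += (size_t)snprintf(buf + len, sizeof(buf) - len, " name=%s", "jack");

        printf("%s (length %zu)\n", buf, len);
        return 0;
}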