Fastest way to strip all non-printable characters from a Java String

If it is reasonable to embed this method in a class which is not shared across threads, then you can reuse the buffer:

char [] oldChars = new char[5];

String stripControlChars(String s)
{
    final int inputLen = s.length();
    if ( oldChars.length < inputLen )
    {
        oldChars = new char[inputLen];
    }
    s.getChars(0, inputLen, oldChars, 0);

etc...

This is a big win - 20% or so, as I understand the current best case.

If this is to be used on potentially large strings and the memory "leak" is a concern, a weak reference can be used.


using 1 char array could work a bit better

int length = s.length();
char[] oldChars = new char[length];
s.getChars(0, length, oldChars, 0);
int newLen = 0;
for (int j = 0; j < length; j++) {
    char ch = oldChars[j];
    if (ch >= ' ') {
        oldChars[newLen] = ch;
        newLen++;
    }
}
s = new String(oldChars, 0, newLen);

and I avoided repeated calls to s.length();

another micro-optimization that might work is

int length = s.length();
char[] oldChars = new char[length+1];
s.getChars(0, length, oldChars, 0);
oldChars[length]='\0';//avoiding explicit bound check in while
int newLen=-1;
while(oldChars[++newLen]>=' ');//find first non-printable,
                       // if there are none it ends on the null char I appended
for (int  j = newLen; j < length; j++) {
    char ch = oldChars[j];
    if (ch >= ' ') {
        oldChars[newLen] = ch;//the while avoids repeated overwriting here when newLen==j
        newLen++;
    }
}
s = new String(oldChars, 0, newLen);