Remove all special characters and case from string in bash

cat yourfile.txt | tr -dc '[:alnum:]\n\r' | tr '[:upper:]' '[:lower:]'

The first tr deletes special characters: -d means delete and -c means complement (invert the character set), so -dc deletes every character except those specified. The \n and \r are included to preserve Linux- or Windows-style newlines, which I assume you want.

The second one translates uppercase characters to lowercase.
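
If you need this in more than one place, the two tr calls can be wrapped in a small filter. This is only a sketch: the function name normalize is made up for illustration, and forcing LC_ALL=C is my own assumption so that the character classes mean plain ASCII regardless of the caller's locale (drop it if you want locale-aware classes).

```shell
# normalize: hypothetical helper name; reads stdin, writes stdout.
normalize() {
    # Step 1: delete everything except letters, digits, \n and \r.
    # Step 2: map uppercase letters to lowercase.
    # LC_ALL=C is an assumption: it restricts the classes to ASCII.
    LC_ALL=C tr -dc '[:alnum:]\n\r' | LC_ALL=C tr '[:upper:]' '[:lower:]'
}

printf 'Hello, World!\nFoo_Bar-42\n' | normalize
# prints:
# helloworld
# foobar42
```

Because it reads stdin, it also works as normalize < yourfile.txt, which avoids the extra cat.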


Pure BASH 4+ solution:

$ filename='Some_randoM data1-A'
$ f=${filename//[^[:alnum:]]/}
$ echo "$f"
SomerandoMdata1A
$ echo "${f,,}"
somerandomdata1a

A function for this:

clean() {
    local a=${1//[^[:alnum:]]/}
    echo "${a,,}"
}

Try it:

$ clean "More Data0"
moredata0
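
To run the same two expansions over every line of a file without calling an external tool, something like the following should work. It is a sketch that assumes Bash 4+ (for the ,, expansion); the name clean_lines is hypothetical.

```shell
# clean_lines: hypothetical helper; cleans each line of stdin (Bash 4+).
clean_lines() {
    local line
    # IFS= and -r keep whitespace and backslashes intact; the || test
    # also handles a last line that has no trailing newline.
    while IFS= read -r line || [[ -n $line ]]; do
        line=${line//[^[:alnum:]]/}   # drop everything that is not a letter or digit
        printf '%s\n' "${line,,}"     # lowercase the result
    done
}

printf 'Some_randoM data1-A\nMore Data0\n' | clean_lines
# prints:
# somerandomdata1a
# moredata0
```

For large files the tr pipeline will likely be faster, since a Bash read loop handles one line at a time.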

As an alternative to the approaches from mkelement0 and Dan Bliss, you can also use sed with a POSIX regular expression:

cat yourfile.txt | sed 's/[^a-zA-Z0-9]//g'

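The ^ at the start of the brackets negates the set, so sed matches every character that is not a letter or digit and removes it. Newlines are kept because sed works line by line. Note that this only strips characters; it does not change case.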
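
A variant of the sed command using the POSIX class [:alnum:] instead of explicit ranges, with a tr step added for the lowercasing the question asks about. Treat it as a sketch: the function name strip_sed is made up, and in a non-C locale [:alnum:] may also match non-ASCII letters.

```shell
# strip_sed: hypothetical helper name; reads stdin, writes stdout.
strip_sed() {
    # [^[:alnum:]] matches any character that is not a letter or digit;
    # the g flag removes every match on the line, not just the first.
    sed 's/[^[:alnum:]]//g' | tr '[:upper:]' '[:lower:]'
}

printf 'Some_randoM data1-A\n' | strip_sed
# prints: somerandomdata1a
```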


I've used tr to remove any characters that are not part of the [:print:] class (note that this also removes newlines and tabs, which are control characters):

cat file.txt | tr -dc '[:print:]'

or

echo "..." | tr -dc '[:print:]'

Additionally, you might want to pipe (|) the output to od -c to confirm the result:

cat file.txt | tr -dc '[:print:]' | od -c
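
Here is a small self-contained check of the [:print:] filter, using input that contains a tab, a control byte (octal 001) and a newline. It is a sketch: strip_nonprint is a hypothetical name, and LC_ALL=C is my assumption so that [:print:] means printable ASCII only (space included).

```shell
# strip_nonprint: hypothetical helper name; reads stdin, writes stdout.
strip_nonprint() {
    # Delete every byte outside printable ASCII. Tab, newline and other
    # control characters are not in [:print:], so they are removed too.
    LC_ALL=C tr -dc '[:print:]'
}

# od -c prints one cell per byte, so any leftover control byte
# would show up as an escape such as \t or 001.
printf 'a\tb\001c\n' | strip_nonprint | od -c
```

Only a, b and c should remain in the od -c dump. If you want to keep line breaks, adding \n to the set (tr -dc '[:print:]\n') should do it.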