Remove all special characters and case from string in bash
cat yourfile.txt | tr -dc '[:alnum:]\n\r' | tr '[:upper:]' '[:lower:]'
The first tr deletes special characters. d means delete, c means complement (invert the character set). So, -dc means delete all characters except those specified. The \n and \r are included to preserve Linux or Windows style newlines, which I assume you want.
The second one translates uppercase characters to lowercase.
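To see the two stages working together, here is a minimal sketch that wraps the pipeline in a function and feeds it sample input instead of a file. The name strip_and_lower and the sample text are made up for illustration.

```shell
# Hypothetical helper: reads stdin, keeps only letters, digits and newlines,
# then lowercases whatever is left.
strip_and_lower() {
  tr -dc '[:alnum:]\n\r' | tr '[:upper:]' '[:lower:]'
}

printf 'Hello, World! 123\nFoo_Bar-Baz\n' | strip_and_lower
# prints:
# helloworld123
# foobarbaz
```

Because \n is in the kept set, the line structure of the input survives; without it everything would be joined into one long line.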
Pure BASH 4+ solution:
$ filename='Some_randoM data1-A'
$ f=${filename//[^[:alnum:]]/}
$ echo "$f"
SomerandoMdata1A
$ echo "${f,,}"
somerandomdata1a
A function for this:
clean() {
local a=${1//[^[:alnum:]]/}
echo "${a,,}"
}
Try it:
$ clean "More Data0"
moredata0
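If you want the same pure-bash approach for every line of a file or stream, one possible sketch is a small read loop around clean. The clean_lines name is hypothetical, and this still needs bash 4+ because of the ,, expansion.

```shell
# Same function as above, repeated so the example is self-contained.
clean() {
  local a=${1//[^[:alnum:]]/}
  echo "${a,,}"
}

# Hypothetical wrapper: apply clean to each line read from stdin.
# The "|| [ -n "$line" ]" part also handles a last line without a trailing newline.
clean_lines() {
  local line
  while IFS= read -r line || [ -n "$line" ]; do
    clean "$line"
  done
}

printf 'Some_randoM data1-A\nMore Data0\n' | clean_lines
# prints:
# somerandomdata1a
# moredata0
```

For large files the tr pipeline will likely be faster, since this loop runs the expansion once per line inside the shell.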
If you are using mkelement0's or Dan Bliss's approach, you can also look into sed with a POSIX regular expression:
cat yourfile.txt | sed 's/[^a-zA-Z0-9]//g'
sed matches every character that is not a letter or a digit (the ^ at the start of the bracket expression negates it) and removes it; the g flag applies this to every match on each line.
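Note that the sed command only strips characters and does not change case. A minimal sketch that also lowercases, assuming you are fine with piping through tr afterwards (sed_clean is a made-up name):

```shell
# Hypothetical helper: strip everything except ASCII letters and digits with sed,
# then lowercase with tr. sed works line by line, so newlines are kept.
sed_clean() {
  sed 's/[^a-zA-Z0-9]//g' | tr '[:upper:]' '[:lower:]'
}

printf 'Hello, World! 123\n' | sed_clean
# prints: helloworld123
```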
I've used tr to remove any characters that are not part of the [:print:] class:
cat file.txt | tr -dc '[:print:]'
or
echo "..." | tr -dc '[:print:]'
Additionally, you might want to pipe (|) the output to od -c to confirm the result:
cat file.txt | tr -dc '[:print:]' | od -c
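One thing to watch for: newline is not in the [:print:] class, so the commands above also delete line breaks. If you want to keep them, a possible variant is to add \n to the set. The keep_printable name and the sample input are only for illustration.

```shell
# Hypothetical helper: drop non-printable characters but keep newlines.
keep_printable() {
  tr -dc '[:print:]\n'
}

# Sample input with a tab and a control character (octal 001) mixed in.
printf 'a\tb\001c\n' | keep_printable
# prints: abc

# Same input through od -c to inspect exactly which bytes are left.
printf 'a\tb\001c\n' | keep_printable | od -c
```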