shortest way to replace characters in a variable

Let's see. The shortest I can come up with is a tweak of your tr solution:

OUTPUT="$(tr -d "\"\`'" <<<$OUTPUT)"

Other alternatives include the already mentioned variable substitution, which can be made shorter than the versions shown so far:

OUTPUT="${OUTPUT//[\'\"\`]}"

And sed, of course, though this is longer in terms of characters:

OUTPUT="$(sed s/[\'\"\`]//g <<<$OUTPUT)"

I'm not sure if you mean shortest in length or in terms of time taken. In terms of length these two are as short as it gets (or as I can get it anyway) when it comes to removing those specific characters. So, which is fastest? I tested by setting the OUTPUT variable to what you had in your example but repeated several dozen times:

$ echo ${#OUTPUT} 
4900

$ time tr -d "\"\`'" <<<$OUTPUT
real    0m0.002s
user    0m0.004s
sys     0m0.000s
$ time sed s/[\'\"\`]//g <<<$OUTPUT
real    0m0.005s
user    0m0.000s
sys     0m0.000s
$ time echo ${OUTPUT//[\'\"\`]}
real    0m0.027s
user    0m0.028s
sys     0m0.000s

As you can see, tr is clearly the fastest, followed closely by sed. It also seems that using echo is slightly faster than using <<<:

$ for i in {1..10}; do
    ( time echo $OUTPUT | tr -d "\"\`'" > /dev/null ) 2>&1
  done | grep -oP 'real.*m\K[\d.]+' | awk '{k+=$1;} END{print k/NR}'
0.0025
$ for i in {1..10}; do
    ( time tr -d "\"\`'" <<<$OUTPUT > /dev/null ) 2>&1
  done | grep -oP 'real.*m\K[\d.]+' | awk '{k+=$1;} END{print k/NR}'
0.0029

Since the difference is tiny, I ran the above tests 10 times for each of the two and it turns out that the fastest is indeed the one you had to begin with:

echo $OUTPUT | tr -d "\"\`'"

However, this changes when you take into account the overhead of assigning to a variable. Here, using tr is slightly slower than the simple replacement:

$ for i in {1..10}; do
    ( time OUTPUT=${OUTPUT//[\'\"\`]} ) 2>&1
  done | grep -oP 'real.*m\K[\d.]+' | awk '{k+=$1;} END{print k/NR}'; 
0.0032

$ for i in {1..10}; do
    ( time OUTPUT=$(echo $OUTPUT | tr -d "\"\`'")) 2>&1
  done | grep -oP 'real.*m\K[\d.]+' | awk '{k+=$1;} END{print k/NR}'; 
0.0044

So, in conclusion: when you simply want to view the results, use tr. If you want to reassign to a variable, the shell's string manipulation features are faster, since they avoid the overhead of a command substitution, which runs a separate subshell as well as the external tr process.
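As a rough illustration of the two ways of reassigning discussed above, here is a minimal sketch. The sample value and the variable names via_tr and via_expansion are made up for the example; it only shows that both forms give the same result, not how fast they are.

```shell
#!/usr/bin/env bash
# Sample value invented for illustration; it contains ', " and `.
OUTPUT=$'He said "hi", it\'s `date`'

# Form 1: command substitution plus external tr (forks a subshell and runs tr).
via_tr=$(printf '%s\n' "$OUTPUT" | tr -d "\"\`'")

# Form 2: pure parameter expansion (no fork, no external program).
via_expansion=${OUTPUT//[\'\"\`]}

printf '%s\n' "$via_tr" "$via_expansion"
```

Both lines printed should read: He said hi, its date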

You could use variable substitution:

$ OUTPUT=a\'b\"c\`d
$ echo "$OUTPUT"
a'b"c`d

Use the syntax ${parameter//pattern/string} to replace all occurrences of the pattern with the string:

$ echo "${OUTPUT//\'/x}"
axb"c`d
$ echo "${OUTPUT//\"/x}"
a'bxc`d
$ echo "${OUTPUT//\`/x}"
a'b"cxd
$ echo "${OUTPUT//[\'\"\`]/x}"
axbxcxd
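The examples above all use the double slash. For contrast, here is a small sketch of how a single slash and an empty replacement behave in bash; the variable names are hypothetical and chosen only for this example.

```shell
#!/usr/bin/env bash
s='a"b"c"d'

first_only=${s/\"/x}   # single slash: replace only the first match
every=${s//\"/x}       # double slash: replace every match
removed=${s//\"}       # replacement omitted: delete every match

printf '%s\n' "$first_only" "$every" "$removed"
```

This should print axb"c"d, then axbxcxd, then abcd.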

In bash or zsh it is:

OUTPUT="${OUTPUT//[\`\"\']/}"

Note that ${VAR//PATTERN/} removes all instances of the pattern. For more information, see the section on parameter expansion in the Bash manual.

This solution should be fastest for short strings because it doesn't involve running any external programs. However, for very long strings the opposite is true: it is better to use a dedicated tool for text operations, for example:

$ OUTPUT="$(cat /usr/src/linux/.config)"

$ time (echo $OUTPUT | OUTPUT="${OUTPUT//set/abc}")
real    0m1.766s
user    0m1.681s
sys     0m0.002s

$ time (echo $OUTPUT | sed s/set/abc/g >/dev/null)
real    0m0.094s
user    0m0.078s
sys     0m0.006s
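If you don't have a kernel .config at hand, a comparison along the same lines can be sketched with synthetic data. The generated lines below are made up and merely stand in for a large file. Timings will vary with the machine and the bash version, so none are shown here; the final check only confirms that both approaches produce the same text.

```shell
#!/usr/bin/env bash
# Build a synthetic long string (invented data, roughly 2000 lines).
long=$(for i in {1..2000}; do printf 'CONFIG_OPTION_%d is not set\n' "$i"; done)

# Time the pure-shell replacement; the braces keep the assignment in this shell.
time { via_expansion=${long//set/abc}; }

# Time the same replacement done by sed through a pipe.
time { via_sed=$(printf '%s\n' "$long" | sed 's/set/abc/g'); }

# Both approaches should yield identical text.
[ "$via_expansion" = "$via_sed" ] && echo "results match"
```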
