Is there a standard alternative to sponge to pipe a file into itself?
A shell function replacing sponge:
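mysponge () (
    append=false

    # Parse options: -a means append instead of replacing the file.
    while getopts 'a' opt; do
        case $opt in
            a) append=true ;;
            *) echo 'usage: mysponge [-a] [file]' >&2; exit 1 ;;
        esac
    done
    shift "$(( OPTIND - 1 ))"

    outfile=$1

    # Create the temporary file next to the output file, then soak up
    # all of standard input before touching the output file.
    tmpfile=$(mktemp "$(dirname "$outfile")/tmp-sponge.XXXXXXXX") &&
    cat >"$tmpfile" &&
    if "$append"; then
        cat "$tmpfile" >>"$outfile"
    else
        # Try to carry over the file modes (GNU chmod only).
        if [ -f "$outfile" ]; then
            chmod --reference="$outfile" "$tmpfile"
        fi
        if [ -f "$outfile" ]; then
            # Existing regular file: move the temporary file into place.
            mv "$tmpfile" "$outfile"
        elif [ -n "$outfile" ] && [ ! -e "$outfile" ]; then
            # Named file does not exist yet: create it.
            cat "$tmpfile" >"$outfile"
        else
            # No filename given, or the name refers to something that
            # is not a regular file: write to standard output.
            cat "$tmpfile"
        fi
    fi &&
    rm -f "$tmpfile"
)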
This mysponge shell function passes all data available on standard input on to a temporary file.
When all data has been redirected to the temporary file, the collected data is copied to the file named by the function's argument. With -a, the data is appended to that file. Without -a, what happens depends on the output filename: if it refers to an existing regular file, the temporary file is moved into place with mv (after an attempt is made to transfer the file modes to the temporary file using GNU chmod); if it names a file that does not exist, the data is written to it with cat. If the output is to something that is not a regular file (a named pipe, standard output etc.), the data is output with cat.
If no file was given on the command line, the collected data is sent to standard output.
At the end, the temporary file is removed.
Each step in the function relies on the successful completion of the previous step. No attempt is made to remove the temporary file if one command fails (it may contain important data).
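The effect of chaining the steps with && can be sketched in isolation. The following is only an illustration, not part of the function above: chain_demo and the tmp-demo prefix are hypothetical names, and the argument is a command name ("true" or "false") standing in for the copy/move step.

```shell
chain_demo () (
    # $1 is a command standing in for the copy/move step:
    # "true" simulates success, "false" simulates failure.
    step=$1

    tmpfile=$(mktemp "${TMPDIR:-/tmp}/tmp-demo.XXXXXXXX") &&
    printf 'important data\n' >"$tmpfile" &&
    "$step" &&
    rm -f "$tmpfile"
    status=$?

    # Report whether the temporary file survived the chain.
    if [ -e "$tmpfile" ]; then
        echo "kept $tmpfile"
    else
        echo "removed"
    fi
    exit "$status"
)
```

Calling chain_demo true runs the whole chain, including the rm. Calling chain_demo false stops the chain before the rm, so the temporary file and its contents are left behind for inspection.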
If the named file does not exist, then it will be created with the user's default permissions etc., and the data arriving from standard input will be written to it.
The mktemp utility is not standard, but it is commonly available.
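On a system without mktemp, a rough fallback could be built from standard shell features only. The sketch below is an assumption of mine, not something the function above uses: make_tmp, its mt_ variables and the naming scheme are hypothetical. It relies on the shell's noclobber option (set -C), which makes a > redirection fail if the file already exists.

```shell
# Hypothetical fallback: create a new, empty file in directory $1
# and print its name. Returns non-zero if no file could be created.
make_tmp () {
    mt_dir=$1
    mt_n=0
    while [ "$mt_n" -lt 100 ]; do
        mt_candidate=$mt_dir/tmp-sponge.$$.$mt_n
        # In the subshell: refuse to overwrite existing files (set -C)
        # and create the file readable by the owner only (umask 077).
        if ( set -C; umask 077; : >"$mt_candidate" ) 2>/dev/null; then
            printf '%s\n' "$mt_candidate"
            return 0
        fi
        mt_n=$(( mt_n + 1 ))
    done
    return 1
}
```

It would be used in the same way as the mktemp call, for example tmpfile=$(make_tmp "$(dirname "$outfile")"). The generated names are far more predictable than those of mktemp, so this is a stopgap rather than a drop-in replacement.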
The above function mimics the behaviour described in the manual for sponge from the moreutils package on Debian.
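Using tee in place of sponge would not be a viable option. You say that you've tried it and it seemed to work for you. It may work and it may not. tee truncates the file when it opens it for writing, so any data that has not yet been read from the file at that point is lost. Whether it works therefore depends on the timing of when the commands in the pipeline are started (they are started pretty much concurrently) and on the size of the input data file.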
The following is an example showing a situation where using tee would not work.
The original file is 200000 bytes, but after the pipeline, it's truncated to 32 KiB (which could well correspond to some buffer size on my system).
$ yes | head -n 100000 >hello
$ ls -l hello
-rw-r--r-- 1 kk wheel 200000 Jan 10 09:45 hello
$ cat hello | tee hello >/dev/null
$ ls -l hello
-rw-r--r-- 1 kk wheel 32768 Jan 10 09:46 hello
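For contrast, here is a minimal sketch of the same experiment done with the temporary-file approach, written out inline rather than through the function. The scratch directory and the tmp-demo prefix are hypothetical, and awk is used only to generate the same 200000 bytes that yes | head -n 100000 would produce.

```shell
# Work in a scratch directory so that nothing else is touched.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# 100000 lines of "y": 200000 bytes.
awk 'BEGIN { for (i = 0; i < 100000; i++) print "y" }' >hello
before=$(wc -c <hello)

# Read the whole file into a temporary file in the same directory,
# and only then replace the original.
tmpfile=$(mktemp ./tmp-demo.XXXXXXXX) &&
cat hello >"$tmpfile" &&
mv "$tmpfile" hello

after=$(wc -c <hello)
echo "before: $before bytes, after: $after bytes"
```

Because the original file is not opened for writing until all of it has been read, the size does not depend on timing or buffer sizes.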