Is there a standard alternative to sponge to pipe a file into itself?

A shell function replacing sponge:

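mysponge () (
    append=false

    while getopts 'a' opt; do
        case $opt in
            a) append=true ;;
            *) echo 'usage: mysponge [-a] [file]' >&2; exit 1 ;;
        esac
    done
    shift "$(( OPTIND - 1 ))"

    outfile=$1

    # Create the temporary file next to the output file, so that the
    # later mv is a rename within the same directory.
    tmpfile=$(mktemp "$(dirname "$outfile")/tmp-sponge.XXXXXXXX") &&
    # Soak up all of standard input before touching the output file.
    cat >"$tmpfile" &&
    if "$append"; then
        cat "$tmpfile" >>"$outfile"
    else
        if [ -f "$outfile" ]; then
            # Existing regular file: copy its mode (GNU chmod), then replace it.
            chmod --reference="$outfile" "$tmpfile"
            mv "$tmpfile" "$outfile"
        elif [ -n "$outfile" ] && [ ! -e "$outfile" ]; then
            # The named file does not exist yet: create it.
            cat "$tmpfile" >"$outfile"
        else
            # No filename, or not a regular file: write to standard output.
            cat "$tmpfile"
        fi
    fi &&
    # Only reached if every step above succeeded.
    rm -f "$tmpfile"
)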

This mysponge shell function first reads all data available on standard input into a temporary file.

Once all data has been collected in the temporary file, it is transferred to the file named by the function's argument. With -a, the data is appended to that file with cat. Without -a, what happens depends on the output filename. If it refers to an existing regular file, the temporary file is moved into place with mv, after an attempt to transfer the file's modes to the temporary file using GNU chmod. If it names a file that does not exist, the file is created and the data is written to it with cat. In any other case (for example, the name refers to a named pipe), the data is written to standard output with cat.
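Where GNU chmod is not available, one way to keep the existing file's permissions is to overwrite the file in place rather than renaming over it, since redirecting into an existing file keeps its mode and ownership. The following is a minimal sketch of that idea, not part of the function above; the scratch directory and the file name data are chosen only for illustration, and unlike mv the replacement is not atomic.

```shell
# Work in a scratch directory so nothing of value is touched.
dir=$(mktemp -d) && cd "$dir" || exit 1

printf 'b\na\n' >data
chmod 640 data

tmp=$(mktemp ./tmp-sponge.XXXXXXXX) &&
sort data >"$tmp" &&        # collect the complete result first
cat "$tmp" >data &&         # overwrite in place: same file, same mode
rm -f "$tmp"

ls -l data                  # the mode is still -rw-r-----
```

The trade-off is that a reader opening data while the cat is running may see a partially written file, which the mv approach avoids.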

If no file was given on the command line, the collected data is sent to standard output.

At the end, the temporary file is removed.

Each step in the function relies on the successful completion of the previous step. No attempt is made to remove the temporary file if one command fails (it may contain important data).

If the named file does not exist, then it will be created with the user's default permissions etc., and the data arriving from standard input will be written to it.

The mktemp utility is not standard, but it is commonly available.
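On a system without mktemp, one fallback is to create the file with the shell's noclobber option (set -C) turned on, which makes the redirection fail if the name already exists. The helper name make_tmp and the naming pattern below are made up for illustration; this is a rough sketch under those assumptions, not a full replacement for mktemp.

```shell
# Hypothetical helper: create an empty temporary file in directory $1
# (default: current directory) without mktemp, and print its name.
make_tmp () (
    dir=${1:-.}
    umask 077                   # new file is accessible by the owner only
    set -C                      # noclobber: '>' fails if the file exists
    n=0
    while [ "$n" -lt 100 ]; do
        name=$dir/tmp-sponge.$$.$n
        # The subshell keeps a failed redirection from ending the loop.
        if ( : >"$name" ) 2>/dev/null; then
            printf '%s\n' "$name"
            exit 0
        fi
        n=$(( n + 1 ))
    done
    exit 1                      # gave up after 100 attempts
)
```

In mysponge, the mktemp line could then read tmpfile=$(make_tmp "$(dirname "$outfile")").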

The above function mimics the behaviour described in the manual for sponge from the moreutils package on Debian.

Using tee in place of sponge would not be a viable option. You say that you've tried it and it seemed to work for you. It may work and it may not. It relies on the timing of when the commands in the pipeline are started (they are started pretty much concurrently), and the size of the input data file.

The following is an example showing a situation where using tee would not work.

The original file is 200000 bytes, but after the pipeline, it's truncated to 32 KiB (which could well correspond to some buffer size on my system).

$ yes | head -n 100000 >hello
$ ls -l hello
-rw-r--r--  1 kk  wheel  200000 Jan 10 09:45 hello

$ cat hello | tee hello >/dev/null
$ ls -l hello
-rw-r--r--  1 kk  wheel  32768 Jan 10 09:46 hello
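
For contrast, here is a minimal sketch of the same round trip done safely, with the whole file collected in a temporary file before the original is replaced. The scratch directory is only there to keep the experiment out of the way, and mktemp is assumed to be available.

```shell
# Work in a scratch directory so nothing of value is touched.
dir=$(mktemp -d) && cd "$dir" || exit 1

yes | head -n 100000 >hello     # 200000 bytes, as in the example above

# Collect everything first, then replace the original.
tmp=$(mktemp ./tmp-hello.XXXXXXXX) &&
cat hello >"$tmp" &&
mv "$tmp" hello

wc -c <hello                    # still 200000 bytes
```

Because hello is only replaced after the reader has finished, the result does not depend on timing or buffer sizes.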
