Is there a way to ignore header lines in a UNIX sort?

Here is a version that works on piped data:

    (read -r; printf "%s\n" "$REPLY"; sort)

If your header has multiple lines:

    (for i in $(seq "$HEADER_ROWS"); do read -r; printf "%s\n" "$REPLY"; done; sort)

This solution is from here
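If you need this often, the same idea can be wrapped in a function. What follows is only a sketch: the name `sort_keep_header` and its argument order are made up for illustration and are not part of the original answer. It copies the first N lines of stdin through untouched, then hands the rest of the stream to sort along with any extra options.

```shell
# Hypothetical helper: sort_keep_header N [sort options...]
# Passes the first N lines of stdin through unchanged, then sorts the rest.
sort_keep_header() {
    header_rows=$1
    shift
    i=0
    # read consumes exactly one line at a time, so the remaining
    # input is still available to sort afterwards.
    while [ "$i" -lt "$header_rows" ] && IFS= read -r line; do
        printf '%s\n' "$line"
        i=$((i + 1))
    done
    sort "$@"
}

# Example: one header line, remaining rows sorted alphabetically.
printf 'name,age\nzoe,31\nadam,25\n' | sort_keep_header 1
```

Any arguments after the header count go straight to sort, so `sort_keep_header 2 -n` would keep two header lines and sort the rest numerically.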

In simple cases, sed can do the job elegantly:

    your_script | (sed -u 1q; sort)

or equivalently,

    cat your_data | (sed -u 1q; sort)

The key is in the 1q -- print first line (header) and quit (leaving the rest of the input to sort).

For the example given, 2q will do the trick.

The -u switch (unbuffered) is required for those seds (notably, GNU's) that would otherwise read the input in chunks, thereby consuming data that you want to go through sort instead.

    (head -n 2 <file> && tail -n +3 <file> | sort) > newfile

The parentheses create a subshell, wrapping up the stdout so you can pipe it or redirect it as if it had come from a single command.
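As a small worked example of the head/tail approach, here is a sketch that builds a throwaway file with a two-line header and sorts only the rows beneath it. The sample data and the variable names (`data_file`, `sorted_file`, `result`) are invented for illustration. Because the file is opened twice, this only works on a regular file, not on a pipe.

```shell
# Build a sample file with a two-line header (illustrative data only).
data_file=$(mktemp)
sorted_file=$(mktemp)
printf 'id\tfruit\n--\t-----\n3\tpear\n1\tapple\n2\tfig\n' > "$data_file"

# head emits the two header lines; tail starts at line 3 and feeds sort.
# The subshell groups both outputs into a single redirect.
(head -n 2 "$data_file" && tail -n +3 "$data_file" | sort) > "$sorted_file"

result=$(cat "$sorted_file")
printf '%s\n' "$result"

# Clean up the temporary files.
rm -f "$data_file" "$sorted_file"
```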

If you don't mind using awk, you can take advantage of awk's built-in pipe abilities, e.g.:

    extract_data | awk 'NR<3{print $0;next}{print $0| "sort -r"}'

This prints the first two lines verbatim and pipes the rest through sort.

Note that this has the specific advantage of being able to selectively sort part of a piped input. Methods that read the input twice, such as head followed by tail, only work on plain files that can be read more than once. This works on anything.
