Command to display first few and last few lines of a file
@rush is right about using head + tail being more efficient for large files, but for small files (< 20 lines), some lines may be output twice.
{ head; tail;} < /path/to/file
would be equally efficient, but wouldn't have the problem above.
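To make the difference concrete, here is a minimal sketch that counts the lines printed each way for a 15-line file. The scratch file from `mktemp` and the variable names are illustrative only. The grouped form assumes a `head` that leaves the file offset just after the last line it printed (GNU `head` does this on regular files); with a `head` that does not reposition the offset, `tail` may see less of the file.

```shell
# Hypothetical scratch file with 15 lines: fewer than the 20 lines
# that head and tail would print together with their default of 10.
tmp=$(mktemp)
seq 15 > "$tmp"

# Separate invocations: each command opens the file from the start,
# so lines 6-10 are printed by both head and tail.
separate=$( { head -n 10 "$tmp"; tail -n 10 "$tmp"; } | wc -l )

# Shared file descriptor: tail starts reading where head stopped.
grouped=$( { head -n 10; tail -n 10; } < "$tmp" | wc -l )

echo "separate: $((separate)) lines, grouped: $((grouped)) lines"
rm -f "$tmp"
```

The separate form prints 20 lines for this file, five of them duplicates, while the grouped form never prints a line twice.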
You can use sed or awk to do it with one command. However, you'll lose speed, because sed and awk need to read through the whole file anyway.
From a speed point of view, it's much better to write a function, or to run the combination of tail + head each time. This does have the downside of not working if the input is a pipe; however, you can use process substitution if your shell supports it (see the example below).
first_last () {
    head -n 10 -- "$1"
    tail -n 10 -- "$1"
}
and just launch it as
first_last "/path/to/file_to_process"
To use process substitution instead (bash, zsh, ksh-like shells only):
first_last <( command )
P.S. You can even add a grep to check whether your "global conditions" exist.
The { head; tail; } solution wouldn't work on pipes (or sockets or any other non-seekable files) because head could consume too much data: it reads by blocks and can't seek back on a pipe, potentially leaving the cursor beyond the point where tail is meant to start selecting.
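A quick way to observe this is to feed the grouped commands from a pipe and count what comes out. This is only a sketch: how much `head` swallows depends on the implementation, its buffer size, and timing, so the exact line count can vary from system to system. The variable names are illustrative.

```shell
# On a pipe, head cannot seek back after reading a block, so tail
# only sees whatever head happened to leave unread (often nothing).
out=$(seq 100 | { head -n 5; tail -n 5; })
printf '%s\n' "$out"

# A correct head+tail would print 10 lines (1-5 and 96-100).
lines=$(( $(printf '%s\n' "$out" | wc -l) ))
echo "lines printed: $lines (10 expected if nothing were lost)"
```

The first five lines are always there; it is the tail part that goes missing or comes out short.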
So, you could use a tool that reads one character at a time, like the shell's read (here using a function that takes the number of head lines and tail lines as arguments):
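head_tail() {
    n=0
    # Copy the first $1 lines one at a time so nothing extra is consumed.
    while [ "$n" -lt "$1" ]; do
        # On EOF, print any unterminated last line and stop.
        IFS= read -r line || { printf %s "$line"; break; }
        printf '%s\n' "$line"
        n=$(($n + 1))
    done
    # tail sees only what is left; $2 defaults to $1.
    tail -n "${2-$1}"
}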
seq 100 | head_tail 5 10
seq 20 | head_tail 5
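The reason this works on a pipe is that `read` takes exactly one line from a non-seekable input and leaves the rest untouched for the next command. A minimal sketch of that behaviour, with `cat` standing in for `tail` and illustrative variable names:

```shell
# read consumes exactly one line from the pipe; cat then receives
# everything that follows it, with nothing lost in between.
rest=$(seq 10 | { IFS= read -r first; cat; })
printf '%s\n' "$rest"
```

Here `cat` receives lines 2 through 10, which is why the `tail` at the end of `head_tail` can still see all of the remaining input.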
Or implement tail in awk, for instance as:
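head_tail() {
    awk -v h="$1" -v t="${2-$1}" '
        # Keep the last t lines in a circular buffer.
        {l[NR%t]=$0}
        # Print the first h lines as they arrive.
        NR<=h
        END{
            # Start of the tail, skipping lines already printed as head.
            n=NR-t+1
            if(n <= h) n = h+1
            for (;n<=NR;n++) print l[n%t]
        }'
}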
With sed:
head_tail() {
    sed -e "1,${1}b" -e :1 -e "$(($1+${2-$1})),\$!{N;b1" -e '}' -e 'N;D'
}
(though beware that some sed implementations have a low limit on the size of their pattern space, so this would fail for big values of the number of tail lines).