How to select multiple lines from a file or from a pipe in a script?
You can use this awk:
awk -v s='2,4' 'BEGIN{split(s, a, ","); for (i in a) b[a[i]]} NR in b' file
two
four
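Since the question also asks about pipes: the same filter works on stdin if you leave out the file name. A minimal sketch (the sample input and the `out` variable are just for illustration):

```shell
# Same awk filter, reading from a pipe instead of a named file.
out=$(printf '%s\n' one two three four five |
  awk -v s='2,4' 'BEGIN{split(s, a, ","); for (i in a) b[a[i]]} NR in b')
printf '%s\n' "$out"
```

Note that awk prints matching lines in input order, so `s='4,2'` selects the same lines in the same order.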
Via a separate script, lines.sh:
#!/bin/bash
awk -v s="$1" 'BEGIN{split(s, a, ","); for (i in a) b[a[i]]} NR in b' "$2"
Then give execute permissions:
chmod +x lines.sh
And call it as:
./lines.sh '2,4' 'test.txt'
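If you also want the script to accept piped input, one possible tweak (not part of the script above) is to default the file argument to `-`, which awk reads as standard input. A sketch, wrapped in a hypothetical `select_lines` function so it can be tried without creating a file:

```shell
#!/bin/bash
# select_lines is a hypothetical name; the body is the lines.sh one-liner,
# except that a missing second argument falls back to "-" (stdin).
select_lines() {
  awk -v s="$1" 'BEGIN{split(s, a, ","); for (i in a) b[a[i]]} NR in b' "${2:--}"
}

out=$(printf '%s\n' one two three four | select_lines '2,4')
printf '%s\n' "$out"
```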
Try sed:
sed -n '2p; 4p' inputFile
-n tells sed to suppress its default output, so only lines 2 and 4, selected by the p (print) command, are printed.
You can also use ranges, e.g.:
sed -n '2,4p' inputFile
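On large inputs or long-running pipes you may not want sed to keep reading after the last line you need. A small sketch of that idea, using the `q` (quit) command on the last wanted line (sample input and the `out` variable are illustrative):

```shell
# Print lines 2 and 4, then quit at line 4 so the rest is never processed.
out=$(printf '%s\n' one two three four five six | sed -n '2p; 4{p;q;}')
printf '%s\n' "$out"
```

The `;` before `}` keeps the command portable across sed implementations.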
Two pure Bash versions. Since you're looking for general and reusable solutions, you might as well put a little bit of effort into it. (Also, see the last section.)
Version 1
This script slurps the entire stdin into an array (using mapfile, so it's rather efficient) and then prints the lines specified in its arguments. Ranges are valid, e.g.,
1-4 # for lines 1, 2, 3 and 4
3- # for everything from line 3 till the end of the file
You may separate these by spaces or commas. The lines are printed exactly in the order the arguments are given:
lines 1 1,2,4,1-3,4- 1
will print line 1 twice, then line 2, then line 4, then lines 1, 2 and 3, then everything from line 4 till the end, and finally, line 1 again.
Here you go:
#!/bin/bash
lines=()
# Slurp stdin into an array, starting at index 1 so that lines[n] is line n
mapfile -O1 -t lines
# Arguments: split on commas and spaces
IFS=', ' read -ra args <<< "$*"
for arg in "${args[@]}"; do
    # A single number N is shorthand for the range N-N
    if [[ $arg = +([[:digit:]]) ]]; then
        arg=$arg-$arg
    fi
    if [[ $arg =~ ^([[:digit:]]+)-([[:digit:]]*)$ ]]; then
        # 10# forces base 10, so leading zeros are not read as octal
        ((from=10#${BASH_REMATCH[1]}))
        ((to=10#${BASH_REMATCH[2]:-$((${#lines[@]}))}))
        ((from==0)) && from=1
        # Check the order before clamping, so a line past the end is skipped quietly
        ((from<=to)) || printf >&2 'Argument %d-%d: lines not in increasing order\n' "$from" "$to"
        ((to>=${#lines[@]})) && to=${#lines[@]}
        for ((i=from;i<=to;++i)); do
            printf '%s\n' "${lines[i]}"
        done
    else
        printf >&2 "Error in argument \`%s'.\n" "$arg"
    fi
done
- Pro: It's really cool.
- Con: Needs to read entire stream into memory. Not suitable for infinite streams.
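To see why Version 1 can look lines up directly by number, here is a small sketch of what `mapfile -O1 -t` does on its own (the three-line sample input is made up for illustration):

```shell
#!/bin/bash
lines=()
# -O1 starts filling the array at index 1; -t strips the trailing newlines.
mapfile -O1 -t lines < <(printf '%s\n' one two three)

echo "${#lines[@]}"   # 3      (number of lines read)
echo "${lines[2]}"    # two    (index n holds line n)
echo "${!lines[@]}"   # 1 2 3  (index 0 is left unset)
```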
Version 2
This version addresses the previous problem of infinite streams. But you'll lose the ability to repeat and reorder lines.
Same thing, ranges are allowed:
lines 1 1,4-6 9-
will print lines 1, 4, 5, 6, 9 and everything till the end. If the set of lines is bounded, it exits as soon as the last requested line has been read.
#!/bin/bash
lines=()
tillend=0
maxline=0
# Process arguments
IFS=', ' read -ra args <<< "$*"
for arg in "${args[@]}"; do
    if [[ $arg = +([[:digit:]]) ]]; then
        arg=$arg-$arg
    fi
    if [[ $arg =~ ^([[:digit:]]+)-([[:digit:]]*)$ ]]; then
        ((from=10#${BASH_REMATCH[1]}))
        ((from==0)) && from=1
        # An earlier open-ended range already covers this one
        ((tillend && from>=tillend)) && continue
        if [[ -z ${BASH_REMATCH[2]} ]]; then
            tillend=$from
            continue
        fi
        ((to=10#${BASH_REMATCH[2]}))
        if ((from>to)); then
            printf >&2 "Invalid lines order: %s\n" "$arg"
            exit 1
        fi
        ((maxline<to)) && maxline=$to
        for ((i=from;i<=to;++i)); do
            lines[i]=1
        done
    else
        printf >&2 "Invalid argument \`%s'\n" "$arg"
        exit 1
    fi
done
# If nothing to read, exit
((tillend==0 && ${#lines[@]}==0)) && exit
# Now read stdin
linenb=0
while IFS= read -r line; do
    ((++linenb))
    if [[ ${lines[linenb]} ]] || ((tillend && linenb>=tillend)); then
        printf '%s\n' "$line"
    fi
    # With no open-ended range, stop right after the last requested line
    if ((tillend==0 && linenb>=maxline)); then exit; fi
done
- Pro: It's really cool and doesn't read the full stream in memory.
- Con: Can't repeat or reorder lines as Version 1 can. Speed is not its strongest point.
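Both versions parse their arguments the same way: a bare number N is treated as N-N, a regex splits the range, and `10#` forces base 10. Here is that step isolated as a rough sketch; `parse_range` is a hypothetical helper, and printing `end` for an open range is my own convention, not something the scripts do:

```shell
#!/bin/bash
# parse_range (hypothetical): prints "FROM TO" for one argument,
# with TO shown as "end" when the range is open (e.g. "4-").
parse_range() {
  local arg=$1 from to
  # A bare number N is shorthand for N-N
  [[ $arg =~ ^[0-9]+$ ]] && arg=$arg-$arg
  if [[ $arg =~ ^([0-9]+)-([0-9]*)$ ]]; then
    # 10# forces base 10; without it, 08 would be rejected as invalid octal
    from=$((10#${BASH_REMATCH[1]}))
    to=${BASH_REMATCH[2]:+$((10#${BASH_REMATCH[2]}))}
    ((from == 0)) && from=1
    printf '%s %s\n' "$from" "${to:-end}"
  else
    return 1
  fi
}

parse_range 3        # 3 3
parse_range 08-010   # 8 10
parse_range 4-       # 4 end
```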
Further thoughts
If you really want an awesome general script that does what Version 1 and Version 2 do, and more, you should definitely consider using another language, e.g., Perl: you'll gain a lot (in particular, speed)! You'll be able to have nice options that do lots of much cooler stuff. It might be worth it in the long run, as you want a general and reusable script. You might even end up having a script that reads emails!
Disclaimer. I haven't thoroughly checked these scripts... so beware of bugs!