How to select multiple lines from a file or from pipe in a script?

You can use this awk:

awk -v s='2,4' 'BEGIN{split(s, a, ","); for (i in a) b[a[i]]} NR in b' file
two
four
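
The same awk program reads from a pipe when no file name is given. Here is a minimal sketch of that; the function name `pick` and the five-line sample input are assumptions made for illustration only:

```shell
#!/bin/bash
# pick: hypothetical helper name. Prints the stdin lines whose numbers
# appear in the comma-separated list given as $1.
pick() {
    awk -v s="$1" 'BEGIN { split(s, a, ","); for (i in a) b[a[i]] } NR in b'
}

# Sample input (an assumption): five lines, "one" to "five".
printf '%s\n' one two three four five | pick 2,4
```

Lines always come out in file order, so `pick 4,2` prints the same two lines as `pick 2,4`.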

Via a separate script lines.sh:

#!/bin/bash
awk -v s="$1" 'BEGIN{split(s, a, ","); for (i in a) b[a[i]]} NR in b' "$2"

Then give execute permissions:

chmod +x lines.sh

And call it as:

./lines.sh '2,4' 'test.txt'
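
On large files it may help to stop reading once the highest requested line has been printed. The following is only a sketch of that variation: the function name `pick_early` and the awk variable `max` are illustrative and not part of the script above. The second argument is optional, and `-` (standard input) is used when it is missing.

```shell
#!/bin/bash
# pick_early: hypothetical variation of lines.sh that exits early.
# $1 = comma-separated line numbers, $2 = file name (optional, default: stdin).
pick_early() {
    awk -v s="$1" '
        BEGIN {
            n = split(s, a, ",")
            for (i = 1; i <= n; i++) {
                b[a[i]]                              # remember the line number
                if (a[i] + 0 > max) max = a[i] + 0   # track the highest one
            }
        }
        NR in b             # print requested lines
        NR >= max { exit }  # nothing left to print after this line
    ' "${2:--}"
}

# Sample input is an assumption for illustration.
printf '%s\n' one two three four five | pick_early 2,4
```

Because awk exits after line `max`, this also terminates on endless input such as the output of `yes`.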

Try sed:

sed -n '2p; 4p' inputFile

-n tells sed to suppress its default output of every line; the p (print) command then prints only lines 2 and 4.

You can also use ranges, e.g.:

sed -n '2,4p' inputFile
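
Note that sed keeps reading to the end of the input even after the last wanted line. A common idiom, sketched here on an assumed five-line sample input, is to add the q (quit) command at the last line of interest:

```shell
# Print lines 2 and 4, then quit at line 4 so the rest is never read.
printf '%s\n' one two three four five | sed -n '2p; 4{p;q;}'

# The same idea for a range: print lines 2 to 4, quit at line 4.
printf '%s\n' one two three four five | sed -n '2,4p; 4q'
```

This matters mostly for very large files or for streams that never end.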

Two pure Bash versions. Since you're looking for general and reusable solutions, you might as well put a little bit of effort into that. (See also the last section.)

Version 1

This script slurps the entire stdin into an array (using mapfile, so it's rather efficient) and then prints the lines given as its arguments. Ranges are valid, e.g.,

1-4 # for lines 1, 2, 3 and 4
3-  # for everything from line 3 till the end of the file

You may separate these by spaces or commas. The lines are printed exactly in the order the arguments are given:

lines 1 1,2,4,1-3,4- 1

will print line 1 twice, then line 2, then line 4, then lines 1, 2 and 3, then everything from line 4 till the end, and finally, line 1 again.

Here you go:

#!/bin/bash

lines=()

# Slurp stdin in array
mapfile -O1 -t lines

# Arguments:
IFS=', ' read -ra args <<< "$*"

for arg in "${args[@]}"; do
   if [[ $arg = +([[:digit:]]) ]]; then
      arg=$arg-$arg
   fi
   if [[ $arg =~ ^([[:digit:]]+)-([[:digit:]]*)$ ]]; then
      ((from=10#${BASH_REMATCH[1]}))
      ((to=10#${BASH_REMATCH[2]:-$((${#lines[@]}))}))
      ((from==0)) && from=1
      ((to>=${#lines[@]})) && to=${#lines[@]}
      ((from<=to)) || printf >&2 'Argument %d-%d: lines not in increasing order\n' "$from" "$to"
      for((i=from;i<=to;++i)); do
         printf '%s\n' "${lines[i]}"
      done
   else
      printf >&2 "Error in argument \`%s'.\n" "$arg"
   fi
done

Pro: It's really cool.
Con: Needs to read entire stream into memory. Not suitable for infinite streams.
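
The 1-based line numbering in Version 1 comes from mapfile's -O1 option, which starts filling the array at index 1 instead of 0, while -t strips the trailing newline from each line. A tiny sketch of that behaviour in isolation (the three-line sample input is an assumption; mapfile needs Bash 4 or later):

```shell
#!/bin/bash
# Read three sample lines into the array, starting at index 1.
mapfile -O1 -t lines < <(printf '%s\n' one two three)

echo "${#lines[@]}"   # number of lines read: 3
echo "${lines[1]}"    # first line: one
echo "${!lines[@]}"   # indices in use: 1 2 3 (index 0 stays unset)
```

Since index 0 is never set, the element count equals the number of the last line, which is why the script can use ${#lines[@]} as the upper bound.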

Version 2

This version addresses the previous problem of infinite streams. But you'll lose the ability to repeat and reorder lines.

Same thing, ranges are allowed:

lines 1 1,4-6 9-

will print lines 1, 4, 5, 6, 9 and everything till the end. If the set of lines is bounded, the script stops reading once it has passed the last requested line.

#!/bin/bash

lines=()
tillend=0
maxline=0

# Process arguments
IFS=', ' read -ra args <<< "$*"

for arg in "${args[@]}"; do
   if [[ $arg = +([[:digit:]]) ]]; then
       arg=$arg-$arg
   fi
   if [[ $arg =~ ^([[:digit:]]+)-([[:digit:]]*)$ ]]; then
      ((from=10#${BASH_REMATCH[1]}))
      ((from==0)) && from=1
      ((tillend && from>=tillend)) && continue
      if [[ -z ${BASH_REMATCH[2]} ]]; then
         tillend=$from
         continue
      fi
      ((to=10#${BASH_REMATCH[2]}))
      if ((from>to)); then
         printf >&2 "Invalid lines order: %s\n" "$arg"
         exit 1
      fi
      ((maxline<to)) && maxline=$to
      for ((i=from;i<=to;++i)); do
         lines[i]=1
      done
   else
      printf >&2 "Invalid argument \`%s'\n" "$arg"
      exit 1
   fi
done

# If nothing to read, exit
((tillend==0 && ${#lines[@]}==0)) && exit

# Now read stdin
linenb=0
while IFS= read -r line; do
   ((++linenb))
   ((tillend==0 && maxline && linenb>maxline)) && exit
   if [[ ${lines[linenb]} ]] || ((tillend && linenb>=tillend)); then
      printf '%s\n' "$line"
   fi
done

Pro: It's really cool and doesn't read the full stream into memory.
Con: Can't repeat or reorder lines as Version 1 can. Speed is not its strongest point.
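
Both versions parse each argument the same way: a regex match fills BASH_REMATCH, and the 10# prefix forces base 10, because Bash arithmetic would otherwise treat a number with a leading zero as octal and reject something like 08. Here is that step as a stand-alone sketch; the helper name `parse_range` is hypothetical, and the regex is anchored with ^ and $ so that stray characters are rejected:

```shell
#!/bin/bash
# parse_range: hypothetical helper. Prints "from to" for arguments such as
# 3, 2-5 or 4-. For an open-ended range the "to" part is printed empty.
parse_range() {
    local arg=$1 from to=
    # A single number N is shorthand for the range N-N.
    [[ $arg =~ ^[[:digit:]]+$ ]] && arg=$arg-$arg
    if [[ $arg =~ ^([[:digit:]]+)-([[:digit:]]*)$ ]]; then
        from=$((10#${BASH_REMATCH[1]}))     # 10# avoids octal parsing
        if [[ -n ${BASH_REMATCH[2]} ]]; then
            to=$((10#${BASH_REMATCH[2]}))
        fi
        printf '%s %s\n' "$from" "$to"
    else
        return 1                            # not a number or a range
    fi
}

parse_range 3        # prints: 3 3
parse_range 08-010   # prints: 8 10
```

Returning a non-zero status for bad input leaves it to the caller to decide whether to skip the argument (as Version 1 does) or abort (as Version 2 does).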

Further thoughts

If you really want an awesome general script that does what Version 1 and Version 2 do, and more, you should definitely consider using another language, e.g., Perl: you'll gain a lot (speed in particular), and you'll be able to add nice options that do much cooler stuff. It might be worth it in the long run, since you want a general and reusable script. You might even end up having a script that reads emails!

Disclaimer. I haven't thoroughly checked these scripts... so beware of bugs!
