Bash: Filter rows by line number
Simply with awk:
awk 'NR==FNR{ pos[$1]; next }FNR in pos' positions.txt data.txt
NR==FNR{ ... }
- while processing the 1st input file (i.e. positions.txt):
pos[$1]
- accumulate the positions (record numbers) as keys of the pos array
next
- jump to the next record
FNR in pos
- while processing the 2nd input file, data.txt (FNR indicates how many records have been read from the current input file): print the record only if the current record number FNR is among the keys of the pos array
Sample output:
667 ffg wew 23
533 jhf qwe 54
...
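To see the one-liner end to end, here is a minimal sketch with made-up file contents (the real positions.txt and data.txt will differ):

```shell
printf '3\n5\n' > positions.txt                    # keep lines 3 and 5
printf 'one\ntwo\nthree\nfour\nfive\n' > data.txt  # five data lines

# First pass stores the wanted line numbers; second pass prints matches.
awk 'NR==FNR{ pos[$1]; next } FNR in pos' positions.txt data.txt
# prints:
# three
# five
```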
Alternatively, with sed: first create a sed script from the positions.txt file:
sed 's/$/p/' positions.txt
This will output
3p
5p
8p
This simple script will just print the indicated lines.
Then apply this to the data.txt file. If you're using bash (or any shell that understands process substitutions with <( ... )):
sed -n -f <( sed 's/$/p/' positions.txt ) data.txt
The -n stops sed from outputting anything other than what's explicitly printed by the given sed script.
With the examples given, this will yield
667 ffg wew 23
533 jhf qwe 54
If not using bash, then
sed 's/$/p/' positions.txt >filter.sed
sed -n -f filter.sed data.txt
rm -f filter.sed
... will do the same thing.
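Putting the portable variant together, a minimal sketch (file contents are made up):

```shell
printf '2\n4\n' > positions.txt          # keep lines 2 and 4
printf 'aa\nbb\ncc\ndd\nee\n' > data.txt

sed 's/$/p/' positions.txt > filter.sed  # filter.sed now contains "2p" and "4p"
sed -n -f filter.sed data.txt            # prints only the selected lines
rm -f filter.sed
# prints:
# bb
# dd
```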
If positions.txt is sorted, it's also possible to do this in a single pass through both files, and without storing positions.txt in full. Simply read the next line off positions.txt when the previous matching line is met:
$ awk -vpos=positions.txt 'function get() { getline num < pos }
BEGIN { get() } NR==num { print; get() }' data.txt
667 ffg wew 23
533 jhf qwe 54
In practice, this is only useful if both files are really huge or you're really, really low on memory.
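A minimal sketch of the single-pass variant, again with made-up inputs (note that positions.txt must be sorted for this to work):

```shell
printf '2\n4\n' > positions.txt      # sorted positions
printf 'a\nb\nc\nd\ne\n' > data.txt

# get() pulls the next wanted line number from positions.txt;
# each match prints the record and advances to the next position.
awk -v pos=positions.txt 'function get() { getline num < pos }
BEGIN { get() } NR==num { print; get() }' data.txt
# prints:
# b
# d
```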