Linux shell command to filter a text file by line length
Solution 1:
awk '{ if (length($0) < 16384) print }' yourfile >your_output_file.txt
would print lines shorter than 16384 characters (16 KB), as in your own example.
Or if you fancy Perl:
perl -nle 'if (length($_) < 16384) { print }' yourfile >your_output_file.txt
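For a quick sanity check of either command, you can generate a throwaway file with one short line and one very long line; sample.txt is just a placeholder name here, and the large printf field width is assumed to be handled by your shell's printf (bash and GNU printf accept it):
printf 'short line\n%016400d\n' 0 > sample.txt    # second line is 16400 characters long
awk '{ if (length($0) < 16384) print }' sample.txt    # prints only "short line"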
Solution 2:
This is similar to Ansgar's answer, but slightly faster in my tests:
awk 'length($0) < 16384' infile >outfile
It relies on the implicit print of a true expression, but doesn't need to take the time to split the line as Ansgar's does.
Note that AWK gives you an if for free. The command above is equivalent to:
awk 'length($0) < 16384 {print}' infile >outfile
There's no explicit if (or its surrounding set of curly braces) as in some of the other answers.
Here is a way to do it in sed:
sed '/.\{16384\}/d' infile >outfile
or:
sed -r '/.{16384}/d' infile >outfile
both of which delete any line that contains 16384 (or more) characters.
For completeness, here's how you'd use sed to keep only the lines at or above your threshold (16384 or more characters):
sed '/^.\{0,16383\}$/d' infile >outfile
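If you actually need both halves of the file, a single awk pass can write the short and long lines to separate outputs at once; this is just a sketch, and short.txt/long.txt are placeholder names:
awk 'length($0) < 16384 { print > "short.txt"; next } { print > "long.txt" }' infile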
Solution 3:
Not really different from the answers already given, but shorter still:
awk -F '' 'NF < 16384' infile >outfile
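Be aware that an empty field separator splitting each character into its own field (so that NF equals the line length) is GNU awk behaviour; POSIX leaves FS="" unspecified, so other awks may differ. A quick check, assuming a gawk-like awk:
echo abcde | awk -F '' '{ print NF }'    # gawk prints 5; another awk may print 1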
Solution 4:
You can use awk, for example:
$ awk '{ if (length($0) < 16384) { print } }' /path/to/text/file
This will print the lines shorter than 16K characters (16 * 1024 = 16384).
You can also use grep:
$ grep ".\{,16384\}" /path/to/text/file
This will print the lines at most 16K characters.
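Whichever variant you pick, you can double-check the result by looking at the longest line that made it through; wc -L is a GNU extension, and the awk fallback is a portable sketch (outfile is a placeholder name):
wc -L outfile    # length of the longest line (GNU wc only)
awk '{ if (length($0) > max) max = length($0) } END { print max + 0 }' outfile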