How to get only the unique results without having to sort data?
perl -ne 'print unless $seen{$_}++' data.txt
Or, if you must have a useless use of cat:
cat data.txt | perl -ne 'print unless $seen{$_}++'
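The %seen hash counts how often each line has appeared; the post-increment returns 0 (false) the first time a line shows up, so print runs exactly once per distinct line. To illustrate, here is a hypothetical data.txt, chosen to match the output shown further down:

usr@srv % cat data.txt
aaaaaa
cccccc
aaaaaa
bbbbbb
cccccc
usr@srv % perl -ne 'print unless $seen{$_}++' data.txt
aaaaaa
cccccc
bbbbbb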
Here's an awk translation, for systems that lack Perl:
awk '!seen[$0]++' data.txt
cat data.txt | awk '!seen[$0]++'
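It relies on the same trick: !seen[$0]++ is true only on a line's first occurrence, and an awk pattern with no action prints the line by default. Spelled out more verbosely (a sketch of the same logic, not a different method):

awk '{
    if (!seen[$0])    # first time we meet this exact line?
        print $0      # then print it
    seen[$0]++        # and remember it for next time
}' data.txt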
john has a tool called unique:
usr@srv % cat data.txt | unique out
usr@srv % cat out
aaaaaa
cccccc
bbbbbb
Achieving the same without additional tools in a single command line is a bit more complex:
usr@srv % cat data.txt | nl | sort -k 2 | uniq -f 1 | sort -n | sed 's/\s*[0-9]\+\s\+//'
aaaaaa
cccccc
bbbbbb
nl prints line numbers in front of the lines; since the sort/uniq steps carry those numbers along, the final sort -n can restore the original order of the lines. sed just deletes the line numbers afterwards ;)
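Using the same hypothetical data.txt as above, the intermediate stages look roughly like this (whitespace approximated; nl separates the number and the line with a tab):

usr@srv % cat data.txt | nl
     1  aaaaaa
     2  cccccc
     3  aaaaaa
     4  bbbbbb
     5  cccccc
usr@srv % cat data.txt | nl | sort -k 2
     1  aaaaaa
     3  aaaaaa
     4  bbbbbb
     2  cccccc
     5  cccccc
usr@srv % cat data.txt | nl | sort -k 2 | uniq -f 1
     1  aaaaaa
     4  bbbbbb
     2  cccccc

Note that when two lines have the same content, sort's last-resort comparison on the whole line puts the lower line number first, so uniq -f 1 (which skips the number field when comparing) keeps the first occurrence of each line.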
I prefer to use this:
cat -n data.txt | sort --key=2.1 -b -u | sort -n | cut -c8-
cat -n adds line numbers,
sort --key=2.1 -b -u sorts on the second field (after the added line numbers), ignoring leading blanks, keeping only unique lines,
sort -n sorts in strict numeric order, and
cut -c8- keeps all characters from column 8 to EOL (i.e., omits the line numbers we included).
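The -c8- arithmetic works because cat -n prints the number right-justified in a 6-character field followed by a tab, so (counting the tab as one character) the original content starts at column 8. With the same hypothetical data.txt:

usr@srv % cat -n data.txt
     1  aaaaaa
     2  cccccc
     3  aaaaaa
     4  bbbbbb
     5  cccccc
usr@srv % cat -n data.txt | sort --key=2.1 -b -u | sort -n | cut -c8-
aaaaaa
cccccc
bbbbbb

Two caveats: sort -u outputs only the first of a run of lines that compare equal, which with GNU sort keeps the earliest input line and so preserves the first occurrence, and cut -c8- assumes the number field never grows past 6 characters (i.e., fewer than a million input lines).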