How to get only the unique results without having to sort data?
perl -ne 'print unless $seen{$_}++' data.txt
Or, if you must have a useless use of cat:
cat data.txt | perl -ne 'print unless $seen{$_}++'
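The %seen hash counts how often each line has appeared; the post-increment returns 0 (false) the first time a line shows up, so print runs exactly once per distinct line. To illustrate, here is a hypothetical data.txt, chosen to match the output shown further down:

usr@srv % cat data.txt
aaaaaa
cccccc
aaaaaa
bbbbbb
cccccc
usr@srv % perl -ne 'print unless $seen{$_}++' data.txt
aaaaaa
cccccc
bbbbbb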
Here's an awk translation, for systems that lack Perl:
awk '!seen[$0]++' data.txt
cat data.txt | awk '!seen[$0]++'
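It relies on the same trick: !seen[$0]++ is true only on a line's first occurrence, and an awk pattern with no action prints the line by default. Spelled out more verbosely (a sketch of the same logic, not a different method):

awk '{
    if (!seen[$0])    # first time we meet this exact line?
        print $0      # then print it
    seen[$0]++        # and remember it for next time
}' data.txt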
john has a tool called unique:
usr@srv % cat data.txt | unique out
usr@srv % cat out
aaaaaa
cccccc
bbbbbb
Achieving the same without additional tools in a single command line is a bit more complex:
usr@srv % cat data.txt | nl | sort -k 2 | uniq -f 1 | sort -n | sed 's/\s*[0-9]\+\s\+//'
aaaaaa
cccccc
bbbbbb
nl prints line numbers in front of the lines; since the sort/uniq steps carry those numbers along, the final sort -n can restore the original order of the lines. sed just deletes the line numbers afterwards ;)
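Using the same hypothetical data.txt as above, the intermediate stages look roughly like this (whitespace approximated; nl separates the number and the line with a tab):

usr@srv % cat data.txt | nl
     1  aaaaaa
     2  cccccc
     3  aaaaaa
     4  bbbbbb
     5  cccccc
usr@srv % cat data.txt | nl | sort -k 2
     1  aaaaaa
     3  aaaaaa
     4  bbbbbb
     2  cccccc
     5  cccccc
usr@srv % cat data.txt | nl | sort -k 2 | uniq -f 1
     1  aaaaaa
     4  bbbbbb
     2  cccccc

Note that when two lines have the same content, sort's last-resort comparison on the whole line puts the lower line number first, so uniq -f 1 (which skips the number field when comparing) keeps the first occurrence of each line.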
I prefer to use this:
cat -n data.txt | sort --key=2.1 -b -u | sort -n | cut -c8-
cat -n adds line numbers,
sort --key=2.1 -b -u sorts on the second field (after the added line numbers), ignoring leading blanks, keeping only unique lines,
sort -n sorts in strict numeric order, and
cut -c8- keeps all characters from column 8 to EOL (i.e., omits the line numbers we included).
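The -c8- arithmetic works because cat -n prints the number right-justified in a 6-character field followed by a tab, so (counting the tab as one character) the original content starts at column 8. With the same hypothetical data.txt:

usr@srv % cat -n data.txt
     1  aaaaaa
     2  cccccc
     3  aaaaaa
     4  bbbbbb
     5  cccccc
usr@srv % cat -n data.txt | sort --key=2.1 -b -u | sort -n | cut -c8-
aaaaaa
cccccc
bbbbbb

Two caveats: sort -u outputs only the first of a run of lines that compare equal, which with GNU sort keeps the earliest input line and so preserves the first occurrence, and cut -c8- assumes the number field never grows past 6 characters (i.e., fewer than a million input lines).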