Select unique or distinct values from a list in UNIX shell script

With zsh you can do this:

% cat infile 
tar
more than one word
gz
java
gz
java
tar
class
class
zsh-5.0.0[t]% print -l "${(fu)$(<infile)}"
tar
more than one word
gz
java
class

Or you can use AWK:

% awk '!_[$0]++' infile    
tar
more than one word
gz
java
class

With AWK you can do:

 ./yourscript.ksh | awk '!a[$0]++'

I find it faster than sort and uniq

./script.sh | sort -u

This is the same as monoxide's answer, but a bit more concise.

You might want to look at the uniq and sort applications.

./yourscript.ksh | sort | uniq

(FYI, yes, the sort is necessary in this command line, uniq only strips duplicate lines that are immediately after each other)

EDIT:

Contrary to what has been posted by Aaron Digulla in relation to uniq's commandline options:

Given the following input:

class
jar
jar
jar
bin
bin
java

uniq will output all lines exactly once:

class
jar
bin
java

uniq -d will output all lines that appear more than once, and it will print them once:

jar
bin

uniq -u will output all lines that appear exactly once, and it will print them once:

class
java

Related