What is the point of uniq -u and what does it do?
This ought to be easy to test:
$ cat file
1
2
3
3
4
4
$ uniq file
1
2
3
4
$ uniq -u file
1
2
In short, uniq with no options removes all but one instance of consecutively duplicated lines. The GNU uniq manual formulates that as
With no options, matching lines are merged to the first occurrence.
while POSIX says
[...] write one copy of each input line on the output. The second and succeeding copies of repeated adjacent input lines shall not be written.
With the -u option, it removes all instances of consecutively duplicated lines, and leaves only the lines that have no identical line directly before or after them. The GNU uniq manual says
only print unique lines
and POSIX says
Suppress the writing of lines that are repeated in the input.
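Since only adjacent lines are compared, duplicates that are separated by other lines go unnoticed. Here is a minimal sketch of that, using made-up input and the usual idiom of sorting first (the variable names are only for illustration):

```shell
# Made-up input: "3" occurs twice, but the two copies are not adjacent.
input=$(printf '3\n1\n3\n2\n')

# uniq -u compares neighbours only, so every line counts as unique here.
unsorted=$(printf '%s\n' "$input" | uniq -u)

# Sorting first makes equal lines adjacent, so both copies of "3" are dropped.
sorted=$(printf '%s\n' "$input" | sort | uniq -u)

printf 'without sort:\n%s\n' "$unsorted"   # 3, 1, 3, 2
printf 'with sort:\n%s\n' "$sorted"        # 1, 2
```

Whether sorting is acceptable depends on whether the original line order matters for your data.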
From uniq(1):
NAME
    uniq - report or omit repeated lines
DESCRIPTION
    ...
    With no options, matching lines are merged to the first occurrence.
    ...
    -u, --unique
        only print unique lines
If we try it out we see:
$ cat file
cat
dog
dog
bird
$ uniq file
cat
dog
bird
$ uniq -u file
cat
bird
You can see that uniq prints the first instance of a duplicated line. uniq -u does not print any duplicated lines.
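To make the rule concrete, here is a rough awk sketch of what uniq -u appears to do: track the length of each run of identical adjacent lines and print a line only when its run has length 1. The function name uniq_u_sketch is hypothetical, and this is an illustration of the behaviour, not how any real uniq is implemented.

```shell
# Hypothetical helper (not part of uniq): print lines whose run length is 1.
uniq_u_sketch() {
  awk '
    # A new run starts: emit the previous line if its run had length 1.
    # Appending "" forces a string comparison, so "1" and "1.0" differ.
    NR > 1 && ($0 "") != (prev "") { if (count == 1) print prev; count = 0 }
    { prev = $0; count++ }
    # Flush the final run.
    END { if (NR > 0 && count == 1) print prev }
  '
}

printf 'cat\ndog\ndog\nbird\n' | uniq_u_sketch
# prints:
# cat
# bird
```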
Considering the original poster's comment to the accepted answer, I believe that a different example may be useful to illustrate the difference and the point of the commands.
Let's say we have a portion of text whose lines are, for some reason, separated by duplicate empty lines, with a single empty line at the beginning and at the end:
$ cat declaration_quote.txt

We hold these truths to be self-evident, that all men are created equal, that


they are endowed by their Creator with certain unalienable Rights, that among


these are Life, Liberty and the pursuit of Happiness.

If you decide that one empty line is enough spacing, you can use uniq to get
- each line which is not repeated immediately above or below (here, the lines with text and the single empty lines at the beginning and the end) and
- one line from each group of adjacent repeated lines (here, the empty lines, except for the one at the beginning and the one at the end).
It is not "everything only once", but rather "once from each continuous group": you get a separate empty line from each group of empty lines, which is already more than once. The empty lines at the beginning and the end also stay, because there are no empty lines immediately above or below them.
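The "once from each continuous group" idea can be made visible with the standard -c option, which prefixes each output line with the size of its group. The sketch below rebuilds a stand-in for the sample text with printf; the placeholder text lines and the assumption of exactly two empty lines between them are mine, as is the function name sample.

```shell
# Stand-in for declaration_quote.txt (assumed layout): one empty line at the
# start and end, and two empty lines between the three text lines.
sample() {
  printf '\n%s\n\n\n%s\n\n\n%s\n\n' 'first text line' 'second text line' 'third text line'
}

# Keep only the count column of uniq -c and join the counts on one line.
counts=$(sample | uniq -c | awk '{ print $1 }' | paste -sd' ' -)
echo "$counts"   # 1 1 2 1 2 1 1
```

Every group, whatever its size, yields exactly one line with plain uniq; with uniq -u only the groups counted as 1 would survive.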
$ uniq declaration_quote.txt

We hold these truths to be self-evident, that all men are created equal, that

they are endowed by their Creator with certain unalienable Rights, that among

these are Life, Liberty and the pursuit of Happiness.

If you decide that you do not need such double spacing at all, you can use uniq -u to get only the lines which are not repeated immediately above or below. But it is still not "only things that appear once", because it will not remove the single empty lines (at the beginning and at the end), even though there are many other empty lines in the text. It is rather "only things not repeated immediately".
$ uniq -u declaration_quote.txt

We hold these truths to be self-evident, that all men are created equal, that
they are endowed by their Creator with certain unalienable Rights, that among
these are Life, Liberty and the pursuit of Happiness.

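If you want to check this without eyeballing blank lines, counting them works too. This is only a sketch on a stand-in file: the placeholder text, the function name sample, and the assumption of two empty lines between text lines are all mine.

```shell
# Stand-in for declaration_quote.txt (assumed layout): one empty line at the
# start and end, and two empty lines between the three text lines.
sample() {
  printf '\n%s\n\n\n%s\n\n\n%s\n\n' 'first text line' 'second text line' 'third text line'
}

# Count the empty lines in the input and in each kind of output.
blank_in=$(sample | grep -c '^$')
blank_uniq=$(sample | uniq | grep -c '^$')
blank_uniq_u=$(sample | uniq -u | grep -c '^$')

echo "input: $blank_in, uniq: $blank_uniq, uniq -u: $blank_uniq_u"
# input: 6, uniq: 4, uniq -u: 2
```

The two empty lines that survive uniq -u are the single ones at the beginning and the end.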