UNIX Uniq Uknow?
GNU sed, 485 + 2 (rn flags) = 487 bytes
1h;1b;$bb;2{:r;$!N;$!br;s/.*/echo "&"|sort/e};:b;x;/[dD]/bd;/c/bc;/u/!b;x;s/^/\n/
s/(\n.*)\1+//g;s/\n//p;b;:d;x;/^(.*)\n\1/!D;:l;x;/D/{x;P;x};x;s/^[^\n]*\n//
/^(.*)\n\1/bl;P;D;:c;s/$/#/;G;x;/^$/bp;s/[^\n]*\n?//;x;s/#(\n[^\n]*).*/\1/
/([^:\n]*)\n\1$/{s/\n[^\n]*$//;:i;s/9(@*:)/@\1/;ti;s/8(@*:)/9\1/;s/7(@*:)/8\1/
s/6(@*:)/7\1/;s/5(@*:)/6\1/;s/4(@*:)/5\1/;s/3(@*:)/4\1/;s/2(@*:)/3\1/;s/1(@*:)/2\1/
s/0(@*:)/1\1/;s/@+:/1&/;y/@/0/;bc};s/:/ /;s/[^\n]*$/1:&/;bc;:p;x;s/...//;s/..$//;s/:/ /gp
The code size is not going to compete, but it was fun emulating the uniq
command. I wanted to add a simple selection sort to get rid of the sort
call and solve it in pure sed, but the code was already too long. The input is read entirely from STDIN, with the flag expected on the first line.
Run:
(echo "-c";cat input_file)|sed -rnf uniq_emulator.sed
Output:
4 apple pie
1 dog
2 hello world
1 zebra
Initially submitted code and explanation: (sections are separated by an extra newline)
# The main
# store the flag in hold space and start next cycle
1h;1b
# usually a cycle starts with each line read, but I also start it manually,
# so I need to skip some lines that should run only once
$bb
2{
# read the entire input into pattern space
:r;$!N;$!br
# executes a shell call to sort the pattern space
s/.*/echo "&"|sort/e
}
# based on the flag, the appropriate jumps to "functions" are made
:b;x;/[dD]/bd;/c/bc;/u/!b
# Unique: -u
# all duplicate lines are deleted and what remains is printed
x;s/^/\n/;s/(\n.*)\1+//g
s/\n//p;b
# Duplication: -d and -D
# if the first 2 lines are different, delete the first and start new cycle
:d;x;/^(.*)\n\1/!D
:l
# if -D, print the first duplicate line
x;/D/{x;P;x};x
# otherwise (-d), delete the first duplicate line
s/^[^\n]*\n//
# loop condition, go back to :l if the first 2 lines are duplicate
/^(.*)\n\1/bl
# print only the first line and start a new cycle
P;D
# Counting: -c
# in order to count line occurrences, I simulate reading the input again.
# In a loop I append the whole pattern space to the hold space, delete the
# first line of the pattern space, and delete from the hold space every
# appended line except the first one.
:c;s/$/#/;G
x;/^$/bp;s/[^\n]*\n?//;x;s/#(\n[^\n]*).*/\1/
# if the new line "read" is the same as the previous one, increment the
#counter of that line (format is "counter:line"), delete and go back to :c
/([^:\n]*)\n\1$/{s/\n[^\n]*$//;bi;:I;bc}
# otherwise, stop the counter for the previous line, add a counter for the
#current one and go back to :c
s/:/ /;s/[^\n]*$/1:&/;bc
# when the loop ends, the hold space with all counts is printed
:p;x;s/...//;s/..$//;s/:/ /gp
# Increment function that works by manually updating the changing digit(s)
:i;s/9(@*:)/@\1/;ti
s/8(@*:)/9\1/;s/7(@*:)/8\1/;s/6(@*:)/7\1/
s/5(@*:)/6\1/;s/4(@*:)/5\1/;s/3(@*:)/4\1/
s/2(@*:)/3\1/;s/1(@*:)/2\1/;s/0(@*:)/1\1/
s/@+:/1&/;y/@/0/;tI
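The increment "function" above can be easier to follow outside sed. Below is a minimal Python sketch of the same idea: trailing 9s become a carry placeholder `@`, the next digit is bumped by one, and the placeholders are turned into zeros at the end. The function name `increment` and the `counter:line` entry argument are hypothetical names for this illustration. As a choice of this sketch, the leading 1 is added only when every digit has carried, and the `@` to `0` translation is limited to the counter part.

```python
import re

def increment(entry: str) -> str:
    """Increment the decimal counter in a 'counter:line' entry using text rewrites only."""
    counter, _, text = entry.partition(":")
    counter += ":"
    # Step 1 (the ":i" loop): each trailing 9 becomes the carry placeholder '@'.
    while re.search(r"9@*:", counter):
        counter = re.sub(r"9(@*:)", r"@\1", counter, count=1)
    # Step 2: bump the digit just left of the placeholders, if there is one.
    for digit in "876543210":
        bumped, hits = re.subn(rf"{digit}(@*:)", rf"{int(digit) + 1}\1", counter, count=1)
        if hits:
            counter = bumped
            break
    else:
        # Step 3: every digit carried (e.g. 99), so a new leading 1 is needed.
        counter = re.sub(r"^(@+:)", r"1\1", counter, count=1)
    # Step 4: placeholders become zeros (sed: y/@/0/), only inside the counter.
    return counter.replace("@", "0") + text

print(increment("9:apple pie"))    # 10:apple pie
print(increment("199:apple pie"))  # 200:apple pie
```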
R, 128 bytes
Using a different method than @Billywob and implemented as an unnamed function
function(s,o){R=rle(sort(s));cat(switch(o,'-d'=R$v[R$l>1],'-D'=s[s%in%R$v[R$l>1]],'-c'=paste(R$l,R$v),'-u'=R$v[R$l<2]),sep='
')}
Ungolfed
function(s,o){
R=rle(sort(s)); # do the run length encoding for the sorted input
cat( # output to STDOUT
switch(o # switch on option to provide output lines
,'-d'=R$v[R$l>1] # RLE values with length > 1
,'-D'=s[s%in%R$v[R$l>1]] # string values in RLE values with length > 1
,'-c'=paste(R$l,R$v) # RLE length and values pasted together
,'-u'=R$v[R$l<2] # RLE values with length < 2, i.e. 1
)
,sep='
' # set separator to a newline
)
}
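For readers who do not use R, the run-length-encoding approach can be sketched in Python as follows. This is only an illustration of the technique, not a translation of the golfed answer: `uniq` and its parameters are hypothetical names, and `itertools.groupby` on the sorted input stands in for `rle(sort(s))`. As in the R code, `-D` filters the original, unsorted input.

```python
from itertools import groupby

def uniq(lines, option):
    """Emulate uniq -c/-d/-D/-u via run-length encoding of the sorted input."""
    # One (value, run length) pair per distinct line, in sorted order.
    runs = [(value, len(list(group))) for value, group in groupby(sorted(lines))]
    repeated = {value for value, length in runs if length > 1}
    if option == "-c":
        return [f"{length} {value}" for value, length in runs]
    if option == "-d":
        return [value for value, length in runs if length > 1]
    if option == "-D":
        # Keeps the input order, like s[s %in% ...] in the R version.
        return [line for line in lines if line in repeated]
    if option == "-u":
        return [value for value, length in runs if length < 2]
    raise ValueError(f"unknown option: {option}")

s = ["hello world", "apple pie", "zebra", "dog",
     "hello world", "apple pie", "apple pie", "apple pie"]
print("\n".join(uniq(s, "-c")))
```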
Test Run
> s = c("hello world", "apple pie", "zebra", "dog", "hello world", "apple pie", "apple pie", "apple pie")
> f =
function(s,o){R=rle(sort(s));cat(switch(o,'-d'=R$v[R$l>1],'-D'=s[s%in%R$v[R$l>1]],'-c'=paste(R$l,R$v),'-u'=R$v[R$l<2]),sep='
')}
> f(s, "-c")
4 apple pie
1 dog
2 hello world
1 zebra
> f(s, "-d")
apple pie
hello world
> f(s, "-D")
hello world
apple pie
hello world
apple pie
apple pie
apple pie
> f(s, "-u")
dog
zebra
>
Perl, 92 + 4 (-sn flags) = 96 bytes
$h{$_}++}{$v=$h{$_},print$c?"$v $_":($D||$d)&&$v>1||$u&&$v<2?$_ x($d?1:$v):''for sort keys%h
This code needs to be in a file to run, and it needs the -sn flags as well as -M5.010 (free).
For instance:
$ cat input.txt
hello world
apple pie
zebra
dog
hello world
apple pie
apple pie
apple pie
$ cat prog.pl
$h{$_}++}{$v=$h{$_},print$c?"$v $_":($D||$d)&&$v>1||$u&&$v<2?$_ x($d?1:$v):''for sort keys%h
$ perl -M5.010 -sn prog.pl -c < input.txt
4 apple pie
1 dog
2 hello world
1 zebra
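The Perl answer counts lines in a hash and then walks the sorted keys, so it never needs sorted input. A rough Python sketch of that idea is shown below; `uniq_by_hash` and the single-letter `flag` argument are hypothetical names for this illustration, and `collections.Counter` plays the role of `%h`.

```python
from collections import Counter

def uniq_by_hash(lines, flag):
    """Count every line in a hash, then emit output per sorted key based on the flag."""
    counts = Counter(lines)              # like $h{$_}++ for each input line
    out = []
    for line in sorted(counts):          # like: for sort keys %h
        n = counts[line]
        if flag == "c":
            out.append(f"{n} {line}")
        elif flag in ("d", "D") and n > 1:
            # -d prints a repeated line once, -D prints it n times.
            out.extend([line] * (1 if flag == "d" else n))
        elif flag == "u" and n < 2:
            out.append(line)
    return out

lines = ["hello world", "apple pie", "zebra", "dog",
         "hello world", "apple pie", "apple pie", "apple pie"]
print("\n".join(uniq_by_hash(lines, "c")))
```

Because the keys are sorted before printing, `D` yields the repeated lines grouped together rather than in input order.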