Remove duplicate lines without sorting
A late answer - I just ran into a duplicate of this - but perhaps worth adding...
The principle behind @1_CR's answer can be written more concisely, using cat -n instead of awk to add line numbers:
cat -n file_name | sort -uk2 | sort -n | cut -f2-
- Use cat -n to prepend line numbers
- Use sort -u to remove duplicate data (-k2 says 'start at field 2 for sort key')
- Use sort -n to sort by prepended number
- Use cut to remove the line numbering (-f2- says 'select field 2 till end')
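As a rough illustration of the four steps above, the pipeline can be wrapped in a small shell function and tried on a sample file. The function name dedupe_keep_order and the sample data are made up for this sketch, and it assumes GNU coreutils, where sort -u keeps the first line of each run of equal keys.

```shell
# Hypothetical helper (name is illustrative): drop repeated lines from the
# file given as $1 while keeping the surviving lines in their original order.
dedupe_keep_order() {
    cat -n "$1" |   # prepend line numbers, separated from the text by a tab
    sort -uk2 |     # one line per distinct content (key starts at field 2)
    sort -n |       # put the survivors back in line-number order
    cut -f2-        # strip the number, keeping field 2 to the end
}

# Made-up sample: "b" and "a" each appear twice.
sample=$(mktemp)
printf 'b\na\nb\nc\na\n' > "$sample"
dedupe_keep_order "$sample"
rm -f "$sample"
```

With GNU sort this should print b, a and c, each on its own line: the later copies of "b" and "a" are dropped and the first occurrences keep their positions.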
The UNIX Bash Scripting blog suggests:
awk '!x[$0]++'
This command tells awk which lines to print. The variable $0 holds the entire contents of a line, and square brackets are array access. So, for each line of the file, the node of the array x is incremented, and the line is printed if the content of that node was not (!) previously set.
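To make the one-liner's logic easier to follow, here is the same test written out as an explicit if statement. It is only an illustrative expansion, not something taken from the blog; the function name dedupe_awk_verbose, the array name seen and the sample input are invented here.

```shell
# Expanded form of awk '!x[$0]++'; the function and array names are illustrative.
dedupe_awk_verbose() {
    awk '{
        if (!seen[$0])   # unset entries count as 0, so this is true on first sight
            print        # print the line the first time it appears
        seen[$0]++       # record it so later copies are skipped
    }' "$@"
}

# Made-up sample input; the one-liner is run on the same data for comparison.
printf 'b\na\nb\nc\na\n' | dedupe_awk_verbose
printf 'b\na\nb\nc\na\n' | awk '!x[$0]++'
```

Both commands should print b, a and c, in that order, since only the first copy of each line passes the test.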