How to run the ‘:sort u’ command in Vim on a CSV table, but only use the values in a particular column as sorting keys?
Since it is not possible to achieve the transformation in question in
one run of the :sort command, let us approach it as a two-step process.
1. The first step is sorting lines by the values of the second column
(separated from the first one by a comma). In order to do that, we can
use the :sort command, passing a regular expression that matches the
first column and the following comma:
:sort/^[^,]*,/
As :sort compares the text starting just after the match of the
specified pattern on each line, it gives us the desired sorting
behavior. To compare the values numerically rather than
lexicographically, use the n flag:
:sort n/^[^,]*,/
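As a small worked example of this step, consider the four-line buffer shown in the comments below. The data is made up for illustration and is not from the question.

```vim
" Sample buffer before sorting (hypothetical data):
"   a,3,x
"   b,1,y
"   c,3,z
"   d,2,w
:sort/^[^,]*,/
" Buffer afterwards, ordered by the text that follows the first comma
" ("1,y" < "2,w" < "3,x" < "3,z"):
"   b,1,y
"   d,2,w
"   a,3,x
"   c,3,z

" If the second column held 10 and 9 instead, the command above would put
" 10 first, because it compares "10,..." and "9,..." as text. With the n
" flag the values are compared as numbers, so 9 comes before 10:
:sort n/^[^,]*,/
```

Note that the sort key is everything after the first comma, not the second column alone, so lines with equal second-column values are ordered among themselves by the columns that follow.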
2. The second step involves running through the sorted lines and removing
all lines but one in every block of consecutive lines with the same
value in the second column. It is convenient to build our implementation
upon the :global command, which executes a given Ex command on every
line matching a certain pattern. For our purposes, a line can be
deleted if it contains the same value in the second column as the
following line. This formalization, together with the assumptions that
commas cannot occur within column values and that the second column is
always followed by at least one more column, gives us the following
pattern:
^[^,]*,\([^,]*\),.*\n[^,]*,\1,.*
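If we run the :delete command on every line that satisfies this
pattern, going from top to bottom over them in sorted order, we will
have only a single line for every distinct value in the second column.
(In the command below, d_ is short for :delete _, which deletes into
the black hole register so that no other register is overwritten.)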
:g/^[^,]*,\([^,]*\),.*\n[^,]*,\1,.*/d_
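To see what this step does, here is a small worked example on an already sorted buffer. The data in the comments is made up for illustration.

```vim
" Sorted sample buffer (hypothetical data), before deduplication:
"   b,1,y
"   d,2,w
"   a,3,x
"   c,3,z
:g/^[^,]*,\([^,]*\),.*\n[^,]*,\1,.*/d_
" Only a,3,x matches the pattern, because the line after it has the same
" value (3) in the second column. The last line of each block is kept:
"   b,1,y
"   d,2,w
"   c,3,z
```

Since :global first marks all matching lines and only then executes the command on them, every line of a block except its last one is deleted, however long the block is.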
3. Finally, the two steps can be combined into a single Ex command, with
the commands separated by a bar:
:sort/^[^,]*,/|g/^[^,]*,\([^,]*\),.*\n[^,]*,\1,.*/d_
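If a column other than the second one is needed, the same two steps can be wrapped in a function. The following is only a sketch under the assumptions stated above (no commas inside values, and at least one more column after the key column); the name UniqByColumn and its 1-based column argument are hypothetical and are not part of Vim.

```vim
" Hypothetical helper: sort the whole buffer by column a:col (1-based) of a
" comma-separated table and keep one line per distinct value in that column.
" Assumes that values contain no commas and that column a:col is followed
" by at least one more column.
function! UniqByColumn(col) abort
  " Matches the (a:col - 1) columns, with their commas, that precede the key.
  let l:prefix = '\%([^,]*,\)\{' . (a:col - 1) . '}'
  " Step 1: sort on the text that follows the skipped prefix.
  execute 'sort /^' . l:prefix . '/'
  " Step 2: delete each line whose key column equals that of the next line.
  execute 'global/^' . l:prefix . '\([^,]*\),.*\n' . l:prefix . '\1,/delete _'
endfunction
```

With this definition, :call UniqByColumn(2) should behave like the combined command above. The prefix uses the non-capturing group \%(...\) so that \1 still refers to the key column.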