Split line into key-value pairs based on first string
Here goes
awk '{for (i=2; i<=NF; ++i) print $1, $i}' file
A B
A C
1 2
1 3
1 4
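The same loop can be spelled out with comments. This is only a sketch: the wrapper name split_pairs is made up for illustration, and it assumes fields are separated by whitespace (awk's default). A line that holds only a key prints nothing, because the loop body never runs.

```shell
# split_pairs is a hypothetical wrapper name, not part of the original answer.
split_pairs() {
  awk '{
    # $1 is the key; every later field becomes its own "key value" line.
    for (i = 2; i <= NF; ++i)
      print $1, $i
  }' "$@"
}

# Reads stdin when no file name is given.
printf '%s\n' 'A B C' '1 2 3 4' | split_pairs
```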
printf %s\\n 'A B C' '1 2 3 4'|
sed -e's/\([^ ]*\)  *[^ ]*/&\n\1/;//P;D'
A B
A C
1 2
1 3
1 4
That works. It selects the first two sequences of zero or more not-space characters which are separated by one or more spaces. The first such sequence is referenced in \1 and the whole selection in &. The selection is replaced with itself followed by a newline (\n) and then \1. Pattern space is then printed with P up to the first occurring newline, and then the same portion is deleted with D before the pattern space is recycled to the top of the script with what remains.
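Laid out one command per line, the script might look like the sketch below. It assumes GNU sed (for \n in the replacement and for comments inside the script) and a separator of one or more spaces, written as two spaces followed by *. The variable name pairs is only there so the result can be inspected.

```shell
# Assumes GNU sed; "pairs" is just an illustrative variable name.
pairs=$(printf '%s\n' 'A B C' '1 2 3 4' |
sed -e '
  # Append a newline and a copy of the key after the first "key value" pair.
  s/\([^ ]*\)  *[^ ]*/&\n\1/
  # The empty regex // reuses the last regex: print up to the newline
  # only when a pair is present.
  //P
  # Delete the printed part and rerun the script on what is left.
  D
')
printf '%s\n' "$pairs"
```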
You can see what it does with the l (look) command. Replace the P with l and put another l before the s/// substitution:
A B C$
A B\nA C$
A C$
A C\nA$
A$
1 2 3 4$
1 2\n1 3 4$
1 3 4$
1 3\n1 4$
1 4$
1 4\n1$
1$
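For reference, a command along these lines should reproduce that trace. It is a sketch assuming GNU sed, with the P swapped for l and an extra l placed in front of the substitution; the variable name trace is arbitrary.

```shell
# Assumes GNU sed. l prints pattern space unambiguously: \n for an
# embedded newline and $ at the end of the buffer.
trace=$(printf '%s\n' 'A B C' '1 2 3 4' |
sed -e 'l;s/\([^ ]*\)  *[^ ]*/&\n\1/;//l;D')
printf '%s\n' "$trace"
```

Each pair of lines in the trace shows pattern space before and after the substitution; the lone key left at the end of each input line is looked at once and then discarded by D.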
printf %s\\n 'A B C' '1 2 3 4'|
sed -ne:t -e'/  *[^ ]*/{s//\n&/2;P;s///;}' -ett
A B
A C
1 2
1 3
1 4
It matches a pattern space with at least one sequence of space characters and any trailing not-spaces. The first substitution inserts a newline before the second occurrence of such a sequence, then P prints up to the newline, and the second substitution removes the first occurrence of that pattern, which will also now include the newline the first one appended to the tail of that sequence when operating on the second. The t (test) command branches back to the :t label each time a substitution occurs, and so sed eats pattern space a space-separated field at a time.
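The loop described above can be written out one command per line, roughly as follows. This is a sketch that assumes GNU sed and a field pattern of one or more spaces followed by not-spaces (two spaces, then *); the variable name looped is arbitrary.

```shell
# Assumes GNU sed; "looped" is just an illustrative variable name.
looped=$(printf '%s\n' 'A B C' '1 2 3 4' |
sed -n -e '
  :t
  /  *[^ ]*/{
    # Put a newline in front of the second " value" field, if there is one.
    s//\n&/2
    # Print "key value", that is, everything up to that newline.
    P
    # Remove the first " value" field; [^ ]* also swallows the newline.
    s///
  }
  # Branch back to :t whenever a substitution was made.
  t t
')
printf '%s\n' "$looped"
```

On the last field of a line the first substitution fails, so P prints all of pattern space, the second substitution leaves just the key, and the next pass finds nothing to match.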
With l (look) again:
A B C$
A B\n C$
A C$
A C$
1 2 3 4$
1 2\n 3 4$
1 3 4$
1 3\n 4$
1 4$
1 4$
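That trace should come from a variation like the one below, again with P swapped for l and another l placed before the first substitution. It is a sketch assuming GNU sed, and the variable name trace2 is arbitrary.

```shell
# Assumes GNU sed. Each pass prints pattern space before and after the
# newline is inserted; the final field shows up twice because the
# first substitution has no second occurrence to act on.
trace2=$(printf '%s\n' 'A B C' '1 2 3 4' |
sed -n -e ':t' -e '/  *[^ ]*/{l;s//\n&/2;l;s///;}' -e 't t')
printf '%s\n' "$trace2"
```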