How to add some symbol (or just add newline) if the numbers in the text are not continuous
Using grep or sed for this is not recommended: grep can't count, and sed makes any kind of arithmetic really difficult (it would have to be regular-expression-based counting, a non-starter for all but the most dedicated).
$ awk -F '[<>]' '{ while ($2 >= ++nr) print "---"; print }' file
A<0>
A<1>
A_D2<2>
A_D2<3>
A<4>
---
A_D2<6>
---
---
A<9>
A_D2<10>
---
---
A<13>
The awk code assumes that 0 should be the first number, and then maintains the wanted line number for the current line in the variable nr. If a number is read from the input that requires one or several lines to be inserted, this is done by the while loop (which also increments the nr variable).
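If the numbering in a file starts somewhere other than 0, the counter could be seeded from the command line instead of relying on awk's default of zero. The following is only a minimal sketch under that assumption; the function name fill_gaps and its start argument are made up for illustration and are not part of the answer above.

```shell
# fill_gaps START: hypothetical wrapper around the one-liner above.
# Reads lines like "A<3>" on stdin and prints "---" for every missing number,
# assuming the first expected number is START (default 0).
fill_gaps() {
    awk -F '[<>]' -v nr="${1:-0}" '{ while ($2 >= ++nr) print "---"; print }'
}

printf 'A<1>\nA<3>\n' | fill_gaps 1
```

With a start value of 1, the two input lines come out with a single --- between them, standing in for the missing number 2.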
The number in <...> is parsed out by specifying that < and > should be used as field delimiters. The number is then in $2 (the 2nd field).
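To see the field splitting in isolation, a single line can be piped through awk with the same -F setting. This is just an illustrative sketch; the helper name show_fields is hypothetical.

```shell
# show_fields LINE: hypothetical helper that prints the two fields
# awk sees when < and > are both used as field delimiters.
show_fields() {
    printf '%s\n' "$1" | awk -F '[<>]' '{ printf "name=%s number=%s\n", $1, $2 }'
}

show_fields 'A_D2<6>'    # prints: name=A_D2 number=6
```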
This is probably far from efficient...
$ tr '<' '\t' < testfile | tr '>' ' ' \
| awk '{ while (NR + shift <= $2) { print "-----"; shift++ }; print }' \
| tr '\t' '<' \
| tr ' ' '>'
A<0>
A<1>
A_D2<2>
A_D2<3>
A<4>
-----
A_D2<6>
-----
-----
A<9>
A_D2<10>
-----
-----
A<13>
First, I use tr to get two tab-separated fields from the file.
Second, I use tr again to replace '>' with a space, because otherwise my awk command will fail :-/
The awk-professionals around here will likely laugh now :-)
Third, the awk command compares the number of rows processed (plus shift) to the second field. As long as that sum is less than or equal to the field, it prints the marker and increases shift, which is added to the number of rows in that comparison.
Fourth and fifth: I'm undoing the changes I previously made with tr.
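A likely reason the awk command fails when the '>' is left in place: a field such as 10> no longer looks like a number, so awk falls back to comparing strings instead of numbers. The sketch below illustrates that behaviour with a made-up helper, less_than_nine; it is not part of the pipeline above.

```shell
# less_than_nine VALUE: hypothetical helper. Puts VALUE in the second
# whitespace-separated field and prints 1 if awk considers it < 9, else 0.
less_than_nine() {
    printf 'A %s\n' "$1" | awk '{ print ($2 < 9) }'
}

less_than_nine '10'     # prints 0: numeric comparison, 10 is not below 9
less_than_nine '10>'    # prints 1: string comparison, "1" sorts before "9"
```

Splitting on both < and > with -F '[<>]' avoids the problem without the extra tr calls.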
I got some inspiration from https://unix.stackexchange.com/a/190707/364705
I'm not an awk guy, but this also seems to do it. I'm always open to improvements:
awk -F '[<>]' -v num=0 '
{
    while (num < $2) {
        print "----"
        num++
    }
    print $1 "<" $2 ">"
    num++
}' file
At first we set the field separator to match the characters < and >, so each line is split at these characters. For example the first line would be assigned to $1=A and $2=0.
Then we set variable num=0. We use it as the line counter: If the number of the current line $2 is greater than the line counter, print ----, increment the counter, and repeat until both values are equal. Then print $1<$2> and increment the counter.
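One possible tweak, offered only as a sketch: rebuilding the line from $1 and $2 drops anything that follows the closing >, so printing the unmodified line may be safer if the input can carry trailing text. The function name fill_gaps_keep_line is hypothetical, and the trailing text in the sample input is an assumption; the question's data has none.

```shell
# fill_gaps_keep_line: hypothetical variant of the script above that prints
# the original line unchanged instead of reassembling it from $1 and $2.
fill_gaps_keep_line() {
    awk -F '[<>]' -v num=0 '
    {
        while (num < $2) {   # one marker per missing number
            print "----"
            num++
        }
        print                # the whole input line, untouched
        num++
    }'
}

printf 'A<0> extra\nB<2>\n' | fill_gaps_keep_line
```

On this input the first line keeps its trailing text, and a single ---- appears before B<2> for the missing number 1.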