How to parse a file to extract 3-digit numbers listed under a "Group <number>" heading

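awk '
    # A "Group N" line starts a new section.
    $1 == "Group" {printf("\\section{%s%d}\n", $1, $2); next}
    # On any other line, print the first field that is exactly three digits.
    {for (i=1; i<=NF; i++)
        if ($i ~ /^[0-9][0-9][0-9]$/) {
            printf("\\Testdetails{%s}\n", $i)   # %s keeps a leading zero; %d would drop it
            break
        }
    }
'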
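
The question's input file is not reproduced in this answer, so here is a minimal sketch of the approach above run against a hypothetical sample. The sample lines and the wrapper name extract_tests are assumptions made for illustration only.

```shell
#!/bin/sh
# Hypothetical sample input: the real file layout from the question is not
# shown here, so these lines are an assumption for illustration.
sample='Group 0
Check voltage 101 pass
Check current 102 pass
Group 1
Check timing 305 fail'

# extract_tests is a made-up wrapper name; it reads stdin or file arguments.
extract_tests() {
    awk '
        $1 == "Group" {printf("\\section{%s%d}\n", $1, $2); next}
        {for (i=1; i<=NF; i++)
            if ($i ~ /^[0-9][0-9][0-9]$/) {
                printf("\\Testdetails{%s}\n", $i)   # first 3-digit field only
                break
            }
        }
    ' "$@"
}

printf '%s\n' "$sample" | extract_tests
```

On this sample it prints \section{Group0}, \Testdetails{101}, \Testdetails{102}, \section{Group1} and \Testdetails{305}, one per line. Lines without a standalone 3-digit field produce no output.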

Update based on a comment: this version also prints the words before the number as a \subsection title.

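awk '
    $1 == "Group" {printf("\\section{%s %d}\n", $1, $2); next}
    {
      # Collect the fields before the first 3-digit number as the title.
      title = sep = ""
      for (i=1; i<=NF; i++)
        if ($i ~ /^[0-9][0-9][0-9]$/) {
          printf("\\subsection{%s} \\Testdetails{%s}\n", title, $i)   # %s keeps a leading zero
          break
        }
        else {
          title = title sep $i
          sep = FS   # empty before the first word, then the field separator
        }
    }
'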
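
To make the title handling concrete, here is a small sketch of the updated program on hypothetical input. The sample lines and the wrapper name with_titles are assumptions, not taken from the question.

```shell
#!/bin/sh
# Hypothetical input lines; the words before the number become the title.
sample='Group 2
Supply ramp test 409 passed
Idle check 305'

# with_titles is a made-up wrapper name around the updated awk program.
with_titles() {
    awk '
        $1 == "Group" {printf("\\section{%s %d}\n", $1, $2); next}
        {
          title = sep = ""
          for (i=1; i<=NF; i++)
            if ($i ~ /^[0-9][0-9][0-9]$/) {
              printf("\\subsection{%s} \\Testdetails{%s}\n", title, $i)
              break
            }
            else {
              title = title sep $i   # append this word to the title
              sep = FS               # later words are joined by the field separator
            }
        }
    ' "$@"
}

printf '%s\n' "$sample" | with_titles
```

This prints \section{Group 2}, then \subsection{Supply ramp test} \Testdetails{409}, then \subsection{Idle check} \Testdetails{305}. Anything after the number (here "passed") is ignored because of the break.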

One way with Perl, using regular expressions and assuming infile has the content you posted in the question.

Content of script.pl:

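use warnings;
use strict;

while ( <> ) {
    chomp;
    # A "Group N" line becomes a section heading.
    if ( m/\A\s*(Group)\s*(\d+)/ ) {
        printf qq[\\Section{%s}\n], $1 . $2;
        next;
    }

    # Otherwise print the first 3-digit number that is preceded by
    # whitespace and followed by whitespace or the end of the line.
    if ( m/\s(\d{3})(?:\s|$)/ ) {
        printf qq[\\Testdetails{%s}\n], $1;
    }
}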

Run it like:

perl script.pl infile

With the following output:

\Section{Group0}
\Testdetails{101}
\Testdetails{102}
\Testdetails{412}
\Testdetails{206}
\Testdetails{207}
\Testdetails{201}
\Testdetails{202}
\Testdetails{408}
\Testdetails{101}
\Section{Group1}
\Testdetails{305}
\Testdetails{101}
\Testdetails{324}
\Testdetails{206}
\Testdetails{207}
\Testdetails{410}
\Testdetails{409}
\Testdetails{420}
\Testdetails{426}
\Testdetails{101}
\Section{Group2}
\Testdetails{409}
\Testdetails{305}

For completeness, here is a sed version (it needs GNU sed, since \+ and \b are GNU extensions):

sed -n -e 's#^ *Group \([0-9]\+\).*#\\Section{Group\1}#p' \
       -e 's#.*\b\([0-9][0-9][0-9]\)\b.*#\\Testdetails{\1}#p'
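
If GNU sed is not available, the same idea can be sketched with POSIX constructs only. This is not the original author's command: the wrapper name extract_posix is made up, and padding each line with a space so that a bracket expression can stand in for \b is an assumption of this sketch.

```shell
#!/bin/sh
# extract_posix is a hypothetical wrapper name; it reads stdin or file arguments.
extract_posix() {
    sed -n \
        -e 's#^ *Group \([0-9][0-9]*\).*#\\Section{Group\1}#p' \
        -e 't' \
        -e 's#^.*$# & #' \
        -e 's#.*[^[:alnum:]_]\([0-9][0-9][0-9]\)[^[:alnum:]_].*#\\Testdetails{\1}#p' \
        "$@"
}
# 1st -e: [0-9][0-9]* replaces the GNU-only \+ quantifier.
# 2nd -e: t skips the remaining commands once a Group line was printed.
# 3rd -e: pad the line with one space on each side.
# 4th -e: a non-word character on both sides replaces the GNU-only \b.

printf '%s\n' 'Group 0' 'alpha 101 x' | extract_posix
```

For the two sample lines this prints \Section{Group0} followed by \Testdetails{101}. As with the GNU version, the greedy leading .* means the last standalone 3-digit number on a line is the one reported.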
