RegEx string formatting in Notepad++
Section A: Pad to specific length
To right-pad each line to N characters using regular expressions, first append N padding characters to the end of the line, then capture the first N characters and discard the rest.
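For readers who want to try the idea outside Notepad++, the pad-then-trim approach can be sketched in Python. This is only an illustration: the function name `pad_right` and the `fill` parameter are made up for this example, and `.` stands in for `[[:print:]]`, which Python's `re` module does not support.

```python
import re

def pad_right(line: str, width: int, fill: str = " ") -> str:
    """Right-pad `line` to `width` characters; longer lines are cut to `width`.

    Assumes `line` is a single line without a trailing newline and that
    `fill` is a single ordinary character (no backslashes).
    """
    # Pass 1: append `width` fill characters at the end of the line.
    padded = re.sub(r"$", fill * width, line, count=1)
    # Pass 2: keep the first `width` characters and drop everything after them.
    return re.sub(r"^(.{0,%d}).*$" % width, r"\1", padded, count=1)

print(pad_right("abc", 6, "_"))
print(pad_right("abcdefghij", 6, "_"))
```

Because Pass 1 always adds the full width, every line is at least `width` characters long before Pass 2 runs, so the result always has exactly `width` characters.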
Pass 1: Add padding characters
Find: $
Replace: ______________________________
This adds 30 spaces at the end of each line. (Underscores stand in for spaces here, since spaces wouldn't render in the post.)
Pass 2: Keep the first 30 characters
Find: ^([[:print:]]{0,30}).*$
Replace: \1
At the beginning of the line, capture a group of up to thirty printable characters, match any remaining characters after it, and replace the whole match with the group.
To pick a different line length, use N spaces in Pass 1, then replace 30 with N in Pass 2.
Section B: Line starting with date
To pad a dash-delimited date at the beginning of a line, match each part of the date in its own pass.
Pass 1 (day of month):
Find what: ^([0-9])-
Replace with: 0\1-
Replace the pattern (a line starting with a single digit followed by a dash) with a padding zero, the digit, and the dash.
Pass 2 (month):
Find what: -([0-9])-
Replace with: -0\1-
Replace the pattern (a single digit between two dashes) with a dash, a padding zero, the digit, and the dash.
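The two date passes can be checked with a small Python sketch. The function name `pad_date` and the sample strings are hypothetical; the patterns are the ones given above, applied with Python's `re.sub`.

```python
import re

def pad_date(line: str) -> str:
    """Zero-pad a day-month-year date at the start of `line`."""
    # Pass 1 (day of month): a single digit at line start, followed by a
    # dash, gets a leading zero.
    line = re.sub(r"^([0-9])-", r"0\1-", line)
    # Pass 2 (month): a single digit between two dashes gets a leading zero.
    line = re.sub(r"-([0-9])-", r"-0\1-", line)
    return line

print(pad_date("1-2-2016 first entry"))
print(pad_date("12-11-2016 already padded"))
```

Note that the Pass 2 pattern is not anchored to the start of the line, so it also pads any other single digit between two dashes later in the line.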
In response to:
12345678 TXT 19700101 0 100 20160624 100 Comment text
12345678 TXT 19700101 100 100,25 20160624 0,25 Comment text
12345678 TXT 19700101 100,25 100,5 20160624 0,25 Comment text
For the 4th column:
Rule 1:
Find what: ^((?:\S+\s+){3}\d+)(\s)
Replace with: \1,0\2
Rule 2:
Find what: ^((?:\S+\s+){3}\d+,\d)(\s)
Replace with: \10\2
For the 5th/7th column:
Use the same two rules, replacing {3} with {4} and {6} respectively.
Explanation
The 1st rule appends ,0 to numbers without a comma. After it has run, every number contains ,\d.
The 2nd rule appends a 0 to numbers with a single digit after the comma.
(?:...) is a non-capturing group: the preceding columns are already captured as part of \1, so capturing them separately is unnecessary.
This only pads numbers to 2 decimal places. To pad to an arbitrary number of places, use the "pad excessively, then trim" approach from Section A.
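To see the two rules at work on the sample lines, here is a hedged Python sketch. The helper `pad_column` is invented for this illustration, and the replacement strings use `\g<1>` because Python would read `\10` as a reference to group 10.

```python
import re

def pad_column(line: str, column: int) -> str:
    """Pad the decimal-comma number in the 1-based `column` to two decimals."""
    skip = column - 1  # whitespace-separated columns before the target
    # Rule 1: a number without a comma gets ",0" appended.
    line = re.sub(r"^((?:\S+\s+){%d}\d+)(\s)" % skip, r"\g<1>,0\2", line)
    # Rule 2: a number with a single digit after the comma gets one more "0".
    line = re.sub(r"^((?:\S+\s+){%d}\d+,\d)(\s)" % skip, r"\g<1>0\2", line)
    return line

rows = [
    "12345678 TXT 19700101 0 100 20160624 100 Comment text",
    "12345678 TXT 19700101 100 100,25 20160624 0,25 Comment text",
    "12345678 TXT 19700101 100,25 100,5 20160624 0,25 Comment text",
]

for row in rows:
    for col in (4, 5, 7):
        row = pad_column(row, col)
    print(row)
```

Numbers that already have two decimal places match neither rule and are left unchanged.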
Final word?
In my opinion, plain regex as available in Notepad++ is inadequate for this task. A short script in bash or Perl would have handled it far more readably.
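As a rough illustration of the scripting route, here is a minimal sketch in Python (used instead of the bash or Perl mentioned above). The function names `pad_decimals` and `fix_line` are hypothetical, and the code assumes the eight-column, single-space-separated layout of the sample lines, with the comment as the last column.

```python
def pad_decimals(value: str, places: int = 2) -> str:
    """Pad a decimal-comma number such as "100,5" to `places` decimal digits."""
    whole, _, frac = value.partition(",")
    return whole + "," + frac.ljust(places, "0")

def fix_line(line: str, columns=(4, 5, 7)) -> str:
    """Pad the numbers in the given 1-based columns of one input line."""
    # Split into at most 8 fields so the trailing comment keeps its spaces.
    fields = line.split(" ", 7)
    for col in columns:
        fields[col - 1] = pad_decimals(fields[col - 1])
    return " ".join(fields)

print(fix_line("12345678 TXT 19700101 100,25 100,5 20160624 0,25 Comment text"))
```

Changing `places` pads to a different number of decimals without needing extra rules.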