How to clean up output of Linux 'script' command
If you want to view the file, you can send the output through col -bp, which interprets the control characters. You can then pipe the result through less, if you like.
col -bp typescript | less -R
On some systems col does not accept a filename argument; use this syntax instead:
col -bp <typescript | less -R
cat typescript | perl -pe 's/\e([^\[\]]|\[.*?[a-zA-Z]|\].*?\a)//g' | col -b > typescript-processed
Here is an interpretation of the string passed to perl:
- s/pattern//g replaces every match of pattern with nothing, i.e. deletes it. The g option makes the substitution apply to every match in the input rather than stopping after the first.
Here is an interpretation of the regex pattern:
- \e matches the special "escape" control character (ASCII 0x1B)
- ( and ) are the beginning and end of a group
- | means the group can match one of N patterns, where the N patterns are [^\[\]] or \[.*?[a-zA-Z] or \].*?\a
- [^\[\]] means: match a single character that is not [ or ]
- \[.*?[a-zA-Z] means: match a string starting with [, then a non-greedy .*? up to the first alphabetic character
- \].*?\a means: match a string that starts with ], then a non-greedy .*? until you hit the special control character called "the alert (bell) character"
For a large quantity of script output, I'd hack a perl script together iteratively. Otherwise, hand edit with a good editor.
There is unlikely to be an existing automated method of removing control characters from script output in a way that reproduces what was displayed on the screen at certain important moments (such as when the host was waiting for that first character of some user input).
For example, the screen might be blank except for the prompt Andrew $. If you then typed rm /* and pressed backspace twelve times (far more than needed), what is shown on the screen afterwards depends on which shell was running, what your current stty settings are (which you might change partway through a session), and probably some other factors too.
The above applies to any automated method of continuously capturing input and output. The main alternative is taking "screen shots" or cutting and pasting the screen at appropriate times during the session (which is what I do for user guides, notes for a day-log, etc).