There must be a better way to replace single newlines only?
You can use awk like this:
$ awk ' /^$/ { print; } /./ { printf("%s ", $0); } ' test
Or if you need an extra newline at the end:
$ awk ' /^$/ { print; } /./ { printf("%s ", $0); } END { print ""; } ' test
Or if you want to separate the paragraphs by a blank line:
$ awk ' /^$/ { print "\n"; } /./ { printf("%s ", $0); } END { print ""; } ' test
These awk commands use actions that are guarded by patterns, either /regex/ or END. An action guarded by /regex/ is only executed if the pattern matches the current line; an END action is executed once, after all input has been read.
The characters ^, $ and . have special meaning in regular expressions: ^ matches the beginning of a line, $ the end, and . an arbitrary character.
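To see the pattern-guarded actions at work, here is a small sketch that runs the END variant from above on a made-up two-paragraph input. The sample text and the variable names (input, joined) are only illustrative.

```shell
# Hypothetical sample: two paragraphs separated by one blank line.
input=$(printf 'first line\nsecond line\n\nthird line\nfourth line\n')

# /./   matches non-empty lines and appends them, followed by a space
# /^$/  matches the blank line and ends the current output line
# END   runs once at the end and terminates the last output line
joined=$(printf '%s\n' "$input" |
  awk '/^$/ { print } /./ { printf("%s ", $0) } END { print "" }')

printf '%s\n' "$joined"
```

Each paragraph ends up on a single line. Note that every joined line keeps a trailing space, because the /./ action always appends one.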
Use Awk or Perl's paragraph mode to process a file paragraph by paragraph, where paragraphs are separated by blank lines.
awk -vRS= '
NR!=1 {print ""} # print blank line before every record but the first
{ # do this for every record (i.e. paragraph):
gsub(" *\n *"," "); # replace newlines by spaces, compressing spaces
sub(" *$",""); # remove spaces at the end of the paragraph
print
}
'
perl -000 -pe ' # for every paragraph:
print "\n" unless $.==1; # print a blank line, except before the first paragraph
s/ *\n *(?!$)/ /g; # replace newlines by spaces, compressing spaces, but not at the end of the paragraph
s/ *\n+\z/\n/ # normalize the last line end of the paragraph
'
Of course, since this doesn't parse the (La)TeX, it will horribly mutilate comments, verbatim environments and other special syntax. You may want to look into DeTeX or other (La)TeX-to-text converters.
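As a quick check of the paragraph-mode approach, the following sketch feeds the awk version a made-up input with stray spaces around the line breaks. The sample text and the variable names (sample, unwrapped) are assumptions for illustration only.

```shell
# Hypothetical sample: two paragraphs, with stray spaces around a line break.
sample=$(printf 'alpha  \n  beta\n\ngamma\ndelta\n')

unwrapped=$(printf '%s\n' "$sample" | awk -v RS= '
  NR != 1 { print "" }      # blank line before every record but the first
  {
    gsub(/ *\n */, " ")     # join the lines of the paragraph, compressing spaces
    sub(/ *$/, "")          # remove spaces at the end of the paragraph
    print
  }')

printf '%s\n' "$unwrapped"
```

With an empty RS, awk treats each run of non-blank lines as one record, so the gsub sees the whole paragraph at once and the blank line between paragraphs is re-created explicitly.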
(reviving an ancient question)
This seems to be exactly what fmt and par are for: paragraph reformatting. Like you (and also like many programs), they define paragraph boundaries as one or more blank lines. Try piping your text through one of these.
fmt is a standard Unix utility and can be found in GNU Coreutils.
par is a greatly enhanced fmt written by Adam M. Costello, which can be found at http://www.nicemice.net/par/ (it has also been packaged for several distributions, including Debian; I packaged it for Debian in Jan 1996, although there's a new maintainer for the package now).
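A minimal sketch of the fmt suggestion, assuming GNU fmt with its default settings. The sample text and the variable name wrapped are made up for illustration.

```shell
# Hypothetical sample: two short paragraphs separated by a blank line.
wrapped=$(printf 'one\ntwo\n\nthree\nfour\n' | fmt)

printf '%s\n' "$wrapped"
```

Note that fmt refills each paragraph up to a target width (75 columns by default in GNU fmt) rather than joining it into one arbitrarily long line, so longer paragraphs will still be wrapped; the -w option changes that width.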