Normalize newlines in C#
I believe this will do what you need:
using System.Text.RegularExpressions;
// ...
string normalized = Regex.Replace(originalString, @"\r\n|\n\r|\n|\r", "\r\n");
I'm not 100% sure of the exact syntax, and I don't have a .NET compiler handy to check. I wrote it in Perl and converted it to (hopefully correct) C#. The only real trick is to list the two-character alternatives "\r\n" and "\n\r" first, so that a pair is matched as one newline rather than two.
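To apply it to an entire stream, just run it on chunks of input, taking care that a chunk boundary doesn't split a "\r\n" or "\n\r" pair (each half would then be converted separately, giving two newlines). (You could do this with a stream wrapper if you want.)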
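One way to sidestep the chunk-boundary problem is to process the input one character at a time and remember whether the previous character was the first half of a pair. The following is only a sketch of that idea, not a tested drop-in: the class name NewlineNormalizer and the method Normalize are made up for illustration, and it assumes the input is available as a TextReader.

```csharp
using System;
using System.IO;

// Hypothetical helper (not from the original answer): copies text from
// reader to writer, emitting "\r\n" for each "\r\n", "\n\r", "\n" or "\r".
static class NewlineNormalizer
{
    public static void Normalize(TextReader reader, TextWriter writer)
    {
        int pending = -1; // the '\r' or '\n' just converted, or -1 if none
        int c;
        while ((c = reader.Read()) != -1)
        {
            if (pending != -1)
            {
                int previous = pending;
                pending = -1;
                // The opposite character right after a newline character
                // is the second half of a pair; its newline is already written.
                bool secondHalf = (previous == '\r' && c == '\n')
                               || (previous == '\n' && c == '\r');
                if (secondHalf)
                    continue;
            }

            if (c == '\r' || c == '\n')
            {
                writer.Write("\r\n");
                pending = c; // the next character may complete the pair
            }
            else
            {
                writer.Write((char)c);
            }
        }
    }
}
```

For example, passing new StringReader("a\rb\nc\n\rd\r\ne") and a StringWriter should leave "a\r\nb\r\nc\r\nd\r\ne" in the writer. Because state is carried from one character to the next, it doesn't matter how the underlying stream happens to be buffered.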
The original Perl:
$str =~ s/\r\n|\n\r|\n|\r/\r\n/g;
The test results:
[bash$] ./test.pl
\r -> \r\n
\n -> \r\n
\n\n -> \r\n\r\n
\n\r -> \r\n
\r\n -> \r\n
\r\n\n -> \r\n\r\n
Update: Now converts \n\r to \r\n, though I wouldn't call that normalization.
I'm with Jamie Zawinski on RegEx:
"Some people, when confronted with a problem, think 'I know, I'll use regular expressions.' Now they have two problems."
For those of us who prefer readability:
Step 1
Replace \r\n with \n
Replace \n\r with \n (if you really want this; some posters seem to think not)
Replace \r with \n
Step 2
Replace \n with Environment.NewLine or \r\n or whatever.
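As a rough sketch, the two steps above could be written with plain string.Replace calls like this. The class name LineEndings, the method name and the newline parameter are made up for illustration; pass Environment.NewLine or "\r\n" as the target.

```csharp
using System;

// Hypothetical helper illustrating the two steps; not a library API.
static class LineEndings
{
    public static string Normalize(string text, string newline)
    {
        // Step 1: collapse every newline variant to a single "\n".
        // The two-character pairs must be handled before the lone "\r".
        string unified = text
            .Replace("\r\n", "\n")
            .Replace("\n\r", "\n")
            .Replace("\r", "\n");

        // Step 2: expand "\n" to the target newline sequence.
        return unified.Replace("\n", newline);
    }
}
```

Calling LineEndings.Normalize("a\rb\nc\n\rd\r\ne", "\r\n") should return "a\r\nb\r\nc\r\nd\r\ne". Each Replace call builds a new string, so this makes four passes over the text; that is usually fine for ordinary files and keeps every step readable.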
It's a two-step process.
First you convert all the combinations of \r and \n into a single one, say \r.
Then you convert all the \r into your target \r\n.
normalized = original
    .Replace("\r\n", "\r")   // pair \r\n -> single \r
    .Replace("\n\r", "\r")   // pair \n\r -> single \r
    .Replace("\n", "\r")     // lone \n -> \r
    .Replace("\r", "\r\n");  // last step: every \r -> target \r\n