Normalize newlines in C#

I believe this will do what you need:

using System.Text.RegularExpressions;
// ...
string normalized = Regex.Replace(originalString, @"\r\n|\n\r|\n|\r", "\r\n");

I'm not 100% sure on the exact syntax, and I don't have a .Net compiler handy to check. I wrote it in perl, and converted it into (hopefully correct) C#. The only real trick is to match "\r\n" and "\n\r" first.

To apply it to an entire stream, just run in on chunks of input. (You could do this with a stream wrapper if you want.)


The original perl:

$str =~ s/\r\n|\n\r|\n|\r/\r\n/g;

The test results:

[bash$] ./test.pl
\r -> \r\n
\n -> \r\n
\n\n -> \r\n\r\n
\n\r -> \r\n
\r\n -> \r\n
\r\n\n -> \r\n\r\n

Update: Now converts \n\r to \r\n, though I wouldn't call that normalization.


I'm with Jamie Zawinski on RegEx:

"Some people, when confronted with a problem, think "I know, I’ll use regular expressions." Now they have two problems"

For those of us who prefer readability:

  • Step 1

    Replace \r\n by \n

    Replace \n\r by \n (if you really want this, some posters seem to think not)

    Replace \r by \n

  • Step 2 Replace \n by Environment.NewLine or \r\n or whatever.


It's a two step process.
First you convert all the combinations of \r and \n into a single one, say \r
Then you convert all the \r into your target \r\n

normalized = 
    original.Replace("\r\n", "\r").
             Replace("\n\r", "\r").
             Replace("\n", "\r").
             Replace("\r", "\r\n"); // last step

Tags:

C#

.Net