Regex. Camel case to underscore. Ignore first occurrence
// (Preceded by a lowercase character or digit) (a capital) => The character prefixed with an underscore
var result = Regex.Replace(input, "(?<=[a-z0-9])[A-Z]", m => "_" + m.Value);
result = result.ToLowerInvariant();
- This works for both PascalCase and camelCase.
- It creates no leading or trailing underscores.
- It leaves intact any sequences of non-word characters and underscores in the string, because they would seem intentional, e.g. __HiThere_Guys becomes __hi_there_guys.
- Digit suffixes are (intentionally) considered part of the word, e.g. NewVersion3 becomes new_version3.
- Digit prefixes follow the original casing, e.g. 3VersionsHere becomes 3_versions_here, but 3rdVersion becomes 3rd_version.
- Unfortunately, capitalized two-letter acronyms (e.g. in IDNumber, where ID would be considered a separate word), as suggested in Microsoft's Capitalization Conventions, are not supported, since they conflict with other cases. I recommend, in general, resisting this guideline, as it is a seemingly arbitrary exception to the convention of not capitalizing acronyms. Stick with IdNumber.
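The cases listed above can be checked with a small console program. This is only a sketch: the class name SnakeCaseDemo and the helper name ToSnakeCase are made up for illustration and simply wrap the two statements from the answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SnakeCaseDemo
{
    // Hypothetical helper name; the body is the two statements shown above.
    public static string ToSnakeCase(string input)
    {
        // A capital preceded by a lowercase letter or digit gets an underscore prefix.
        var result = Regex.Replace(input, "(?<=[a-z0-9])[A-Z]", m => "_" + m.Value);
        return result.ToLowerInvariant();
    }

    public static void Main()
    {
        string[] samples =
        {
            "PascalCase", "camelCase", "__HiThere_Guys", "NewVersion3",
            "3VersionsHere", "3rdVersion", "IDNumber", "IdNumber"
        };

        foreach (var s in samples)
            Console.WriteLine(s + " -> " + ToSnakeCase(s));
    }
}
```

Note that IDNumber comes out as idnumber (no underscore, because no capital in it follows a lowercase letter or digit), while IdNumber becomes id_number.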
You can use a lookbehind to ensure that each match is preceded by at least one character:
System.Text.RegularExpressions.Regex.Replace(input, "(?<=.)([A-Z])", "_$0",
System.Text.RegularExpressions.RegexOptions.Compiled);
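Lookaheads and lookbehinds allow you to make assertions about the text surrounding a match without including that text within the match. Here, (?<=.) requires one character before the capital, so a capital at the very start of the string is skipped.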
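To see what the lookbehind changes, the same replacement can be run with and without it. This is a minimal sketch; the class and method names are hypothetical, and RegexOptions.Compiled is left out because it does not affect the result.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LookbehindDemo
{
    // Hypothetical helper: every capital gets an underscore, including the first.
    public static string WithoutLookbehind(string input) =>
        Regex.Replace(input, "([A-Z])", "_$0");

    // Hypothetical helper: (?<=.) asserts that some character precedes the
    // capital, but that character is not part of the match, so "$0" is still
    // just the capital letter.
    public static string WithLookbehind(string input) =>
        Regex.Replace(input, "(?<=.)([A-Z])", "_$0");

    public static void Main()
    {
        Console.WriteLine(WithoutLookbehind("ThisIsMySample")); // _This_Is_My_Sample
        Console.WriteLine(WithLookbehind("ThisIsMySample"));    // This_Is_My_Sample
    }
}
```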
Non-Regex solution
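// Requires: using System.Linq;
// Prefix every uppercase character except the first one with an underscore.
string result = string.Concat(input.Select((x,i) => i > 0 && char.IsUpper(x) ? "_" + x.ToString() : x.ToString()));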
It seems to be quite fast too. Over 1,000,000 iterations on my machine: Regex 2569 ms, non-regex C# 1489 ms.
Stopwatch stp = new Stopwatch();
stp.Start();
for (int i = 0; i < 1000000; i++)
{
string input = "ThisIsMySample";
string result = System.Text.RegularExpressions.Regex.Replace(input, "(?<=.)([A-Z])", "_$0",
System.Text.RegularExpressions.RegexOptions.Compiled);
}
stp.Stop();
MessageBox.Show(stp.ElapsedMilliseconds.ToString());
// Result 2569ms
Stopwatch stp2 = new Stopwatch();
stp2.Start();
for (int i = 0; i < 1000000; i++)
{
string input = "ThisIsMySample";
string result = string.Concat(input.Select((x, j) => j > 0 && char.IsUpper(x) ? "_" + x.ToString() : x.ToString()));
}
stp2.Stop();
MessageBox.Show(stp2.ElapsedMilliseconds.ToString());
// Result: 1489ms
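For reference, the non-regex version can be wrapped in a self-contained program as follows. This is a sketch, and the names LinqUnderscoreDemo and AddUnderscores are made up. Unlike the first answer, it does not lowercase the result, and it puts an underscore before every capital after the first, so runs of capitals are split.

```csharp
using System;
using System.Linq;

public static class LinqUnderscoreDemo
{
    // Hypothetical helper name; the body is the one-liner from this answer.
    public static string AddUnderscores(string input) =>
        string.Concat(input.Select((x, i) =>
            i > 0 && char.IsUpper(x) ? "_" + x.ToString() : x.ToString()));

    public static void Main()
    {
        Console.WriteLine(AddUnderscores("ThisIsMySample")); // This_Is_My_Sample
        Console.WriteLine(AddUnderscores("IDNumber"));       // I_D_Number
    }
}
```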
Maybe something like this:
var str = Regex.Replace(input, "([A-Z])", "_$0", RegexOptions.Compiled);
// Drop the underscore that was added in front of a leading capital.
if(str.StartsWith("_"))
str = str.Substring(1);