How to capitalize the first letter of each word in a string
From the documentation, though IMHO not easy to find:
StringReplace["this is a test", WordBoundary ~~ x_ :> ToUpperCase[x]]
"This Is A Test"
István Zachar highlighted a problem with WordBoundary that I'm still trying to understand. Nevertheless, it seems that one can use:
strAcc = "árv ízt űr őt ük örf úr óg ép";
StringReplace[strAcc, z : (StartOfString | WhitespaceCharacter ~~ _) :> ToUpperCase[z]]
"Árv Ízt Űr Őt Ük Örf Úr Óg Ép"
It appears that the PCRE library, at least as used by Mathematica, does not recognize certain characters as letters. A few examples:
StringReplace[strAcc, z : RegularExpression["(?:\\A|\\s)."] :> ToUpperCase[z]]
StringReplace[strAcc, z : RegularExpression["\\b."] :> ToUpperCase[z]]
"Árv Ízt Űr Őt Ük Örf Úr Óg Ép"
"Árv Ízt űR őT Ük Örf Úr Óg Ép" (* note odd handling *)
StringCases["abcőű", RegularExpression["\\w"]]
{"a", "b", "c"} (* ő and ű missing *)
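Since `\w` and `\b` miss letters such as ő and ű, a word start defined purely by whitespace sidesteps the problem. Below is a minimal sketch that wraps the StartOfString | WhitespaceCharacter pattern from above in a helper. The name capitalizeWords is hypothetical (not a built-in), and the extra parentheses only spell out how the alternatives are grouped.

```mathematica
(* capitalizeWords is a hypothetical helper name, not a built-in *)
capitalizeWords[s_String] := StringReplace[s,
  z : ((StartOfString | WhitespaceCharacter) ~~ _) :> ToUpperCase[z]]

(* the matched whitespace is passed through ToUpperCase unchanged,
   so tabs and newlines between words are preserved *)
capitalizeWords["one\ttwo\nthree"]
(* "One\tTwo\nThree" *)
```

Because the whole match (whitespace plus the following character) is uppercased in place, the original separators survive, unlike approaches that split the string and rejoin it with single spaces.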
Actually, WordBoundary won't always work correctly (see this thread):
str = "the lazy dog jumped over the quick brown fox.";
strAcc = "árv ízt űr őt ük örf úr óg ép";
StringReplace[str, WordBoundary ~~ x_ :> ToUpperCase[x]]
StringReplace[strAcc, WordBoundary ~~ x_ :> ToUpperCase[x]]
"The Lazy Dog Jumped Over The Quick Brown Fox."
"Árv Ízt űR őT Ük Örf Úr Óg Ép" (* note ű,ő instead of Ű,Ő *)
Use this custom-made toTitleCase instead:
toTitleCase[str__] := StringJoin@Riffle[ToUpperCase@StringTake[#, 1] <>
ToLowerCase@StringTake[#, {2, -1}] & /@ StringSplit@StringJoin@str, " "];
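Two side effects of this split-and-rejoin approach are worth knowing: runs of whitespace collapse to a single space, and the rest of each word is forced to lowercase. The sketch below restates the same idea for a single string so that it runs on its own; the name titleCaseWords is hypothetical.

```mathematica
(* titleCaseWords is a hypothetical name; same approach, single-string argument *)
titleCaseWords[s_String] := StringJoin@Riffle[
  (ToUpperCase@StringTake[#, 1] <> ToLowerCase@StringTake[#, {2, -1}]) & /@
    StringSplit[s],   (* splits on runs of whitespace *)
  " "]

titleCaseWords["hello   WORLD"]
(* "Hello World" *)
```

If the original spacing or existing capitals (acronyms, for instance) must be kept, a StringReplace-based pattern is probably the better fit.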
toTitleCase[str]
toTitleCase[strAcc]
"The Lazy Dog Jumped Over The Quick Brown Fox."
"Árv Ízt Űr Őt Ük Örf Úr Óg Ép"
In version 10.1, this is built in as ToTitleCase (if I recall correctly, it was an experimental function; its documentation is no longer accessible online). That version removes all non-alphanumeric characters from the string; I assume this is a bug.
ToTitleCase[str]
ToTitleCase[strAcc]
"The Lazy Dog Jumped Over The Quick Brown Fox"
"Árv Ízt Űr Őt Ük Örf Úr Óg Ép"
ToTitleCase has an eventful development history: in version 11, it was removed from the built-ins, but it is still accessible from the "GeneralUtilities`" context. It no longer removes non-alphanumeric characters from the string.
GeneralUtilities`ToTitleCase[str]
GeneralUtilities`ToTitleCase[strAcc]
"The Lazy Dog Jumped Over the Quick Brown Fox."
"Árv Ízt Űr Őt Ük Örf Úr Óg Ép"
While I endorse Mr.Wizard's pattern-matching solution, and I've given István +1, I would also like to submit this function, which is meant to avoid string patterns and be as readable as possible:
toTitleCase[str_] := With[{s = StringTrim@str},
  StringJoin[
    MapAt[
      ToUpperCase, Characters[s],
      (* a space at position i of " " <> s marks a word start at position i of s *)
      Position[Characters[" " <> s], " "]
    ]
  ]
]
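The position trick in this definition may be easier to follow on a tiny input. Prepending a space shifts every character one place to the right, so each space in the shifted list sits at the index where a word begins in the original string. The variable names below (s, shifted, starts) are for illustration only.

```mathematica
s = "ab cd";

shifted = Characters[" " <> s]
(* {" ", "a", "b", " ", "c", "d"} *)

starts = Position[shifted, " "]
(* {{1}, {4}} *)

(* the same indices in the unshifted string are the first letters *)
Extract[Characters[s], starts]
(* {"a", "c"} *)

StringJoin@MapAt[ToUpperCase, Characters[s], starts]
(* "Ab Cd" *)
```

This alignment only holds when both character lists come from the same string, so any trimming has to be applied to both.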
toTitleCase["the lazy dog jumped over the quick brown fox"]
"The Lazy Dog Jumped Over The Quick Brown Fox"