Typoglycemia text manipulation
My attempt:
SeedRandom[42];
StringReplace[text,
a : (Repeated[LetterCharacter, {5, ∞}]) :>
StringJoin[Take[#, 2], RandomSample[#[[3 ;; -3]]],
Take[#, -2]] &[Characters[a]]]
"In a puciibltaon of New Scieintst you could radmonise all the \
letters,keiepng the first two and last two the same,and reabdaiilty \
would hardly be affected.My anlaysis did not come to much beaucse the \
theory at the time was for shape and seqneuce recgotniion.Saebri's \
work sueggsts we may have some pofewrul palrlael prsoecosrs at \
work.The reason for this is surely that ideyfniitng content by \
parallel prisesocng speeds up regioticnon.We only need the first and \
last two letters to spot chnages in meaning."
scramble[s_String /; StringLength[s] > 5] :=
#[[;; 2]] <> RandomSample @ #[[3 ;; -3]] <> #[[-2 ;;]] & @ Characters @ s
scramble[else_] := else
MapAt[scramble, StringSplit[text, WordBoundary], ;; ;; 2] // StringJoin
"In a puilicbaton of New Scniteist you could raniodmse all the leettrs,kepeing the first two and last two the same,and reiialbadty would hadrly be afeftced.My anlyasis did not come to much beaucse the thoery at the time was for shape and seueqnce regiitnocon.Saberi's work sugegsts we may have some powrfeul paarllel prsoseocrs at work.The reason for this is suerly that idnytifeing conetnt by pallarel proissecng speeds up reigncitoon.We only need the first and last two letters to spot chngaes in meianng."
This assumes that your
text
starts with a letter.I went with the first two and last two characters being fixed, as described in the text.
Update: after seeing J. M.'s answer I suppose I could better have written:
StringReplace[text, x : LetterCharacter .. :> scramble[x]]
As a regular expression:
SeedRandom[42];
regex = RegularExpression["\\b([[:alpha:]]{2})([[:alpha:]]+)([[:alpha:]]{2})\\b"];
StringReplace[text, regex :> StringJoin["$1", RandomSample@Characters@"$2", "$3"]
"In a puiabtlicon of New Sceitinst you could raondimse all the letetrs,keeping the first two and last two the same,and reaailidbty would hardly be afcfteed.My anayslis did not come to much because the theory at the time was for shape and seeqnuce recgitinoon.Saberi's work suseggts we may have some pofwreul paallrel proseocsrs at work.The resaon for this is suerly that idieniyftng content by parlalel preioscsng speeds up regoticnion.We only need the first and last two leettrs to spot chgnaes in meaning."
Generalizing to "fix" the first and last n characters is trivial:
regex@n_ := ToString@StringForm[
"\\b([[:alpha:]]{`1`})([[:alpha:]]+)([[:alpha:]]{`1`})\\b", n];
StringReplace[text, RegularExpression@regex@2 :>
StringJoin["$1", RandomSample@Characters@"$2", "$3"]]
For those not acquainted with regexes, here is a breakdown:
wordBoundary = "\\b";
twoLetters = "([[:alpha:]]{2})";
oneOrMore = "([[:alpha:]]+)";
regex = wordBoundary ~~ twoLetters ~~ oneOrMore ~~ twoLetters ~~ wordBoundary;
Please note that further generalization to "a string with minimum length m
, keeping the first and last n
chars fixed is also straightforward using the repetition pattern {min,}
:
regex[m_, n_] :=
ToString@StringForm["\\b([[:alpha:]]{`1`})([[:alpha:]]{`2`,})([[:alpha:]]{`1`})\\b",
n, m - 2 n] /; m > 2 n
StringReplace[text, RegularExpression@regex[5, 1] :>
StringJoin["$1", RandomSample@Characters@"$2", "$3"]]
"In a piulcotaibn of New Ssitneict you colud rndsaomie all the lterets,kepnieg the fisrt two and last two the same,and redbialtaiy wuold hadrly be aefecftd.My anilyass did not come to much bsceuae the tehory at the time was for sahpe and snecquee rcntieooign.Saebri's work setgguss we may have some puwfroel pelarlal pscorseors at work.The rosaen for this is seulry that iyindnietfg cntoent by parlelal prniesscog seepds up reigooncitn.We only need the first and last two lerttes to spot caegnhs in meniang."