ETAOIN SHRDLU golf
Python 2 or 3 - 75 bytes (previously 77)
f=lambda s:''.join(sorted(map(chr,range(65,91)),key=s.upper().count))[::-1]
I had an answer before that grabbed input from STDIN, but I realized it was technically invalid. I used input(), which gets only a single line, but the question's example input implies that it should handle multiple lines at once. To meet spec, I turned my answer into a function that takes a string argument. To my surprise, it was two bytes smaller! It didn't occur to me that print(...) and input() were longer than f=lambda s: and s.
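The two-byte saving can be checked by comparing only the parts that changed. This is a rough sketch: the earlier STDIN version is not shown here, so the wrapper pieces in `stdin_parts` are assumed from the description, and the variable names are made up for illustration.

```python
# Boilerplate of the assumed STDIN version: print(...) wrapped around the
# expression, and input() used in place of the string.
stdin_parts = ["print(", ")", "input()"]
# Boilerplate of the function version: the lambda header and the argument name.
func_parts = ["f=lambda s:", "s"]

stdin_cost = sum(len(p) for p in stdin_parts)  # 6 + 1 + 7 = 14
func_cost = sum(len(p) for p in func_parts)    # 11 + 1 = 12

# The current golfed answer, copied verbatim, to confirm its byte count.
golfed = "f=lambda s:''.join(sorted(map(chr,range(65,91)),key=s.upper().count))[::-1]"

print(stdin_cost - func_cost, len(golfed))  # 2 75
```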
This also makes the answer compatible with both Python 2 and Python 3. Originally it was only Python 3, because it used input() (which was called raw_input() in 2). Now that it's a function, it works in both.
Explained
range(65,91) # The numbers 65 to 90
map(chr,range(65,91)) # Convert to the characters 'A' to 'Z'
s # The input string
s.upper() # Convert to uppercase
s.upper().count # Bound method: 'how many times the argument appears in the string'
sorted(map(chr,range(65,91)),key=s.upper().count) # Sort by that function
''.join(sorted(map(chr,range(65,91)),key=s.upper().count)) # Concatenate to string
''.join(sorted(map(chr,range(65,91)),key=s.upper().count))[::-1] # Step through by -1 (i.e. reverse string)
lambda s:''.join(sorted(map(chr,range(65,91)),key=s.upper().count))[::-1] # Make it a function (`return` is implicit for lambdas)
f=lambda s:''.join(sorted(map(chr,range(65,91)),key=s.upper().count))[::-1] # Give it a name
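For readers who want the same steps without the golfing, here is an ungolfed sketch of the breakdown above. The function name `letters_by_frequency` is made up for illustration, and the sample input is borrowed from the CJam answer. The sketch also shows how ties come out: `sorted` is stable, so letters with equal counts end up in reverse alphabetical order after the final reversal.

```python
import string


def letters_by_frequency(s):
    """Return A-Z ordered from most to least frequent in s (case-insensitive)."""
    upper = s.upper()
    # sorted() is stable, so letters with equal counts stay in A-Z order here...
    ascending = sorted(string.ascii_uppercase, key=upper.count)
    # ...and come out in Z-A order once the whole list is reversed.
    return ''.join(reversed(ascending))


print(letters_by_frequency("~XyxY YyxZ"))  # YXZWVUTSRQPONMLKJIHGFEDCBA
```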
CJam, 19 bytes (previously 21)
qeu:A;'[,65>{A\-,}$
Try it online.
Example
$ cjam etaoin.cjam <<< "~XyxY YyxZ"
YXZABCDEFGHIJKLMNOPQRSTUVW
(no newline)
How it works
qeu:A; " Read from STDIN, convert to uppercase, save in the variable “A” and discard, ";
'[, " Push an array of all ASCII characters before “[” (NUL to “Z”). ";
65> " Remove the first 64 characters (NUL to “@”). ";
{ " Sort the array of characters by the following mapping: ";
A\ " Swap the character with the string saved in variable “A”. ";
- " Remove all occurrences of the character from the string. ";
, " Push the length of the string. ";
}$ " ";
More occurrences means more characters get removed, so the remaining string is shorter and the sort key is smaller. As a result, the most frequent characters appear at the beginning of the array.
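The same sort key can be sketched in Python for anyone who doesn't read CJam. This is only an approximation: the name `cjam_style_order` is hypothetical, and it assumes, as the example output suggests, that letters with equal keys keep their A-Z order (Python's `sorted` is stable, so it behaves that way).

```python
import string


def cjam_style_order(text):
    a = text.upper()  # qeu:A;  (uppercase the input and keep it around)

    def remaining_length(ch):
        # A\-,  (remove every occurrence of ch, then take the length)
        return len(a.replace(ch, ""))

    # Smaller key = more occurrences removed = more frequent letter.
    # sorted() is stable, so letters with equal keys stay in A-Z order.
    return "".join(sorted(string.ascii_uppercase, key=remaining_length))


print(cjam_style_order("~XyxY YyxZ"))  # YXZABCDEFGHIJKLMNOPQRSTUVW
```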
Bash, 65 bytes
(tr a-z A-Z;echo {A..Z})|fold -1|sort|uniq -c|sort -nr|tr -dc A-Z
Example
$ bash etaoin.sh <<< "~AbaB BbaC"
BACZYXWVUTSRQPONMLKJIHGFED
How it works
( #
tr a-z A-Z # Turn lowercase into uppercase letters.
echo {A..Z} # Print all uppercase letters.
) | #
fold -1 | # Split into lines of length 1.
sort | # Sort those lines (required for piping to uniq).
uniq -c | # Print the frequencies of all lines.
sort -nr | # Sort numerically by frequency, highest first.
tr -dc A-Z # Remove everything that's not an uppercase letter.
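The pipeline can be mirrored step by step in Python, which may make the counting trick easier to follow: appending every letter once guarantees that each letter gets a count of at least 1 and so appears in the output. This is a sketch for ASCII input only. The name `pipeline_order` is made up, and the tie-breaking rule (reverse alphabetical among equal counts) is an assumption inferred from the example output rather than from the `sort` documentation.

```python
import string
from collections import Counter


def pipeline_order(text):
    letters = string.ascii_uppercase
    # tr a-z A-Z; echo {A..Z}: uppercase the input and append every letter once.
    stream = text.upper() + letters
    # fold -1 | sort | uniq -c: count each character. Non-letters are skipped
    # here because the final tr -dc A-Z would drop them anyway.
    counts = Counter(ch for ch in stream if ch in letters)
    # sort -nr: highest count first; ties are assumed to fall back to
    # reverse alphabetical order, as in the example output.
    ranked = sorted(letters, key=lambda ch: (counts[ch], ch), reverse=True)
    return "".join(ranked)


print(pipeline_order("~AbaB BbaC"))  # BACZYXWVUTSRQPONMLKJIHGFED
```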