How to count occurrences of each character?
You could use this:
sed 's/./&\n/g' 1.txt | sort | uniq -ic
4
5 a
1 c
1 k
1 M
1 n
5 o
2 s
4 t
2 w
1 y
The sed part places a newline after every character. Then we sort the output alphabetically. Finally, uniq -c counts the number of occurrences of each character. The -i flag of uniq can be omitted if you don't want the count to be case-insensitive.
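For comparison, the same sort-then-count-runs idea can be sketched in Python. This is only an illustration of the technique, not part of the answer above: the function name count_chars_ignore_case is made up, newlines are skipped by assumption, and each group is reported in lower case, whereas uniq prints the first character of each run as it appears.

```python
from itertools import groupby


def count_chars_ignore_case(text):
    """Sort the characters, then count runs that are equal ignoring case."""
    # Assumption: newlines are dropped rather than counted.
    chars = sorted((c for c in text if c != "\n"), key=str.lower)
    # groupby only merges adjacent equal keys, which is why sorting comes
    # first (the same reason uniq needs sorted input).
    return [(len(list(group)), key) for key, group in groupby(chars, key=str.lower)]


if __name__ == "__main__":
    for count, char in count_chars_ignore_case("Hello World"):
        print(count, char)
```

Dropping key=str.lower in both places gives the case-sensitive variant, which corresponds to leaving out -i.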
A bit late, but to complete the set, here is another approach in Python 3, with sorted output:
#!/usr/bin/env python3
import sys
chars = open(sys.argv[1]).read().strip().replace("\n", "")
[print(c+" -", chars.count(c)) for c in sorted(set([c for c in chars]))]
A - 1
M - 1
O - 1
T - 1
a - 4
c - 1
k - 1
n - 1
o - 4
s - 2
t - 3
w - 2
y - 1
Explanation
Read the file, strip leading and trailing whitespace, and remove newlines so they are not counted as characters:
chars = open(sys.argv[1]).read().strip().replace("\n", "")
Create a (sorted) set of uniques:
sorted(set([c for c in chars]))
Count and print the occurrence for each of the characters:
print(c+" -", chars.count(c)) for c in <uniques>
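Note that chars.count(c) rescans the whole text once per distinct character. For large files, a single-pass tally may be preferable. The following is a minimal sketch using collections.Counter from the standard library. The function name char_counts is hypothetical, and the preprocessing simply mirrors the strip/replace step above.

```python
from collections import Counter


def char_counts(text):
    """Return sorted (character, count) pairs, tallied in one pass."""
    # Same preprocessing as the script above: trim the ends, drop newlines.
    chars = text.strip().replace("\n", "")
    return sorted(Counter(chars).items())


if __name__ == "__main__":
    for char, count in char_counts("to be\nor not"):
        print(char + " -", count)
```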
How to use
- Paste the code into an empty file and save it as chars_count.py
- Run it with the file as an argument, either with:
  /path/to/chars_count.py </path/to/file>
  if the script is executable, or with:
  python3 /path/to/chars_count.py </path/to/file>
  if it isn't.
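By default, the field separator (FS) in awk is whitespace (spaces or tabs). Since we want to count each character, we have to redefine FS to the empty string (FS="") so that every character becomes a separate field. GNU awk splits this way, although POSIX leaves the behavior of an empty FS unspecified. Each field is then used as an index into an array whose value is incremented, and at the end, inside the END{..} block, the total occurrences are printed by the following awk command: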
$ awk '{for (i=1;i<=NF;i++) a[$i]++} END{for (c in a) print c,a[c]}' FS="" file
A 1
M 1
O 1
T 1
a 4
c 1
k 1
n 1
o 4
s 2
t 3
w 2
y 1
In the {for (i=1;i<=NF;i++) a[$i]++} block, with FS="", every character of the line is its own field, and we increment that character's counter in the array a. In the END{for (c in a) print c,a[c]} block, we loop over the array a and print each saved character (c) together with its number of occurrences (a[c]).
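The awk logic above (an associative array incremented once per field, processed line by line) maps closely onto a plain dictionary in Python. This is only a sketch for comparison. The function name count_chars_by_line is made up, and the input is assumed to be any iterable of lines, such as an open file.

```python
import io


def count_chars_by_line(lines):
    """Tally characters one line at a time, like awk's a[$i]++ loop."""
    counts = {}
    for line in lines:
        # awk removes the record separator (the newline) before splitting.
        for ch in line.rstrip("\n"):
            counts[ch] = counts.get(ch, 0) + 1
    return counts


if __name__ == "__main__":
    sample = io.StringIO("ab\nba a\n")  # stands in for an open file
    for ch, n in count_chars_by_line(sample).items():
        print(ch, n)
```

Because the input is consumed one line at a time, the whole file never has to be held in memory.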