How do I split a string into an array of characters?
You can split on an empty string:
var chars = "overpopulation".split('');
If you just want to access a string in an array-like fashion, you can do that without split
:
var s = "overpopulation";
for (var i = 0; i < s.length; i++) {
console.log(s.charAt(i));
}
You can also access each character with its index using normal array syntax. Note, however, that strings are immutable, which means you can't set the value of a character using this method, and that it isn't supported by IE7 (if that still matters to you).
var s = "overpopulation";
console.log(s[3]); // logs 'r'
Old question but I should warn:
Do NOT use .split('')
You'll get weird results with non-BMP (non-Basic-Multilingual-Plane) character sets.
Reason is that methods like .split()
and .charCodeAt()
only respect the characters with a code point below 65536; bec. higher code points are represented by a pair of (lower valued) "surrogate" pseudo-characters.
'ððð'.length // —> 6
'ððð'.split('') // —> ["�", "�", "�", "�", "�", "�"]
'ð'.length // —> 2
'ð'.split('') // —> ["�", "�"]
Use ES2015 (ES6) features where possible:
Using the spread operator:
let arr = [...str];
Or Array.from
let arr = Array.from(str);
Or split
with the new u
RegExp flag:
let arr = str.split(/(?!$)/u);
Examples:
[...'ððð'] // —> ["ð", "ð", "ð"]
[...'ððð'] // —> ["ð", "ð", "ð"]
For ES5, options are limited:
I came up with this function that internally uses MDN example to get the correct code point of each character.
function stringToArray() {
var i = 0,
arr = [],
codePoint;
while (!isNaN(codePoint = knownCharCodeAt(str, i))) {
arr.push(String.fromCodePoint(codePoint));
i++;
}
return arr;
}
This requires knownCharCodeAt()
function and for some browsers; a String.fromCodePoint()
polyfill.
if (!String.fromCodePoint) {
// ES6 Unicode Shims 0.1 , © 2012 Steven Levithan , MIT License
String.fromCodePoint = function fromCodePoint () {
var chars = [], point, offset, units, i;
for (i = 0; i < arguments.length; ++i) {
point = arguments[i];
offset = point - 0x10000;
units = point > 0xFFFF ? [0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)] : [point];
chars.push(String.fromCharCode.apply(null, units));
}
return chars.join("");
}
}
Examples:
stringToArray('ððð') // —> ["ð", "ð", "ð"]
stringToArray('ððð') // —> ["ð", "ð", "ð"]
Note: str[index]
(ES5) and str.charAt(index)
will also return weird results with non-BMP charsets. e.g. 'ð'.charAt(0)
returns "�"
.
UPDATE: Read this nice article about JS and unicode.