repeating vector of letters
Working solution
A function to produce Excel-style column names, i.e.
# A, B, ..., Z, AA, AB, ..., AZ, BA, BB, ..., ..., ZZ, AAA, ...
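letterwrap <- function(n, depth = 1) {
    # One copy of LETTERS per position in the name
    args <- lapply(1:depth, FUN = function(x) return(LETTERS))
    x <- do.call(expand.grid, args = list(args, stringsAsFactors = FALSE))
    # expand.grid varies the first column fastest, so reverse the columns
    x <- x[, rev(names(x)), drop = FALSE]
    x <- do.call(paste0, x)
    if (n <= length(x)) return(x[1:n])
    # Not enough names yet: recurse with one more letter
    return(c(x, letterwrap(n - length(x), depth = depth + 1)))
}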
letterwrap(26^2 + 52) # through AAZ
Botched attempt
Initially I thought this would best be done cleverly by converting to base 26, but that doesn't work. The issue, which took me a long time to realize, is that Excel column names aren't base 26. The catch is 0: if you try to map a letter (like A) to 0, you can no longer distinguish between A, AA, and AAA, because leading zeros don't change a number's value.
Another way to illustrate the problem is in "digits". In base 10, there are 10 single-digit numbers (0-9), then 90 double-digit numbers (10-99), 900 three-digit numbers (100-999), generalizing to 10^d - 10^(d - 1) numbers with d digits for d > 1. However, in Excel column names there are 26 single-letter names, 26^2 double-letter names, 26^3 triple-letter names, with no subtraction.
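The counting argument can be checked with a few lines of R. This is only an illustrative sketch; the `counts` data frame and its column names are made up for the example.

```r
# How many names exist at each length d (illustrative only)
d <- 1:3
counts <- data.frame(
    d      = d,
    base10 = ifelse(d == 1, 10, 10^d - 10^(d - 1)),  # 10, 90, 900
    excel  = 26^d                                    # 26, 676, 17576
)
# Running total = position of the last name of each length (Z, ZZ, ZZZ)
counts$excel_last <- cumsum(counts$excel)            # 26, 702, 18278
counts
```

So ZZ is name number 702, and AAA is number 703, which is not where a plain base-26 conversion would put them.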
I'll leave the botched code here as a warning to others:
## Converts a number to base 26, returns a vector for each "digit"
b26 <- function(n) {
    stopifnot(n >= 0)
    if (n <= 1) return(n)
    n26 <- rep(NA, ceiling(log(n, base = 26)))
    for (i in seq_along(n26)) {
        n26[i] <- (n %% 26)
        n <- n %/% 26
    }
    return(rev(n26))
}
## Returns the name of nth value in the sequence
## A, B, C, ..., Z, AA, AB, AC, ..., AZ, BA, ...
letterwrap1 <- function(n, lower = FALSE) {
    let <- if (lower) letters else LETTERS
    base26 <- b26(n)
    base26[base26 == 0] <- 26
    paste(let[base26], collapse = "")
}
## Vectorized version of letterwrap1
letter_col_names <- Vectorize(letterwrap1, vectorize.args = "n")
> letter_col_names(1:4)
[1] "A" "B" "C" "D"
> letter_col_names(25:30)
[1] "Y" "Z" "AA" "AB" "AC" "AD"
# Looks pretty good
# Until we get here:
> letter_col_names(50:54)
[1] "AX" "AY" "BZ" "BA" "BB"
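For contrast, here is a minimal sketch of one way the arithmetic approach can be made to work: subtract 1 before every division, so that A to Z map onto remainders 0 to 25 and no letter ever has to stand for a leading zero (this scheme is sometimes called bijective base 26). The function name `excel_name` is hypothetical and not part of the answer above, and it handles a single index at a time.

```r
# Hypothetical helper: name of the nth Excel-style column (n >= 1)
excel_name <- function(n) {
    stopifnot(length(n) == 1, n >= 1)
    out <- character(0)
    while (n > 0) {
        n <- n - 1                           # shift so A..Z are remainders 0..25
        out <- c(LETTERS[n %% 26 + 1], out)  # prepend the least significant letter
        n <- n %/% 26
    }
    paste(out, collapse = "")
}

vapply(50:54, excel_name, character(1))
# [1] "AX" "AY" "AZ" "BA" "BB"
```

Unlike the botched version, index 52 comes out as AZ rather than BZ.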
It's not too difficult to piece together a quick function to do something like this:
myLetters <- function(length.out) {
  a <- rep(letters, length.out = length.out)
  # grp counts how many times the alphabet has restarted so far
  grp <- cumsum(a == "a")
  vapply(seq_along(a),
         function(x) paste(rep(a[x], grp[x]), collapse = ""),
         character(1L))
}
myLetters(60)
# [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l"
# [13] "m" "n" "o" "p" "q" "r" "s" "t" "u" "v" "w" "x"
# [25] "y" "z" "aa" "bb" "cc" "dd" "ee" "ff" "gg" "hh" "ii" "jj"
# [37] "kk" "ll" "mm" "nn" "oo" "pp" "qq" "rr" "ss" "tt" "uu" "vv"
# [49] "ww" "xx" "yy" "zz" "aaa" "bbb" "ccc" "ddd" "eee" "fff" "ggg" "hhh"
If you just want unique names, you could use
make.unique(rep(letters, length.out = 30), sep='')
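To show what that produces, here is a short sketch of the tail of the result. The variable name `x` is just for illustration; `make.unique` leaves the first occurrence of each value alone and appends a sequence number to later duplicates.

```r
x <- make.unique(rep(letters, length.out = 30), sep = "")
x[25:30]
# [1] "y"  "z"  "a1" "b1" "c1" "d1"
anyDuplicated(x)  # 0 means every name is unique
```

The names are unique but do not follow the "aa", "bb" pattern, so this only fits if uniqueness is all you need.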
Edit:
Here's another way to get repeating letters using Reduce.
myletters <- function(n)
  unlist(Reduce(paste0,
                replicate(n %/% length(letters), letters, simplify = FALSE),
                init = letters,
                accumulate = TRUE))[1:n]
myletters(60)
# [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l"
# [13] "m" "n" "o" "p" "q" "r" "s" "t" "u" "v" "w" "x"
# [25] "y" "z" "aa" "bb" "cc" "dd" "ee" "ff" "gg" "hh" "ii" "jj"
# [37] "kk" "ll" "mm" "nn" "oo" "pp" "qq" "rr" "ss" "tt" "uu" "vv"
# [49] "ww" "xx" "yy" "zz" "aaa" "bbb" "ccc" "ddd" "eee" "fff" "ggg" "hhh"
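In case the Reduce call is hard to follow, this small sketch shows the same accumulation on a three-letter alphabet. The names `abc` and `steps` are made up for the example.

```r
abc <- c("a", "b", "c")
# Each step pastes one more copy of abc onto the previous result,
# and accumulate = TRUE keeps every intermediate step
steps <- Reduce(paste0, list(abc, abc), init = abc, accumulate = TRUE)
unlist(steps)
# [1] "a"   "b"   "c"   "aa"  "bb"  "cc"  "aaa" "bbb" "ccc"
```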
There is almost certainly a better way, but this is what I ended up with:
letter_wrap <- function(idx) {
  vapply(
    idx,
    function(x)
      paste0(
        rep(
          # x %% 26 is 0 for multiples of 26, which should map to "z"
          letters[replace(x %% 26, !x %% 26, 26)],
          1 + (x - 1) %/% 26
        ),
        collapse = ""
      ),
    ""
  )
}
letter_wrap(1:60)
# [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n"
# [15] "o" "p" "q" "r" "s" "t" "u" "v" "w" "x" "y" "z" "aa" "bb"
# [29] "cc" "dd" "ee" "ff" "gg" "hh" "ii" "jj" "kk" "ll" "mm" "nn" "oo" "pp"
# [43] "qq" "rr" "ss" "tt" "uu" "vv" "ww" "xx" "yy" "zz" "aaa" "bbb" "ccc" "ddd"
# [57] "eee" "fff" "ggg" "hhh"
EDIT: I failed to notice Ananda's answer before I posted this one. This one is different enough that I'm leaving it. Note that it takes the index vector as input, as opposed to the number of items.
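One possibly simpler variant, sketched here on the assumption that base R's strrep is available (R 3.3.0 or later): because strrep is vectorized over both the string and the repeat count, the vapply loop can be dropped. The name `letter_wrap2` is hypothetical.

```r
# Hypothetical vectorized variant; idx is a vector of positive indices
letter_wrap2 <- function(idx) {
  strrep(letters[(idx - 1) %% 26 + 1],  # which letter: 1..26
         (idx - 1) %/% 26 + 1)          # how many times to repeat it
}

letter_wrap2(c(1, 26, 27, 52, 53))
# [1] "a"   "z"   "aa"  "zz"  "aaa"
```

Like letter_wrap, it takes the index vector rather than the number of items, so letter_wrap2(1:60) gives the full sequence.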