Tips for golfing in R
Some tips:
- In R, it's recommended to use <- over =. For golfing, the opposite holds since = is shorter.
- If you call a function more than once, it is often beneficial to define a short alias for it:
as.numeric(x)+as.numeric(y)
a=as.numeric;a(x)+a(y)
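The saving from an alias can be checked by counting bytes with nchar. The following is only a sketch; x and y are made-up inputs, not part of the original tip.

```r
# Hypothetical inputs, chosen only for illustration
x = "3"; y = "4"

# Long form and aliased form compute the same value
long = as.numeric(x)+as.numeric(y)
a = as.numeric; short = a(x)+a(y)

# Byte counts of the two source strings
n_long  = nchar("as.numeric(x)+as.numeric(y)")  # 27
n_short = nchar("a=as.numeric;a(x)+a(y)")       # 22
```

Here the alias saves 5 bytes with only two uses, because the aliased name is long.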
- Partial matching can be your friend, especially when functions return lists of which you need only one item. Compare rle(x)$lengths to rle(x)$l.
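A small sketch of partial matching on the list returned by rle; the input vector is an arbitrary example.

```r
# Arbitrary example input
x = c(1,1,2,2,2,3)
r = rle(x)

full  = r$lengths  # 2 3 1
short = r$l        # same component, found by partial matching on the name
vals  = r$v        # partial match for r$values: 1 2 3
```

The abbreviation only has to be unambiguous among the names of the list, so a single letter is enough here.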
- Many challenges require you to read input. scan is often a good fit for this (the user ends the input by entering an empty line).
scan() # reads numbers into a vector
scan(,'') # reads strings into a vector
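To try scan without typing at a prompt, its text= argument can stand in for stdin. This is only an illustration under that assumption; a golfed answer would normally read from stdin as shown above, and quiet=TRUE is added here just to suppress the "Read n items" message.

```r
# text= supplies the input as a string instead of reading stdin
n = scan(text="3 1 4 1 5", quiet=TRUE)          # numeric vector
s = scan(text="foo bar", what="", quiet=TRUE)   # character vector

total  = sum(n)  # 14
second = s[2]    # "bar"
```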
- Coercion can be useful. t=1 is much shorter than t=TRUE. Alternatively, switch can save you precious characters as well, but you'll want to use 1,2 rather than 0,1.
if(length(x)) {} # TRUE if length != 0
sum(x<3) # Adds up all the TRUEs (counts how many are TRUE)
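A short sketch of these coercions in action. The vector x and the index k are invented for the example, and mean(...,na.rm=1) is just one convenient function whose logical argument accepts 1.

```r
# Invented example data
x = c(1,5,2,7)

cnt = sum(x<3)                                  # 2: each TRUE counts as 1
msg = if(length(x)) "non-empty" else "empty"    # nonzero length acts as TRUE

# switch on a number selects by position, hence 1,2 rather than 0,1
k = 2
pick = switch(k,"first","second")               # "second"

# 1 is accepted where TRUE is expected
m = mean(c(1,NA,3),na.rm=1)                     # 2
```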
- If a function computes something complicated and you need various other types of calculations based on the same core value, it is often beneficial to either: a) break it up into smaller functions, b) return all the results you need as a list, or c) have it return different types of values depending on an argument to the function.
- As in any language, know it well: R has thousands of functions, and there is probably one that can solve the problem in very few characters. The trick is to know which ones!
Some obscure but useful functions:
sequence
diff
rle
embed
gl # Like rep(seq(),each=...) but returns a factor
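For readers who have not met these functions, here is a quick sketch of what each returns on small, arbitrary inputs.

```r
s = sequence(c(2,3))       # 1 2 1 2 3: concatenated 1:2 and 1:3
d = diff(c(1,4,9,16))      # 3 5 7: successive differences
r = rle(c(5,5,6))$lengths  # 2 1: run lengths
e = embed(1:4,2)           # 3x2 matrix with rows (2,1), (3,2), (4,3)
g = gl(2,3)                # factor 1 1 1 2 2 2 with levels 1 and 2
```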
Some built-in data sets and symbols:
letters # 'a','b','c'...
LETTERS # 'A','B','C'...
month.abb # 'Jan','Feb'...
month.name # 'January','February'...
T # TRUE
F # FALSE
pi # 3.14...
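A few illustrative uses of these constants. Note that T and F are ordinary variables rather than reserved words, so they only mean TRUE and FALSE as long as nothing has reassigned them.

```r
abc  = letters[1:3]          # "a" "b" "c"
z    = LETTERS[26]           # "Z"
dec  = month.abb[12]         # "Dec"
feb  = nchar(month.name[2])  # 8
two  = T+T                   # 2, since TRUE coerces to 1
p    = round(pi,2)           # 3.14
```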
Instead of importing a package with library, grab the variable from the package using ::. Compare the following:
library(splancs);inout(...)
splancs::inout(...)
Of course, this is only worthwhile if a single function from the package is used.
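The same comparison can be tried with a package that ships with R. The sketch below swaps in tools::file_ext purely as a stand-in, since splancs may not be installed; the file name is made up.

```r
# tools is part of a standard R installation; f is a hypothetical file name
f = "notes.txt"
ext = tools::file_ext(f)   # "txt", no library() call needed

# Byte counts of the two ways of writing the call
n_lib   = nchar("library(tools);file_ext(f)")  # 26
n_colon = nchar("tools::file_ext(f)")          # 18
```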
This is trivial, but here is a rule of thumb for when to use @Tommy's trick of aliasing a function: if your function name has a length of m and is used n times, then alias only if m*n > m+n+3 (because when defining the alias you spend m+3, and then you still spend 1 every time the alias is used). An example:
nrow(a)+nrow(b) # 4*2 < 4+3+2
n=nrow;n(a)+n(b)
length(a)+length(b) # 6*2 > 6+3+2
l=length;l(a)+l(b)
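The rule of thumb can be written down and checked against the two examples. The helper name alias_pays is hypothetical, introduced only for this sketch.

```r
# m = length of the function name, n = number of times it is used
# (alias_pays is a made-up helper name)
alias_pays = function(m,n) m*n > m+n+3

p_nrow   = alias_pays(4,2)  # FALSE: nrow used twice
p_length = alias_pays(6,2)  # TRUE: length used twice

# Byte counts of the actual source strings agree with the rule
c_nrow   = c(nchar("nrow(a)+nrow(b)"), nchar("n=nrow;n(a)+n(b)"))          # 15 16
c_length = c(nchar("length(a)+length(b)"), nchar("l=length;l(a)+l(b)"))    # 19 18
```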
Coercion as a side effect of functions:
- Instead of using as.integer, character strings can be coerced to integer using the : operator:
as.integer("19")
("19":1)[1] # Shorter version using forced coercion.
- Integer, numeric, etc. can similarly be coerced to character using paste instead of as.character:
as.character(19)
paste(19) # Shorter version using forced coercion.
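A quick check that the shorter forms give the same values, together with their byte counts. This is a sketch on the single value used above, not a general guarantee for every input.

```r
i = ("19":1)[1]   # 19: the colon operator coerces the string to a number
s = paste(19)     # "19"

# Byte counts: long form first, short form second
b_int = c(nchar('as.integer("19")'), nchar('("19":1)[1]'))  # 16 11
b_chr = c(nchar("as.character(19)"), nchar("paste(19)"))    # 16 9
```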
Some very specific golfing tips:
- If you need to extract the length of a vector, sum(x|1) is shorter than length(x) as long as x is numeric, integer, complex or logical.
- If you need to extract the last element of a vector, it may be cheaper (if possible) to initialise the vector backwards using
rev() and then calling x[1] rather than x[length(x)] (or, using the above tip, x[sum(x|1)]) (or tail(x,1), thanks Giuseppe!). A slight variation on this (where the second-last element was desired) can be seen here. Even if you can't initialise the vector backwards, rev(x)[1] is still shorter than x[sum(x|1)] (and it works for character vectors too). Sometimes you don't even need rev, for example using n:1 instead of 1:n. (As seen here).
- If you want to coerce a data frame to a matrix, don't use
as.matrix(x). Take the transpose of the transpose, t(t(x)).
- if is a formal function. For example, "if"(x<y,2,3) is shorter than if(x<y)2 else 3 (though of course, 3-(x<y) is shorter than either). This only saves characters if you don't need an extra pair of braces to formulate it this way, which you often do.
- For testing non-equality of numeric objects,
if(x-y) is shorter than if(x!=y). Any nonzero numeric is regarded as TRUE. If you are testing equality, say, if(x==y)a else b, then try if(x-y)b else a instead. Also see the previous point.
- The function
el is useful when you need to extract an item from a list. The most common example is probably strsplit: el(strsplit(x,"")) is one fewer byte than strsplit(x,"")[[1]]. (As used here)
- Vector extension can save you characters: if vector
v has length n you can assign into v[n+1] without error. For example, if you wanted to print the first ten factorials you could do: v=1;for(i in 2:10)v[i]=v[i-1]*i rather than v=1:10;for(...) (though as always, there is another, better, way: cumprod(1:10)).
- Sometimes, for text based challenges (particularly 2-D ones), it's easier to
plot the text rather than cat it. The argument pch= to plot controls which characters are plotted. This can be shortened to pc= (which will also give a warning) to save a byte. Example here.
- To take the floor of a number, don't use
floor(x). Use x%/%1 instead.
- To test if the elements of a numeric or integer vector are all equal, you can often use
sd rather than something verbose such as all.equal. If all the elements are the same, their standard deviation is zero (FALSE); otherwise the standard deviation is positive (TRUE). Example here.
- Some functions which you would expect to require integer input actually don't. For example,
seq(3.5) will return 1 2 3 (the same is true for the : operator). This can avoid calls to floor and sometimes means you can use / instead of %/%.
- The most common function for text output is
cat. But if you need to use print for some reason, then you might be able to save a character by using show instead (which in most circumstances just calls print anyway, though you forgo any extra arguments like digits).
- Don't forget about complex numbers! The functions to operate on them (
Re, Im, Mod, Arg) have quite short names which can occasionally be useful, and complex numbers as a concept can sometimes yield simple solutions to some calculations.
- For functions with very long names (>13–15 characters), you can use
get to get at the function. For example, in R 3.4.4 with no packages loaded other than the default, get(ls(9)[501]) is more economical than getDLLRegisteredRoutines. This can also get around source code restrictions such as this answer. Note that using this trick makes your code R-version-dependent (and perhaps platform dependent), so make sure you include the version in your header so it can be reproduced if necessary.