What's the biggest R-gotcha you've run across?
[Hadley pointed this out in a comment.]
When using a sequence as an index for iteration, it's better to use the seq_along()
function rather than something like 1:length(x)
.
Here I create a vector and both approaches return the same thing:
> x <- 1:10
> 1:length(x)
[1] 1 2 3 4 5 6 7 8 9 10
> seq_along(x)
[1] 1 2 3 4 5 6 7 8 9 10
Now make the vector NULL
:
> x <- NULL
> seq_along(x) # returns an empty integer; good behavior
integer(0)
> 1:length(x) # wraps around and returns a sequence; this is bad
[1] 1 0
This can cause some confusion in a loop:
> for(i in 1:length(x)) print(i)
[1] 1
[1] 0
> for(i in seq_along(x)) print(i)
>
The automatic creation of factors when you load data. You unthinkingly treat a column in a data frame as characters, and this works well until you do something like trying to change a value to one that isn't a level. This will generate a warning but leave your data frame with NA's in it ...
When something goes unexpectedly wrong in your R script, check that factors aren't to blame.
Forgetting the drop=FALSE argument in subsetting matrices down to single dimension and thereby dropping the object class as well:
R> X <- matrix(1:4,2)
R> X
[,1] [,2]
[1,] 1 3
[2,] 2 4
R> class(X)
[1] "matrix"
R> X[,1]
[1] 1 2
R> class(X[,1])
[1] "integer"
R> X[,1, drop=FALSE]
[,1]
[1,] 1
[2,] 2
R> class(X[,1, drop=FALSE])
[1] "matrix"
R>