How to Correctly Use Lists in R?
Just to address the last part of your question, since that really points out the difference between a list and a vector in R:
Why do these two expressions not return the same result?
x = list(1, 2, 3, 4); x2 = list(1:4)
A list can contain any other class as each element. So you can have a list where the first element is a character vector, the second is a data frame, etc. In this case, you have created two different lists. x has four vectors, each of length 1. x2 has 1 vector of length 4:
> length(x[[1]])
[1] 1
> length(x2[[1]])
[1] 4
So these are completely different lists.
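To make the structural difference concrete, here is a minimal sketch that compares the two lists from the question. It only uses base R; the variable names follow the question.

```r
x  <- list(1, 2, 3, 4)   # four elements, each a numeric vector of length 1
x2 <- list(1:4)          # one element, an integer vector of length 4

length(x)    # number of top-level elements in x
length(x2)   # number of top-level elements in x2
str(x2)      # compact view of the nested structure

# unlist() flattens each list into a plain atomic vector
flat_x  <- unlist(x)
flat_x2 <- unlist(x2)
```

Both lists flatten to the values 1 through 4, but x stores them as four separate elements while x2 stores them as one.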
R lists are very much like a hash map data structure in that each index value can be associated with any object. Here's a simple example of a list that contains 3 different classes (including a function):
> complicated.list <- list("a"=1:4, "b"=1:3, "c"=matrix(1:4, nrow=2), "d"=search)
> lapply(complicated.list, class)
$a
[1] "integer"
$b
[1] "integer"
$c
[1] "matrix"
$d
[1] "function"
Given that the last element is the search function, I can call it like so:
> complicated.list[["d"]]()
[1] ".GlobalEnv" ...
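The hash-map analogy can be sketched with name-based lookup. The list and its element names below (lst, a, f) are made up for illustration and are not from the question.

```r
# Hypothetical list holding a vector and a function under names
lst <- list(a = 1:4, f = sum)

by_dollar  <- lst$a        # lookup by name with $
by_bracket <- lst[["a"]]   # equivalent lookup with [[ ]]

# Call the stored function on the stored vector
total <- lst[["f"]](lst$a)

# Looking up a name that is not in the list yields NULL rather than an error
absent <- lst[["missing"]]

names(lst)   # the "keys" of the list
```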
As a final comment on this: it should be noted that a data.frame is really a list (from the data.frame documentation):
A data frame is a list of variables of the same number of rows with unique row names, given class ‘"data.frame"’
That's why columns in a data.frame can have different data types, while columns in a matrix cannot. As an example, here I try to create a matrix with numbers and characters:
> a <- 1:4
> class(a)
[1] "integer"
> b <- c("a","b","c","d")
> d <- cbind(a, b)
> d
     a   b
[1,] "1" "a"
[2,] "2" "b"
[3,] "3" "c"
[4,] "4" "d"
> class(d[,1])
[1] "character"
Note how I cannot change the data type in the first column to numeric because the second column has characters. A matrix holds a single type, so the assigned numbers are coerced back to character:
> d[,1] <- as.numeric(d[,1])
> class(d[,1])
[1] "character"
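For contrast, a minimal sketch of the same two vectors placed in a data.frame, where each column keeps its own type. stringsAsFactors = FALSE is passed explicitly because its default differs between R versions.

```r
a <- 1:4
b <- c("a", "b", "c", "d")

# Each column is a separate list element, so types are not coerced
df <- data.frame(a, b, stringsAsFactors = FALSE)

col_classes <- sapply(df, class)   # class of each column
is_a_list   <- is.list(df)         # a data.frame is a list underneath
```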
Regarding your questions, let me address them in order and give some examples:
1) A function returns a list only when the value it returns is a list. Consider
R> retList <- function() return(list(1,2,3,4)); class(retList())
[1] "list"
R> notList <- function() return(c(1,2,3,4)); class(notList())
[1] "numeric"
R>
2) Names are simply not set:
R> retList <- function() return(list(1,2,3,4)); names(retList())
NULL
R>
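If names are wanted, they have to be supplied, either when the list is created or afterwards. A small sketch (the names chosen here are arbitrary):

```r
x <- list(1, 2, 3, 4)
before <- names(x)                    # NULL: no names were given

names(x) <- c("a", "b", "c", "d")     # assign names after creation
second <- x$b                         # elements can now be reached by name

y <- list(a = 1, b = 2)               # or name the elements at creation
y_names <- names(y)
```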
3) They do not return the same thing. Your example gives
R> x <- list(1,2,3,4)
R> x[1]
[[1]]
[1] 1
R> x[[1]]
[1] 1
where x[1] returns a list of length one that contains the first element of x. On the other hand, x[[1]] returns the first element itself, here a numeric vector of length one (every scalar in R is a vector of length one).
4) Lastly, the two are different because they create, respectively, a list containing four scalars and a list with a single element (that happens to be a vector of four elements).
Just to take a subset of your questions:
This article on indexing addresses the question of the difference between [] and [[]].
In short, [[]] selects a single item from a list and [] returns a list of the selected items. In your example, x = list(1, 2, 3, 4), item 1 is a single number, so x[[1]] returns a single 1 and x[1] returns a list with only one value.
> x = list(1, 2, 3, 4)
> x[1]
[[1]]
[1] 1
> x[[1]]
[1] 1
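The practical consequence shows up as soon as the result is used. The following sketch checks the class of each result and tries arithmetic on both; the tryCatch wrapper is only there so the script keeps running after the expected error.

```r
x <- list(1, 2, 3, 4)

class_single <- class(x[1])     # [ ] keeps the list wrapper
class_double <- class(x[[1]])   # [[ ]] extracts the element itself

sub_len <- length(x[2:3])       # [ ] can select several elements at once

plus_one <- x[[1]] + 1          # arithmetic works on the extracted number

# Arithmetic on x[1] fails because it is still a list
res <- tryCatch(x[1] + 1, error = function(e) "error")
```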