Why do two references to the same vector return different memory addresses for each element of the vector?
Any R object is a C pointer (called SEXP) to a "multi-object" (a struct). This struct holds information that R needs to operate (e.g. the length, the number of references, which tells R when to copy an object, and more) as well as the actual data of the R object that we have access to.
lobstr::obj_addr, presumably, returns the memory address that a SEXP points to. That part of the memory contains both the information about the R object and its data. From within the R environment we can't, and don't need to, access the (pointer to the) memory of the actual data in each R object.
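The reference bookkeeping mentioned above can be observed from R without touching C. The following is a minimal sketch, assuming the lobstr package is installed; the replacement value 1L is arbitrary and only there to trigger a modification.

```r
x <- c(1500L, 2400L, 8800L)
y <- x                                   # second name bound to the same object

addr_x <- lobstr::obj_addr(x)
shared_before <- identical(addr_x, lobstr::obj_addr(y))  # one object, two names

y[1] <- 1L                               # modifying through y makes R copy first

x_untouched <- identical(addr_x, lobstr::obj_addr(x))    # x is still where it was
y_moved <- !identical(addr_x, lobstr::obj_addr(y))       # y now names a new object
```

All three flags should be TRUE: the two names share one object until one of them is modified, at which point only the modified name is rebound to a copy.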
As Adam notes in his answer, the `[` function copies the nth element of the data contained in the C object to a new C object and returns its SEXP pointer to R. Each time `[` is called, a new C object is created and returned to R.
We can't access the memory address of each element of the actual data of our object through R. But, playing around a bit, we can trace the respective addresses using the C API:
A function to get the addresses:
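ff = inline::cfunction(sig = c(x = "integer"), body = '
    /* address held by the SEXP pointer itself */
    Rprintf("SEXP @ %p\\n", (void *) x);
    /* address where the integer data begins */
    Rprintf("first element of SEXP actual data @ %p\\n", (void *) INTEGER(x));
    /* value and address of each element */
    for (int i = 0; i < LENGTH(x); i++)
        Rprintf("<%d> @ %p\\n", INTEGER(x)[i], (void *) (INTEGER(x) + i));
    return R_NilValue;
')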
Applying it to our data:
x = c(1500L, 2400L, 8800L) #converted to "integer" for convenience
y = x
lobstr::obj_addr(x)
#[1] "0x1d1c0598"
lobstr::obj_addr(y)
#[1] "0x1d1c0598"
ff(x)
#SEXP @ 0x1d1c0598
#first element of SEXP actual data @ 0x1d1c05c8
#<1500> @ 0x1d1c05c8
#<2400> @ 0x1d1c05cc
#<8800> @ 0x1d1c05d0
#NULL
ff(y)
#SEXP @ 0x1d1c0598
#first element of SEXP actual data @ 0x1d1c05c8
#<1500> @ 0x1d1c05c8
#<2400> @ 0x1d1c05cc
#<8800> @ 0x1d1c05d0
#NULL
The difference between the addresses of successive data elements equals the size of the C int type (4 bytes here):
diff(c(strtoi("0x1d1c05c8", 16),
strtoi("0x1d1c05cc", 16),
strtoi("0x1d1c05d0", 16)))
#[1] 4 4
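The same arithmetic shows where the data sits relative to the start of the object. This sketch only reuses the two addresses printed by ff(x) above; the variable names are illustrative, and the resulting offset describes that particular session and build of R, not a documented constant.

```r
sexp_addr <- strtoi("0x1d1c0598", 16L)   # address of the SEXP, as printed by ff(x)
data_addr <- strtoi("0x1d1c05c8", 16L)   # address of the first data element

header_bytes <- data_addr - sexp_addr    # bytes between object start and its data
header_bytes
```

In that session the data begins 48 bytes (0x30) past the address that lobstr::obj_addr reports, which is consistent with the object carrying its bookkeeping information in front of the data.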
Using the `[` function:
ff(x[1])
#SEXP @ 0x22998358
#first element of SEXP actual data @ 0x22998388
#<1500> @ 0x22998388
#NULL
ff(x[1])
#SEXP @ 0x22998438
#first element of SEXP actual data @ 0x22998468
#<1500> @ 0x22998468
#NULL
This answer may be more extensive than needed, and it simplifies the actual technicalities, but hopefully it offers a clearer "big" picture.
This is one way to look at it. I am sure there is a more technical view. Remember that in R, nearly everything is a function. This includes the extract function, `[`. Here is an equivalent statement to x[1]:
> `[`(x, 1)
[1] 1500
So what you are doing is running a function which returns a value (check out ?Extract). That value is an integer. When you run obj_addr(x[1]), R evaluates the function call x[1] and then gives you the obj_addr() of that function's return value, not the address of the first element of the array that you bound to both x and y.
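A short sketch of that point, assuming the lobstr package is available; a and b are illustrative names used only to keep both return values alive so their addresses can be compared.

```r
x <- c(1500L, 2400L, 8800L)
y <- x

same_call <- identical(x[1], `[`(x, 1))  # both spellings call the same function

a <- x[1]        # first call: a fresh length-one integer vector
b <- `[`(x, 1)   # second call: another fresh vector holding the same value

equal_values <- identical(a, b)
distinct_objects <- !identical(lobstr::obj_addr(a), lobstr::obj_addr(b))
shared_vector <- identical(lobstr::obj_addr(x), lobstr::obj_addr(y))
```

Every flag should be TRUE: x and y name one vector, while each extraction hands back its own object, so comparing the addresses of extracted elements says nothing about where the elements of x are stored.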