Rcpp and int64 NA value

Alright, I think I found an answer... (not beautiful, but working).

Short Answer:

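#include <Rcpp.h>
#include <cstdint>
#include <cstring>

// [[Rcpp::export]]
Rcpp::NumericVector foo() {
  Rcpp::NumericVector res(2);

  // Copy the raw bits of the int64 into the double slot (no numeric conversion).
  int64_t val = 1234567890123456789;
  std::memcpy(&(res[0]), &(val), sizeof(double));

  // This is the magic: only the highest bit is set, which bit64 reads as NA.
  int64_t v = 1ULL << 63;
  std::memcpy(&(res[1]), &(v), sizeof(double));

  // Mark the vector so R treats it as a bit64 integer64 vector.
  res.attr("class") = "integer64";
  return res;
}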

which results in

#> foo()
integer64
[1] 1234567890123456789 <NA>

Longer Answer

Inspecting how bit64 stores an NA

# the last value is the max value of a 64 bit number
a <- bit64::as.integer64(c(1, 2, NA, 9223372036854775807))
a
#> integer64
#> [1] 1    2    <NA> <NA>
bit64::as.bitstring(a[3])
#> [1] "1000000000000000000000000000000000000000000000000000000000000000"
bit64::as.bitstring(a[4])
#> [1] "1000000000000000000000000000000000000000000000000000000000000000"

Created on 2020-04-23 by the reprex package (v0.3.0)

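We see that an NA is stored as a 1 followed by 63 zeros, i.e. only the highest bit is set. This can be recreated in Rcpp with int64_t v = 1ULL << 63;. Using memcpy() instead of a simple assignment with = ensures that the bits are copied as they are, rather than the value being converted to a floating-point number.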
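To make the difference between memcpy() and plain assignment concrete, here is a minimal standalone sketch (no Rcpp needed). The helper names int64_bits_to_double, double_bits_to_int64, roundtrip_by_assignment and check_roundtrips are made up for this illustration and are not part of Rcpp or bit64; the sketch also assumes a 64-bit double.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

static_assert(sizeof(double) == sizeof(std::int64_t),
              "this sketch assumes a 64-bit double");

// Hypothetical helper: copy the 64 bits of an integer into a double slot,
// as done for each element of the NumericVector in foo().
double int64_bits_to_double(std::int64_t value) {
  double slot;
  std::memcpy(&slot, &value, sizeof(slot));
  return slot;
}

// Hypothetical helper: read the same 64 bits back as an integer.
std::int64_t double_bits_to_int64(double slot) {
  std::int64_t value;
  std::memcpy(&value, &slot, sizeof(value));
  return value;
}

// For contrast: plain assignment converts the value to floating point,
// which keeps only 53 significant bits.
std::int64_t roundtrip_by_assignment(std::int64_t value) {
  double slot = static_cast<double>(value);
  return static_cast<std::int64_t>(slot);
}

// Small self-check of the claims made in the comments above.
void check_roundtrips() {
  const std::int64_t big = 1234567890123456789LL;
  const std::int64_t na_bits = std::numeric_limits<std::int64_t>::min();

  assert(double_bits_to_int64(int64_bits_to_double(big)) == big);
  assert(double_bits_to_int64(int64_bits_to_double(na_bits)) == na_bits);
  assert(roundtrip_by_assignment(big) != big);  // low bits are lost
}
```

The memcpy() round trip returns exactly the integer that went in, while the assignment round trip does not, because 1234567890123456789 needs more significant bits than a double can hold.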


It's really much, much simpler. The behaviour of an int64 in R is offered by (several) add-on packages, the best of which is bit64, which gives us the integer64 S3 class and associated behaviour.

And it defines the NA internally as follows:

#define NA_INTEGER64 LLONG_MIN

And that is all that there is. R and its packages are foremost C code, and LLONG_MIN exists there and goes back (almost) all the way to the founding fathers.
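As a quick sanity check that LLONG_MIN is the same bit pattern as the 1ULL << 63 used in the other answer, here is a small standalone sketch. The names has_integer64_na_bits and check_na_pattern are invented for this example, and it assumes a 64-bit two's-complement long long (which is what you get on the platforms R runs on).

```cpp
#include <cassert>
#include <climits>
#include <cstdint>
#include <cstring>

static_assert(sizeof(long long) == sizeof(std::uint64_t),
              "this sketch assumes a 64-bit long long");

// Hypothetical helper: true when only the highest of the 64 bits is set,
// i.e. the pattern that the NA_INTEGER64 define above reserves for NA.
bool has_integer64_na_bits(long long value) {
  std::uint64_t bits;
  std::memcpy(&bits, &value, sizeof(bits));
  return bits == (std::uint64_t{1} << 63);
}

// Small self-check: LLONG_MIN matches the pattern, other values do not.
void check_na_pattern() {
  assert(has_integer64_na_bits(LLONG_MIN));
  assert(!has_integer64_na_bits(LLONG_MAX));
  assert(!has_integer64_na_bits(0));
}
```

So in C or C++ code the NA can be written as LLONG_MIN (or std::numeric_limits<int64_t>::min()) instead of spelling out the shift.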

There are two lessons here. The first is that IEEE defines NaN and Inf for floating point values, and R actually goes way beyond that and adds NA for each of its types, in pretty much the way above: by reserving one particular bit pattern. (Which, in one case, is the birthday of one of the two original R creators.)

The other is to admire the metric ton of work Jens did with the bit64 package and all the required conversion and operator functions. Seamlessly converting all possible values, including NA, NaN, Inf, ... is no small task.

And it is a neat topic that not too many people know. I am glad you asked the question because we now have a record here.
