Garbage collection of seemingly PROTECTed pairlist
Instead of trying to grow then convert a pairlist, you can use a standard list (a VECSXP
). The reason why you don't need to grow a list is that a quick one-line loop through your matrix will tell you how many "gaps" you have in your numbers, and therefore how many vectors you need to pre-allocate in the list. This turns out to make things considerably simpler and probably a bit more efficient too.
The other changes I have made are to move to a single helper function, which simply assigns a length-2 integer vector from two int
s, and to UNPROTECT
en masse at the end of your C_fullocate
function. This is simple to do, since we have only assigned one vector per element of the final list, plus the list itself.
The function for creating length-2 INTSXP
s from two int
s looks like this:
#include <Rinternals.h>
SEXP C_intsxp2(int first, int second)
{
SEXP out = PROTECT(Rf_allocVector(INTSXP, 2));
INTEGER(out)[0] = first;
INTEGER(out)[1] = second;
UNPROTECT(1);
return out;
}
And your main function becomes:
SEXP C_fullocate(SEXP int_mat)
{
int rows = Rf_nrows(int_mat);
int *values = INTEGER(int_mat);
int total_rows = rows;
int rownum = 1;
// Counts how many elements we need in our list
for(int i = 0; i < (rows - 1); ++i) {
if(values[rows + i] != values[i + 1] - 1) ++total_rows;
}
// Creates the main list we will output at the end of the function
SEXP list = PROTECT(Rf_allocVector(VECSXP, total_rows));
// Creates and assigns first row
SET_VECTOR_ELT(list, 0, PROTECT(C_intsxp2(values[0], values[rows])));
for(int i = 1; i < rows; ++i) // Cycle through rest of the rows
{
if(values[rows + i - 1] != values[i] - 1) // Insert extra row if there's a gap
{
SEXP extra = PROTECT(C_intsxp2(values[rows + i - 1] + 1, values[i] - 1));
SET_VECTOR_ELT(list, rownum++, extra);
}
// Copy next row of original matrix into our list
SEXP next_row = PROTECT(C_intsxp2(values[i], values[i + rows]));
SET_VECTOR_ELT(list, rownum++, next_row);
}
UNPROTECT(total_rows + 1); // Unprotects all assigned rows plus main list
return list;
}
So in R we have
test_mat <- matrix(as.integer(c(2, 10, 11, 20, 30, 40, 50, 60)),
ncol = 2, byrow = TRUE)
test_mat
#> [,1] [,2]
#> [1,] 2 10
#> [2,] 11 20
#> [3,] 30 40
#> [4,] 50 60
And we can do:
fullocate(test_mat)
#> [[1]]
#> [1] 2 10
#>
#> [[2]]
#> [1] 11 20
#>
#> [[3]]
#> [1] 21 29
#>
#> [[4]]
#> [1] 30 40
#>
#> [[5]]
#> [1] 41 49
#>
#> [[6]]
#> [1] 50 60
Of course, the whole thing can be done much more simply using a single function in Rcpp. Here's an example where you can just grow the list, making the code considerably simpler (if maybe a little less efficient).
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
List fullocate(IntegerMatrix m)
{
List l = List::create(m(0, _));
for(int i = 1; i < m.nrow(); ++i)
{
if(m(i, 0) != m(i - 1, 1) + 1){
l.push_back(NumericVector::create(m(i - 1, 1) + 1, m(i, 0) - 1));
}
l.push_back(NumericVector::create(m(i, 0), m(i, 1)));
}
return l;
}
This one is really complicated. You made a great effort to create a reproducible example of this hard to track error.
I tried fixing your problem, unfortunately I failed. But nevertheless I'll try to share my findings with you, since nobody else answered so far (maybe it helps)
I installed your testpkg
and additionally added the fullocate
function to the namespace. To have it as an exported function.
This way I was able to build
the package and run the function with testpkg::fullocate(int_mat)
and to run it via devtools::check()
.
Interestingly if I run it via check()
it fails everytime, when running your testthat test.
Running ‘testthat.R’:
── Test failures ───────────────────────── testthat ────
library(testthat)
library(testpkg)
test_check("testpkg")
row_num: 2
[[1]]
.Primitive("for")
here1row_num: 3
[[1]]
.Primitive("for")
[[2]]
[[2]][[1]]
*** caught segfault ***
address 0xa00000007, cause 'memory not mapped'
Traceback:
1: fullocate(int_mat)
2: eval_bare(expr, quo_get_env(quo))
3: quasi_label(enquo(object), label, arg = "object")
4: expect_equal(fullocate(int_mat), list(c(5L, 6L), c(7L, 10L), c(11L, 19L), c(20L, 30L)))
5: eval(code, test_env)
6: eval(code, test_env)
7: withCallingHandlers({ eval(code, test_env) if (!handled && !is.null(test)) { skip_empty() }}, expectation = handle_expectation, skip = handle_skip, warning = handle_warning, message = handle_message, error = handle_error)
8: doTryCatch(return(expr), name, parentenv, handler)
So pretty similar to what you got, some memory issue:
address 0xa00000007, cause 'memory not mapped'
When I just run the function, interestingly, I can run it several times successfully, until it gives an error. Seems kind of random if it succeeds or not. From time to time the complete R session crashes.
Here is the error I get when running it without check()
.
Fehler in h(simpleError(msg, call)) : Fehler bei der Auswertung des Argumentes 'object' bei der Methodenauswahl für Funktion 'show': nicht implementierter Typ (27) in 'eval' Fehler während wrapup: nicht implementierter Typ (27) in 'lazy_duplicate'
Error: no more error handlers available (recursive errors?); invoking 'abort' restart Here is the error messages I get:
Fehler in h(simpleError(msg, call)) :
Fehler bei der Auswertung des Argumentes 'object' bei der Methodenauswahl für Funktion 'show': nicht implementierter Typ (27) in 'eval'
Fehler während wrapup: nicht implementierter Typ (27) in 'lazy_duplicate'
Error: no more error handlers available (recursive errors?); invoking 'abort' restart
Does not say too much...
I actually had some ideas why it might have failed based on the Writing R Extensions Manual. There is a special section about the C Garbage Collection issues. (https://cran.r-project.org/doc/manuals/r-release/R-exts.html#Garbage-Collection) This is definitely worth a look, if you have not read it yet.
Some interesting things to check:
Notice that it is the object which is protected, not the pointer variable. It is a common mistake to believe that if you invoked PROTECT(p) at some point then p is protected from then on, but that is not true once a new object is assigned to p.
In some cases it is necessary to keep better track of whether protection is really needed. Be particularly aware of situations where a large number of objects are generated. The pointer protection stack has a fixed size (default 10,000) and can become full.
Shouldn't be the second case, since the test example is quite small ;) From the fact, that the problem occurs so random, I (like you) would guess somethings what needs to be Protected isn't actually protected.
I wasn't so sure about the point of code, which you pointed out as the cause of the failure. But if Rf_PrintValue(prlst);
really is always the point, where the error occurs - it might be an indicator, to closer check prlst and what is inside.
As I told - in the end I couldn't fix it - but I also did not spend too much time on it.
Function C_int_mat_nth_row_nrnc
is writing values beyond the allocated limits.
- Allocation in line 5 is of size
nc
. - Then, line
12
usesnr
as a limit - ... which is larger than
nc
in line 39.
SEXP C_int_mat_nth_row_nrnc(int *int_mat_int, int nr, int nc, int n) {
SEXP out = PROTECT(Rf_allocVector(INTSXP, nc)); // allocating with `nc`
...
for (int i = 0; i != nr; ++i) { // but `nr` is used as a limit
out_int[i] = ...
}
}
...
SEXP C_fullocate(SEXP int_mat) {
...
row_num = 2;
while (row_num <= nr) {
...
SEXP row = PROTECT(C_int_mat_nth_row_nrnc(int_mat_int, nr, 2, row_num)); // !!!
...
}
}