segfault in R using reshape2 package and dcast
This isn't an answer, but a simple (nonsensical) reproducible example that wouldn't fit in the comments. The following recreates the error on my MacBook Pro:
library(reshape2)
n <- 1448
df <- data.frame(Student = rep(1:n, each = 2), Grade = sample(100, n * 2, replace = TRUE))
df2 <- dcast(df, Student ~ Student, value.var = "Grade", fun.aggregate = sum)
Error: segfault from C stack overflow
The error occurs at the boundary n = 1448, i.e. it doesn't occur when n = 1447 and below. It seems that the error is coming from split_indices in split-numeric.c from the package plyr. It could have to do with the number of grouping levels being assigned to an (unsigned?) integer value, so that once the number of groups goes over 32767 it causes a memory access error, but TBH I'm clutching at straws now.
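One hedged observation on where the boundary sits. Assume (a guess about the internals, not something verified here) that dcast asks split_indices for one group per cell of the n x n output grid, so n^2 groups for Student ~ Student, and that split_indices keeps one 4-byte int counter per group in an array on the C stack. Under those two assumptions, the counter array alone comes within a couple of kilobytes of the 8192K main-thread stack shown in the crash log below exactly at n = 1448. That would point at stack size rather than a 16-bit limit, since n^2 is already far above 32767 for values of n that work. The arithmetic, as a sketch (counter_bytes is a made-up helper name):

```r
# Back-of-envelope check only; both assumptions below are guesses.
# Assumption 1: the number of groups is n^2 (one per cell of the cast grid).
# Assumption 2: each group costs one 4-byte int in a stack-allocated array.
counter_bytes <- function(n, bytes_per_int = 4) {
  n^2 * bytes_per_int
}

# Main-thread stack size as reported in the crash log: 8192K.
stack_bytes <- 8192 * 1024

# Print the hypothetical array size and the stack headroom around the boundary.
for (n in c(1447, 1448, 1449)) {
  cat(sprintf("n = %d: %.0f bytes for counters, %.0f bytes of headroom\n",
              n, counter_bytes(n), stack_bytes - counter_bytes(n)))
}
```

If the assumptions hold, the headroom shrinks from roughly 13 KB at n = 1447 to under 2 KB at n = 1448, which is little room for the frames already on the stack.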
My sessionInfo(), in case anyone can't recreate this error, is:
R version 2.15.2 (2012-10-26)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] reshape2_1.2.2
loaded via a namespace (and not attached):
[1] plyr_1.8 stringr_0.6.2
Interestingly, if I run the df2 <- command again after getting the first error, R crashes completely and I get an OS-generated error report. I include the relevant portion of the crash log here:
Exception Type: EXC_BAD_ACCESS (SIGSEGV)
Exception Codes: KERN_PROTECTION_FAILURE at 0x00007fff5f3ff120
VM Regions Near 0x7fff5f3ff120:
JS JIT generated code 00004d431a401000-00004d431a402000 [ 4K] ---/rwx SM=NUL
--> STACK GUARD 00007fff5bc00000-00007fff5f400000 [ 56.0M] ---/rwx SM=NUL stack guard for thread 0
Stack 00007fff5f400000-00007fff5fc00000 [ 8192K] rw-/rwx SM=COW thread 0
Application Specific Information:
objc[57147]: garbage collection is OFF
Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
0 libsystem_c.dylib 0x00007fff897c4632 small_free_scan_madvise_free + 41
1 libsystem_c.dylib 0x00007fff897c5f06 szone_free_definite_size + 4186
2 libsystem_c.dylib 0x00007fff897fe789 free + 194
3 libR.dylib 0x0000000100222dbf R_gc_internal + 7327 (memory.c:952)
4 libR.dylib 0x0000000100224919 Rf_allocVector + 841 (memory.c:2356)
5 plyr.so 0x000000010144bd2c split_indices + 204 (split-numeric.c:23)
6 libR.dylib 0x00000001001b4cc7 do_dotcall + 16311 (dotcode.c:593)
7 libR.dylib 0x00000001001e4448 Rf_eval + 1672 (eval.c:494)
8 libR.dylib 0x00000001001e5edd do_begin + 141 (eval.c:1415)
9 libR.dylib 0x00000001001e429c Rf_eval + 1244 (eval.c:468)
10 libR.dylib 0x00000001001e93b1 Rf_applyClosure + 849 (eval.c:861)
11 libR.dylib 0x00000001001e41b2 Rf_eval + 1010 (eval.c:512)
12 libR.dylib 0x00000001001e74e5 do_set + 709 (eval.c:1717)
13 libR.dylib 0x00000001001e429c Rf_eval + 1244 (eval.c:468)
14 libR.dylib 0x00000001001e5edd do_begin + 141 (eval.c:1415)
15 libR.dylib 0x00000001001e429c Rf_eval + 1244 (eval.c:468)
16 libR.dylib 0x00000001001e93b1 Rf_applyClosure + 849 (eval.c:861)
17 libR.dylib 0x00000001001e41b2 Rf_eval + 1010 (eval.c:512)
18 libR.dylib 0x00000001001e74e5 do_set + 709 (eval.c:1717)
19 libR.dylib 0x00000001001e429c Rf_eval + 1244 (eval.c:468)
20 libR.dylib 0x00000001001e5edd do_begin + 141 (eval.c:1415)
21 libR.dylib 0x00000001001e429c Rf_eval + 1244 (eval.c:468)
22 libR.dylib 0x00000001001e429c Rf_eval + 1244 (eval.c:468)
23 libR.dylib 0x00000001001e5edd do_begin + 141 (eval.c:1415)
24 libR.dylib 0x00000001001e429c Rf_eval + 1244 (eval.c:468)
25 libR.dylib 0x00000001001e93b1 Rf_applyClosure + 849 (eval.c:861)
26 libR.dylib 0x00000001001e41b2 Rf_eval + 1010 (eval.c:512)
27 libR.dylib 0x00000001001e74e5 do_set + 709 (eval.c:1717)
28 libR.dylib 0x00000001001e429c Rf_eval + 1244 (eval.c:468)
29 libR.dylib 0x00000001001e5edd do_begin + 141 (eval.c:1415)
30 libR.dylib 0x00000001001e429c Rf_eval + 1244 (eval.c:468)
31 libR.dylib 0x00000001001e93b1 Rf_applyClosure + 849 (eval.c:861)
32 libR.dylib 0x00000001001e41b2 Rf_eval + 1010 (eval.c:512)
33 libR.dylib 0x00000001001e74e5 do_set + 709 (eval.c:1717)
34 libR.dylib 0x00000001001e429c Rf_eval + 1244 (eval.c:468)
35 libR.dylib 0x000000010021c761 R_ReplDLLdo1 + 481 (main.c:362)
36 org.R-project.R 0x0000000100022c24 run_REngineRmainloop + 196
37 org.R-project.R 0x00000001000159b7 -[REngine runREPL] + 119
38 org.R-project.R 0x0000000100001f24 main + 852
39 org.R-project.R 0x0000000100001914 start + 52
I had the same problem when pivoting a long table to a wide one using dcast from the reshape2 package. I found a solution in the post "plyr split_indices function crashes for long vectors". Specifically, download split-numeric.c and loop-apply.c from https://github.com/hadley/plyr/tree/master/src and use them to replace the files of the same name in a local copy of the plyr source. Then uninstall plyr from the R console and reinstall it from the local source: install.packages('/path/to/source', repos=NULL, type='source').
This solved my problem; I hope it helps.
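The uninstall and reinstall steps could be scripted roughly as follows. This is only a sketch: the helper name reinstall_plyr_from_source is made up for illustration, '/path/to/source' stands for wherever the patched plyr source lives, and it assumes the tools needed to build packages from source are installed.

```r
# Hypothetical helper: swap the installed plyr for a locally patched source copy.
reinstall_plyr_from_source <- function(src_path) {
  # Fail early rather than removing plyr when there is nothing to install.
  if (!file.exists(src_path)) {
    stop("plyr source not found at: ", src_path)
  }
  # Remove the currently installed copy, if there is one.
  if ("plyr" %in% rownames(installed.packages())) {
    remove.packages("plyr")
  }
  # repos = NULL makes install.packages treat the argument as a local path.
  install.packages(src_path, repos = NULL, type = "source")
}

# Usage (not run here):
# reinstall_plyr_from_source("/path/to/source")
```

Restart R afterwards so that reshape2 picks up the rebuilt plyr rather than the copy already loaded in the session.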