Fast sorted sample with replacement
Implementing @Frank's idea in Rcpp:
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
IntegerVector sort_sample(int n) {
IntegerVector tab(n);
for (int i = 0; i < n; i++) {
int k = n * unif_rand();
tab[k]++;
}
return tab;
}
Benchmark:
microbenchmark::microbenchmark(
sort(sample(n, replace = TRUE)),
tabulate(sample(n, replace = TRUE), n),
sort_sample(n)
)
For n = 1e5:
Unit: microseconds
expr min lq mean median uq max neval
sort(sample(n, replace = TRUE)) 3214.726 3393.606 3667.1001 3507.0525 3716.3560 7007.411 100
tabulate(sample(n, replace = TRUE), n) 1087.543 1104.215 1245.0722 1136.9085 1218.5865 4265.075 100
sort_sample(n) 789.403 812.780 870.2723 837.3445 924.4395 1188.432 100
For n = 1e6:
Unit: milliseconds
expr min lq mean median uq max neval
sort(sample(n, replace = TRUE)) 49.96651 52.58784 61.19775 55.09312 58.43035 160.16275 100
tabulate(sample(n, replace = TRUE), n) 13.74391 14.44253 17.22742 15.99816 17.54367 48.72374 100
sort_sample(n) 12.80741 14.40371 17.98320 15.31699 17.53548 63.21692 100