How to repeatedly generate non-repeating smaller groups from a larger set

I see that the OP has already adopted a solution from the linked math.so post, but I would like to offer a working implementation of the other answer on that page, which gets to the heart of this problem. That answer mentions the round-robin tournament. From the Wikipedia page, the algorithm is straightforward.

One simply fixes a position in a matrix and rotates the other indices clockwise. Given M initial players (M even), there are M - 1 unique rounds. Thus, with the 10 students in this question, we can obtain at most 9 unique sets of groups.

Below is a very straightforward base R implementation:

roll <- function( x , n ){
    if( n == 0 )
        return(x)
    c(tail(x,n), head(x,-n))
}

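RoundRobin <- function(m, n) {
    m <- as.integer(m)
    n <- as.integer(n)
    
    ## Pad an odd m to the next even number so the 2-row matrix fills completely
    if (m %% 2L != 0L) {
        m <- m + 1L
    }
    
    ## Pre-allocate one list slot per round
    myRounds <- vector("list", n)
    myRounds[[1]] <- 1:m
    
    ## seq_len(n)[-1] is empty when n == 1, whereas 2:n would count down
    for (i in seq_len(n)[-1]) {
        myRounds[[i]] <- myRounds[[i - 1L]]
        ## Keep position 1 fixed and rotate the remaining positions by one
        myRounds[[i]][2:m] <- roll(myRounds[[i]][-1], 1)
    }
    
    ## Each column of a 2-row matrix is one pair
    lapply(myRounds, matrix, nrow = 2)
}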

The roll function was obtained from this answer.
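
As a quick illustration of what roll does (a minimal sketch that repeats the definition so it runs on its own), it moves the last n elements of a vector to the front. The n == 0 guard matters because tail(x, 0) and head(x, 0) are both empty, so without it the result would be an empty vector:

```r
roll <- function(x, n) {
    if (n == 0)
        return(x)
    c(tail(x, n), head(x, -n))
}

roll(1:5, 1)   # 5 1 2 3 4
roll(1:5, 2)   # 4 5 1 2 3
roll(1:5, 0)   # 1 2 3 4 5
```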

Here is sample output for 10 students and 4 weeks:

RoundRobin(10, 4)
[[1]]
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    3    5    7    9
[2,]    2    4    6    8   10

[[2]]
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    2    4    6    8
[2,]   10    3    5    7    9

[[3]]
     [,1] [,2] [,3] [,4] [,5]
[1,]    1   10    3    5    7
[2,]    9    2    4    6    8

[[4]]
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    9    2    4    6
[2,]    8   10    3    5    7

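When we hit the 10th week, we see our first repeated "round": week 10 is identical to week 1. Note that individual pairs can recur sooner with this column-wise layout. For example, 3 and 4 share a column in both week 1 and week 3 above.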

RoundRobin(10, 13)[c(1, 2, 9, 10, 11)]
[[1]]
     [,1] [,2] [,3] [,4] [,5]   ## <- first week
[1,]    1    3    5    7    9
[2,]    2    4    6    8   10

[[2]]
     [,1] [,2] [,3] [,4] [,5]   ## <- second week
[1,]    1    2    4    6    8
[2,]   10    3    5    7    9

[[3]]
     [,1] [,2] [,3] [,4] [,5]   ## <- ninth week
[1,]    1    4    6    8   10
[2,]    3    5    7    9    2

[[4]]
     [,1] [,2] [,3] [,4] [,5]   ## <- tenth week
[1,]    1    3    5    7    9
[2,]    2    4    6    8   10

[[5]]
     [,1] [,2] [,3] [,4] [,5]   ## <- eleventh week
[1,]    1    2    4    6    8
[2,]   10    3    5    7    9
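
If the goal is that no two students ever share a pair twice within the M - 1 rounds, one option is the folded "circle method" layout described on the Wikipedia round-robin page: fix player 1, rotate the others, then pair position k with position m + 1 - k. The following is only a sketch under that assumption. The name round_robin_pairs is hypothetical and this is a variant, not the RoundRobin function above:

```r
## Hypothetical helper: circle-method pairing, one 2-row matrix per round.
round_robin_pairs <- function(m) {
    m <- as.integer(m)
    if (m %% 2L != 0L) m <- m + 1L        # pad odd m to an even count
    others <- seq_len(m)[-1]              # everyone except the fixed player 1
    half <- m %/% 2L
    lapply(seq_len(m - 1L), function(r) {
        k <- r - 1L                       # rotation amount for round r
        rotated <- if (k == 0L) others else c(tail(others, k), head(others, -k))
        line <- c(1L, rotated)
        ## Fold the line in half: first half on top, reversed second half below
        rbind(line[seq_len(half)], rev(line)[seq_len(half)])
    })
}

rounds <- round_robin_pairs(10)

## Collect every pair as a sorted "a-b" label and count the distinct ones
pairs <- unlist(lapply(rounds, function(r) {
    apply(r, 2, function(p) paste(sort(p), collapse = "-"))
}))
length(rounds)          # 9
length(unique(pairs))   # 45, i.e. choose(10, 2)
```

With 10 students this covers all 45 possible pairs exactly once across the 9 rounds, so a 10th week necessarily repeats pairs.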

Note that this is a deterministic algorithm and, given its simplicity, it is pretty efficient. For example, if you have 1000 students and want all 999 unique rounds, you can run this function without fear:

system.time(RoundRobin(1000, 999))
   user  system elapsed 
  0.038   0.001   0.039

I think you may want something like this. It produces a data frame with one unique combination per row. The combinations are sampled randomly until all unique combinations are exhausted. Thereafter, if more samples are required, it samples randomly with replacement from the unique combinations:

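library(magrittr)  # provides %>% and the `.` placeholder used below

create_groups <- function(M, N, samples)
{
  # Build all length(M)^N ordered selections of N elements, one per row
  df <- seq(N) %>%
          lapply(function(x) M) %>%
          do.call(expand.grid, .) %>%
          apply(1, sort) %>%        # sort within each row so order is ignored
          t() %>%
          as.data.frame() %>%
          unique()
  
  # Drop rows that use the same element more than once
  df <- df[apply(df, 1, function(x) !any(duplicated(x))), ]
  
  # Shuffle the unique combinations
  df <- df[sample(nrow(df)), ]
  
  # Enough unique combinations: return the first `samples` rows
  if(samples <= nrow(df)) return(df[seq(samples), ])
  
  # Otherwise append extra rows resampled with replacement
  rbind(df, df[sample(seq(nrow(df)), samples - nrow(df), TRUE), ])
}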

It's easy to see how it works if we want groups of 4 elements from 5 objects (there are only 5 possible combinations):

create_groups(letters[1:5], 4, 5)
#>   V1 V2 V3 V4
#> 1  a  b  d  e
#> 2  a  b  c  d
#> 3  a  c  d  e
#> 4  b  c  d  e
#> 5  a  b  c  e

Each row is a sample of 4 objects drawn from the set. The rows come in random order, with no repeats. (The elements within each sample are, however, sorted alphabetically.)

If we want more than 5 samples, the algorithm ensures that all unique combinations are exhausted before resampling:

create_groups(letters[1:5], 4, 6)
#>   V1 V2 V3 V4
#> 1  a  b  c  e
#> 2  a  c  d  e
#> 3  a  b  d  e
#> 4  b  c  d  e
#> 5  a  b  c  d
#> 6  a  b  d  e

Here we see there are no repeated rows until row 6, which is a repeat of row 3.

For the example in your question, there are 45 unique combinations of 2 elements drawn from 10 objects, so we get no repeats in our 13 samples:

create_groups(1:10, 2, 13)
#>    V1 V2
#> 1   7  8
#> 2   4 10
#> 3   2  8
#> 4   3 10
#> 5   3  9
#> 6   1  8
#> 7   4  9
#> 8   8  9
#> 9   7  9
#> 10  4  6
#> 11  5  7
#> 12  9 10
#> 13  4  7
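
The combination counts quoted above can be checked directly in base R. This is just a sanity-check sketch. It also shows how many rows the expand.grid step builds before filtering, which is worth keeping in mind for larger inputs:

```r
choose(5, 4)          # 5 unique groups of 4 from 5 objects
choose(10, 2)         # 45 unique pairs from 10 objects
ncol(combn(10, 2))    # 45, combn stores one combination per column

## Rows built by expand.grid before sorting and de-duplicating
5^4                   # 625
10^2                  # 100
```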

I am not sure whether combn + sample can work for your goal:

as.data.frame(t(combn(M, N))[sample(K <- choose(length(M), N), i, replace = K < i), ])

which gives

   V1 V2
1   4  9
2   4  8
3   1  9
4   6 10
5   5  9
6   2 10
7   3  7
8   7  8
9   6  7
10  1  7
11  6  8
12  5  6
13  3  8
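
The one-liner above relies on M, N and i already existing in the workspace. A self-contained sketch of the same idea might look like the following, where the wrapper name sample_groups and the values M = 1:10, N = 2, i = 13 are assumptions chosen to match the output shown, not something stated in the answer:

```r
## Hypothetical wrapper around the combn + sample idea
sample_groups <- function(M, N, i) {
    combos <- t(combn(M, N))      # one combination per row
    K <- nrow(combos)             # number of unique combinations
    ## Sample without replacement when there are enough combinations
    idx <- sample(K, i, replace = K < i)
    as.data.frame(combos[idx, , drop = FALSE])
}

set.seed(1)                       # only to make the draw reproducible
res <- sample_groups(1:10, 2, 13)
nrow(res)                         # 13
anyDuplicated(res)                # 0, since 13 <= 45 draws are without replacement
```

One design difference from create_groups: when i exceeds the number of unique combinations, this version samples with replacement from the start, so a repeat can appear before every combination has been used.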
