Q: Creating combinations by group in R

Stack Overflow user

Asked 2019-08-31 04:05:27

Answers: 6, Views: 3.6K, Followers: 0, Votes: 11

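I want to create a list for my classroom of every possible group of 4 students. If I have 20 students, how can I create this, by group, in R, where each row is one combination and there are 20 columns for the full list of student ids, with columns 1-4 being "group1", 5-8 being "group2", and so on?

The code below lists the possible combinations for a single group of 4 students (x1, x2, x3, and x4). Now, for each row listed, what are the possibilities for the other 4 groups of 4 students? So there should be 20 columns (Group1_1:4, Group2_1:4, Group3_1:4, Group4_1:4, Group5_1:4).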

combn(c(1:20), m = 4)

Expected output

Combination 1 = Group1[1, 2, 3, 4] Group2[5, 6, 7, 8], Group3[9, 10, 11, 12], etc. 
Combination 2 = Group1[1, 2, 3, 5]... etc.

There are a lot of posts about combinations, and this may already have been answered, but I just can't find it. Any help is appreciated!

algorithm

statistics

combinations

6 Answers

Stack Overflow user

Answered 2019-09-09 05:46:46

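Computationally, this is a challenging problem, since I believe there are about 2.5 billion possibilities to enumerate. (If that is mistaken, I'd welcome any insight into where this approach goes wrong.)

Depending on how it is stored, a table with all those groupings might require more RAM than most computers can handle. I'd be impressed to see an efficient way to create it. If we took a "create one combination at a time" approach, it would still take about 42 minutes to generate all the possibilities if we could generate 1,000,000 per second, or about a month if we could only generate 1,000 per second.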
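The estimates above are plain arithmetic and can be checked directly. The snippet below is a rough sketch, not part of the original answer; the storage figure assumes a dense table with 20 columns of 4-byte integers, which is only one of several possible layouts.

```r
total <- 2546168625                      # number of groupings derived in this answer

minutes_fast <- total / 1e6 / 60         # at 1,000,000 groupings per second: about 42.4
days_slow    <- total / 1e3 / 86400      # at 1,000 groupings per second: about 29.5

# Assumption: one row per grouping, 20 columns, 4 bytes per cell
size_gb <- total * 20 * 4 / 1e9          # about 204 GB

c(minutes_fast = minutes_fast, days_slow = days_slow, size_gb = size_gb)
```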

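Edit: added a partial implementation at the bottom that creates any desired grouping from #1 to #2,546,168,625. For some purposes this may be almost as good as actually storing the whole sequence, which is very large.

Suppose we are going to make 5 groups of 4 students each: Group A, B, C, D, and E.

Let's define Group A as the group that Student #1 is in. They can be paired with any three of the other 19 students. I believe there are 969 such combinations of other students: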

> nrow(t(combn(1:19, 3)))
[1] 969

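There are now 16 students left for the other groups. Let's assign the first student who is not already in Group A to Group B. That might be student 2, 3, 4, or 5. It doesn't matter; all we need to know is that only 15 students can be paired with that student. There are 455 such combinations: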

> nrow(t(combn(1:15, 3)))
[1] 455

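Now there are 12 students left. Again, let's assign the first ungrouped student to Group C, which leaves 165 combinations for that student and the other 11: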

> nrow(t(combn(1:11, 3)))
[1] 165

We have 8 students left, 7 of whom can be paired with the first ungrouped student in 35 ways to form Group D:

> nrow(t(combn(1:7, 3)))
[1] 35

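Then, once the other groups are determined, only one group of four students is left, three of whom are paired with the first ungrouped student: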

> nrow(t(combn(1:3, 3)))
[1] 1

That implies 2.546 billion combinations:

> 969*455*165*35*1
[1] 2546168625
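As a cross-check, the per-group counts above follow a single pattern: with `r` students still ungrouped, the first of them can be joined by `choose(r - 1, 3)` sets of partners. The sketch below (variable names are illustrative, not from the original answer) reproduces the product and compares it with the standard closed form for splitting `n` items into unlabeled groups of size `k`, namely `n! / (k!^(n/k) * (n/k)!)`. Because `factorial(20)` exceeds the range in which doubles are exact, the comparison uses `all.equal` rather than `==`.

```r
n <- 20   # students
k <- 4    # group size

remaining <- seq(n, k, by = -k)             # students left before each group: 20 16 12 8 4
per_group <- choose(remaining - 1, k - 1)   # 969 455 165 35 1
total     <- prod(per_group)                # 2546168625

# Closed-form count of unordered partitions into groups of size k
closed_form <- factorial(n) / (factorial(k)^(n / k) * factorial(n / k))

isTRUE(all.equal(total, closed_form))       # TRUE
```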

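Below is a work-in-progress function that generates a grouping from an arbitrary sequence number. Two pieces are needed:

1) [In progress] Convert the sequence number into a vector describing which combination number should be used for Groups A, B, C, D, and E. For instance, this should convert #1 to c(1, 1, 1, 1, 1) and #2,546,168,625 to c(969, 455, 165, 35, 1).

2) Convert those combination numbers into specific output describing the students in each group.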

groupings <- function(seq_nums, students = 20, group_size = 4) {
  n_groups <- students / group_size
  grouped <- NULL
  remaining <- seq_len(students)
  # The last group always uses the only possible combination
  seq_nums_pad <- c(seq_nums, 1)
  for (g in seq_len(n_groups)) {
    # Position 1 is the first ungrouped student; the other members come from
    # row seq_nums_pad[g] of the combinations of the remaining positions
    group_relative <-
      c(1, 1 + t(combn(1:(length(remaining) - 1), group_size - 1))[seq_nums_pad[g], ])
    group <- remaining[group_relative]
    print(group)
    grouped <- c(grouped, group)
    remaining <- setdiff(remaining, grouped)
  }
}

> groupings(c(1,1,1,1))
#[1] 1 2 3 4
#[1] 5 6 7 8
#[1]  9 10 11 12
#[1] 13 14 15 16
#[1] 17 18 19 20
> groupings(c(1,1,1,2))
#[1] 1 2 3 4
#[1] 5 6 7 8
#[1]  9 10 11 12
#[1] 13 14 15 17
#[1] 16 18 19 20
> groupings(c(969, 455, 165, 35))   # This one uses the last possibility for
#[1]  1 18 19 20                    #   each grouping.
#[1]  2 15 16 17
#[1]  3 12 13 14
#[1]  4  9 10 11
#[1] 5 6 7 8
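Step 1 of the plan above is left unfinished in the answer. One way it might be sketched is as mixed-radix decoding: treat the per-group combination numbers as digits whose bases are 969, 455, 165, and 35, with the last digit varying fastest. The helper name `seq_to_indices` and the digit order are assumptions made for illustration; the answer does not specify them. Doubles are used throughout because 2,546,168,625 is larger than R's maximum integer.

```r
# Hypothetical helper (not from the original answer): decode a sequence number
# in 1..prod(radices) into one combination index per group.
seq_to_indices <- function(n, radices = c(969, 455, 165, 35)) {
  stopifnot(n >= 1, n <= prod(radices))
  n0 <- n - 1                            # 0-based value for modular arithmetic
  idx <- numeric(length(radices))
  for (g in rev(seq_along(radices))) {   # last group is the fastest-varying digit
    idx[g] <- n0 %% radices[g] + 1       # back to a 1-based combination index
    n0 <- n0 %/% radices[g]
  }
  idx
}

seq_to_indices(1)            # 1 1 1 1
seq_to_indices(2)            # 1 1 1 2
seq_to_indices(2546168625)   # 969 455 165 35
```

The result has four elements rather than five because the last group has only one possible combination; it has the same shape as the argument passed to `groupings()` in the examples above.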

Votes 5

Stack Overflow user

Answered 2019-09-09 10:50:51

This relies heavily on this answer:

Algorithm that can create all combinations and all groups of those combinations

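One thing to note is that the linked answer is not very dynamic: it only includes a solution for groups of 3. To make it more robust, we can generate the code from the input parameters. That is, the following recursive function, shown here for groups of 3, is created dynamically: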

group <- function(input, step){
 len <- length(input) 
 combination[1, step] <<- input[1] 

 for (i1 in 2:(len-1)) { 
   combination[2, step] <<- input[i1] 

   for (i2 in (i1+1):(len-0)) { 
     combination[3, step] <<- input[i2] 

     if (step == m) { 
       print(z); result[z, ,] <<- combination 
       z <<- z+1 
     } else { 
       rest <- setdiff(input, input[c(i1,i2, 1)]) 
       group(rest, step +1) #recursive if there are still additional possibilities
   }} 
 } 
}

For N = 16 and k = 4, this takes about 55 seconds. I'd like to translate it to Rcpp, but unfortunately I don't have that skill set.

group_N <- function(input, k = 2) {
  N = length(input)
  m = N/k
  combos <- factorial(N) / (factorial(k)^m * factorial(m))

  result <- array(NA_integer_, dim = c(combos, m, k))
  combination = matrix(NA_integer_, nrow = k, ncol = m)

  z = 1

  group_f_start = 'group <- function(input, step){\n len <- length(input) \n combination[1,  step] <<- input[1] \n '
  i_s <- paste0('i', seq_len(k-1))

  group_f_fors = paste0('for (', i_s, ' in ', c('2', if (length(i_s) != 1) {paste0('(', i_s[-length(i_s)], '+1)')}), ':(len-', rev(seq_len(k)[-k])-1, ')) { \n combination[', seq_len(k)[-1], ', step] <<- input[', i_s, '] \n', collapse = '\n ')

  group_f_inner = paste0('if (step == m) { \n result[z, ,] <<- combination \n z <<- z+1 \n } else { \n rest <- setdiff(input, input[c(',
                         paste0(i_s, collapse = ','),
                         ', 1)]) \n group(rest, step +1) \n }')

  eval(parse(text = paste0(group_f_start, group_f_fors, group_f_inner, paste0(rep('}', times = k), collapse = ' \n '))))

  group(input, 1)
  return(result)
}

Performance

system.time({test_1 <- group_N(seq_len(4), 2)})
#   user  system elapsed 
#   0.01    0.00    0.02
library(data.table)

#this funky step is just to better show the groups. the provided
## array is fine.

as.data.table(t(rbindlist(as.data.table(apply(test_1, c(1,3), list)))))
#    V1  V2
#1: 1,2 3,4
#2: 1,3 2,4
#3: 1,4 2,3

system.time({test_1 <- group_N(seq_len(16), 4)})
#   user  system elapsed 
#  55.00    0.19   55.29 

as.data.table(t(rbindlist(as.data.table(apply(test_1, c(1,3), list)))))
#very slow
#                  V1          V2          V3          V4
#      1:     1,2,3,4     5,6,7,8  9,10,11,12 13,14,15,16
#      2:     1,2,3,4     5,6,7,8  9,10,11,13 12,14,15,16
#      3:     1,2,3,4     5,6,7,8  9,10,11,14 12,13,15,16
#      4:     1,2,3,4     5,6,7,8  9,10,11,15 12,13,14,16
#      5:     1,2,3,4     5,6,7,8  9,10,11,16 12,13,14,15
#     ---                                                
#2627621:  1,14,15,16  2,11,12,13  3, 6, 9,10     4,5,7,8
#2627622:  1,14,15,16  2,11,12,13     3,7,8,9  4, 5, 6,10
#2627623:  1,14,15,16  2,11,12,13  3, 7, 8,10     4,5,6,9
#2627624:  1,14,15,16  2,11,12,13  3, 7, 9,10     4,5,6,8
#2627625:  1,14,15,16  2,11,12,13  3, 8, 9,10     4,5,6,7
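For comparison, the same recursion (fix the first remaining element, choose its partners, recurse on the rest) can be written without generating code through `eval(parse())`, by letting `combn` pick the partners. This is an alternative sketch rather than the answer's method: the function name `partitions` is made up for illustration, it returns a nested list instead of an array, and it has not been benchmarked against `group_N`. It is only practical for small inputs, since the full result is held in memory.

```r
# Illustrative alternative (not the answer's code): enumerate every way to split
# x into unordered groups of size k. Assumes k >= 2 and length(x) divisible by k.
partitions <- function(x, k) {
  stopifnot(k >= 2, length(x) %% k == 0)
  if (length(x) == k) return(list(list(x)))     # base case: a single group remains
  first <- x[1]                                  # the first element anchors this group
  rest  <- x[-1]
  mates <- combn(length(rest), k - 1)            # positions in `rest` of its partners
  out <- list()
  for (j in seq_len(ncol(mates))) {
    grp <- c(first, rest[mates[, j]])
    # Recurse on whatever is left and prepend the current group to each result
    for (p in partitions(rest[-mates[, j]], k)) {
      out[[length(out) + 1]] <- c(list(grp), p)
    }
  }
  out
}

res <- partitions(1:4, 2)
length(res)                  # 3, matching the three rows shown for group_N(seq_len(4), 2)
length(partitions(1:8, 4))   # 35
```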

Votes 5

Stack Overflow user

Answered 2019-08-31 20:06:15

The following code works.

# Create the vector of 20 student ids (named `ids` to avoid masking base::list)
ids <- 1:20

# Generate all ordered 4-tuples, including repetitions
combos <- data.frame(expand.grid(rep(list(ids), 4)))
combos$combo <- paste(combos$Var1, combos$Var2, combos$Var3, combos$Var4)
# Remove tuples in which a student appears more than once
combos <- subset(combos, Var1 != Var2 & Var1 != Var3 & Var1 != Var4 &
                         Var2 != Var3 & Var2 != Var4 & Var3 != Var4)

# Create common group labels (ex. abc, acb, bac, bca, cab, cba would all have "abc" as their group label).
combos$group <- vapply(strsplit(combos$combo, " "),
                       function(x) paste(sort(x), collapse = " "),
                       character(1))

# Sort by common group label and number the 24 orderings within each group
combos <- combos[order(combos$group), ]
combos[c("Var1", "Var2", "Var3", "Var4")] <- NULL
combos$rank <- rep(1:24, length.out = nrow(combos))

# Pivot: one row per group, one column per ordering
combos <- reshape(data = combos, idvar = "group", v.names = "combo", timevar = "rank", direction = "wide")
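A quick size check helps when reading the code above. The figures below are simple counting, written as a small sketch with illustrative variable names: the filtered table holds every ordered 4-tuple of distinct students, each unordered group appears in 24 orderings, and so the pivoted result has one row per 4-student group.

```r
n_ordered <- prod(20:17)            # ordered 4-tuples of distinct students: 116280
n_orders  <- factorial(4)           # orderings of any one group of 4: 24
n_groups  <- n_ordered / n_orders   # distinct 4-student groups: 4845

n_groups == choose(20, 4)           # TRUE
```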

Votes 0

Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/57732672
