I have the following data:
ID | Item |
1 | A |
1 | A |
1 | A |
1 | B |
2 | B |
2 | B |
1 | B |
1 | C |
2 | A |
2 | A |
3 | C |
3 | B |
3 | C |
3 | B |
3 | A |
2 | C |

I want to take the two most frequent items for each ID and encode them. The desired result is:
ID | Item A | Item B | Item C |
1 | 1 | 1 | 0 |
2 | 1 | 1 | 0 |
3 | 0 | 1 | 1 |

As long as an item is in the top 2 for an ID, it gets a 1. How can I do this in R? I was thinking of group_by %>% top_n(n = 2).

Posted on 2020-02-25 03:42:15
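Here is an idea. One thing to consider is what you want to do if there are ties. Here, I sort the rows with arrange and then slice the first two rows for each ID, so ties are broken alphabetically by Item. You may want to come up with another way to handle ties.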
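library(tidyverse)

dat2 <- dat %>%
  count(ID, Item) %>%              # occurrences of each Item within each ID
  arrange(ID, desc(n), Item) %>%   # most frequent first; ties ordered by Item
  group_by(ID) %>%
  slice(1:2) %>%                   # keep the first two rows per ID
  mutate(n = 1) %>%                # replace the count with an indicator
  pivot_wider(names_from = Item, values_from = n, values_fill = list(n = 0)) %>%
  ungroup()
dat2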
# # A tibble: 3 x 4
#      ID     A     B     C
#   <int> <dbl> <dbl> <dbl>
# 1     1     1     1     0
# 2     2     1     1     0
# 3     3     0     1     1

Data
dat <- read.table(text = "ID Item
1 A
1 A
1 A
1 B
2 B
2 B
1 B
1 C
2 A
2 A
3 C
3 B
3 C
3 B
3 A
2 C",
header = TRUE, stringsAsFactors = FALSE)

Posted on 2020-02-25 03:46:02
library(tidyverse)

# df is assumed to hold the data from the question
df %>%
  group_by(ID) %>%
  count(Item) %>%
  top_n(2, n) %>%
  ungroup() %>%
  pivot_wider(names_from = Item, values_from = n,
              values_fn = list(n = ~ 1),
              values_fill = list(n = 0))
# # A tibble: 3 x 4
#      ID     A     B     C
#   <int> <dbl> <dbl> <dbl>
# 1     1     1     1     0
# 2     2     1     1     0
# 3     3     0     1     1

Description
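values_fn = list(n = ~ 1): converts each count to 1 inside pivot_wider; this is equivalent to adding mutate(n = 1) before pivoting.
values_fill = list(n = 0): specifies 0 as the value to fill in where an ID/Item combination is missing.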
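
On the question of ties raised in the first answer: top_n() keeps every row tied at the cutoff, so an ID can end up with more than two items flagged. The following is only a minimal sketch, assuming dplyr 1.0.0 or later (where slice_max() is available). The data frame `ties` is made up for illustration and is not part of the question.

```r
library(dplyr)

# Hypothetical data: for ID 1, items B and C tie for second place.
ties <- data.frame(
  ID   = c(1, 1, 1, 1, 1, 1, 1),
  Item = c("A", "A", "A", "B", "B", "C", "C"),
  stringsAsFactors = FALSE
)

counts <- count(ties, ID, Item)

# top_n() keeps all rows tied at the cutoff, so three items survive.
kept_with_ties <- counts %>%
  group_by(ID) %>%
  top_n(2, n) %>%
  ungroup()

# slice_max(with_ties = FALSE) returns exactly two rows per ID.
kept_no_ties <- counts %>%
  group_by(ID) %>%
  slice_max(n, n = 2, with_ties = FALSE) %>%
  ungroup()

nrow(kept_with_ties)  # 3
nrow(kept_no_ties)    # 2
```

Which behaviour is right depends on what "top 2" should mean when counts are equal. If exactly two columns per ID must be set to 1, something like slice_max(with_ties = FALSE) or the arrange plus slice approach is probably the safer choice.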
https://stackoverflow.com/questions/60387115