I have a dataframe with a names column, an arrival date (type Date) and a price (type numeric), as shown below:
names Datearriv price
SUV 2019-01-16 84,35
HOR 2020-02-28 130,45
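SUV 2019-01-16 235,12
I have a problem in my R project. I want to compare the rows by name; if the names are the same, I compare the arrival dates; and if those are also the same, I compare the accommodation prices. The program must delete the row with the highest accommodation price, i.e. the row that includes tax.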
I wrote this code, but it throws an error. Here is my code and the dataframe:
for(i in 1:96){
  nom = data[i,2]
  date = data[i,4]
  prix = data[i,8]
  z = i+1
  for(j in z:97){
    if(data[j,2] == nom){
      if(data[j,4] == date){
        if(data[j,8]>prix){
          data = data[-j,]
          else{data = data[-i]}
        }
      }
    }
  }
}
Posted on 2020-05-02 01:43:50
Here is one possibility:
d <- read.table(text="
names Datearriv price
SUV 2019-01-16 84,35
HOR 2020-02-28 130,45
SUV 2019-01-16 235,12", header=TRUE
)
library(dplyr)
# Note that your variable is not numeric yet
d <- d %>%
  mutate(price = as.numeric(gsub(",", ".", price)))
# now we can filter
d <- d %>%
  group_by(names, Datearriv) %>%        # define the groups
  add_count(name = "N") %>%             # count the observations per group
  mutate(max = max(price)) %>%          # find the max price per group
  filter(!(max == price & N > 1)) %>%   # drop a row if its price is the group max and the group has more than one row
  dplyr::select(names:price)            # select relevant columns
d
# A tibble: 2 x 3
# Groups: names, Datearriv [2]
names Datearriv price
<fct> <fct> <dbl>
1 SUV 2019-01-16 84.4
2 HOR 2020-02-28 130.
https://stackoverflow.com/questions/61546970
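As an alternative to the dplyr pipeline above, the same idea can be sketched in base R. This is only a sketch, and it assumes that a group of rows sharing `names` and `Datearriv` should be reduced to its single cheapest row. For a pair of duplicates, as in the sample data, that matches "delete the row with the highest price". For groups of three or more rows, or for tied prices, it behaves differently from the dplyr version. The names `d_sorted` and `cheapest` are made up for this illustration.

```r
# Same sample data as in the answer; stringsAsFactors = FALSE keeps text columns as character
d <- read.table(text = "
names Datearriv price
SUV 2019-01-16 84,35
HOR 2020-02-28 130,45
SUV 2019-01-16 235,12", header = TRUE, stringsAsFactors = FALSE)

# Replace the decimal comma with a decimal point so price becomes numeric
d$price <- as.numeric(gsub(",", ".", d$price, fixed = TRUE))

# Sort by the grouping keys and then by price, so the cheapest row of each group comes first
d_sorted <- d[order(d$names, d$Datearriv, d$price), ]

# Assumption: keep only the first (cheapest) row per names/Datearriv combination
cheapest <- d_sorted[!duplicated(d_sorted[, c("names", "Datearriv")]), ]
rownames(cheapest) <- NULL
cheapest
```

Because the rows are sorted by price within each group, `duplicated()` flags every row except the cheapest one, so no explicit loop and no row deletion during iteration are needed.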