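Sorry, I'm new to R and would really appreciate some help with this. I'm trying to merge the following two data frames (labour productivity and depression) by time: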
  Time     LabourProductivity
1 2004 Q1  96.6
2 Q2       96.9
3 Q3       96.9
4 Q4       97.1
5 2005 Q1  97.6
6 Q2       99.0

and
  Time     DepressionCount
1 2004     875
2 2004.25  820
3 2004.5   785
4 2004.75  857
5 2005     844
6 2005.25  841

Since the two Time columns hold differently formatted values, I don't know how to merge them. Ideally the result would be:
  Time  DepressionCount  LabourProductivity
1 2004  875              96.6
2 2004  820              96.9
3 2004  785              96.9
4 2004  857              97.1
5 2005  844              97.6
6 2005  841              99.0

Posted on 2015-03-06 21:21:11
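If "df1" and "df2" are the first and second datasets, create a grouping index ("indx") based on the "Time" column of "df1". Then use sub, paste and ave to fill in the missing years, so that the "Time" column is in a format that "as.yearqtr" accepts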
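To make the grouping index concrete, here is a small base R sketch of what it looks like and how the missing years get filled in. The names `time_labels`, `starts_year`, `years` and `full_labels` are hypothetical and used only for illustration; the values are copied from "df1".

```r
# Hypothetical copy of the Time labels from df1
time_labels <- c("2004 Q1", "Q2", "Q3", "Q4", "2005 Q1", "Q2")

# TRUE where a label starts with digits, i.e. where a new year begins
starts_year <- grepl("^\\d+", time_labels)

# The running sum gives every row the number of its year group
indx <- cumsum(starts_year)
print(indx)

# Year of each group, taken from the group's first label
years <- sub(" .*", "", time_labels[starts_year])

# Prefix the year onto the labels that lack one
full_labels <- ifelse(starts_year, time_labels,
                      paste(years[indx], time_labels))
print(full_labels)
```

Every label now has the form "2004 Q2", which is the shape that as.yearqtr from zoo can parse.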
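library(zoo)
# Grouping index: increases by one at every row whose Time starts with a year
indx <- cumsum(grepl('^\\d+', df1$Time))
# Within each group, prefix the year from the first row onto the remaining rows,
# convert to yearqtr, then to numeric (e.g. 2004.25)
df1$Time <- with(df1, as.numeric(ave(Time, indx, FUN = function(x) {
    x[-1] <- paste(sub(' .*', '', x[1]), x[-1])
    as.yearqtr(x) })))

Then merge the datasets and, if needed, transform the "Time" column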
transform(merge(df1, df2), Time=trunc(Time))
# Time LabourProductivity DepressionCount
#1 2004 96.6 875
#2 2004 96.9 820
#3 2004 96.9 785
#4 2004 97.1 857
#5 2005 97.6 844
#6 2005 99.0 841

Or using data.table
library(data.table)
setDT(df1)[, TimeN:= as.numeric(as.yearqtr(c(Time[1L],
paste(sub(' .*', '', Time[1L]), Time[-1L])))),
list(Grp=cumsum(grepl('^\\d+', Time)))][,
Time:= TimeN][, TimeN:=NULL][]
setkey(df1, Time)[df2][, Time:=trunc(Time)][]
# Time LabourProductivity DepressionCount
#1: 2004 96.6 875
#2: 2004 96.9 820
#3: 2004 96.9 785
#4: 2004 97.1 857
#5: 2005 97.6 844
#6: 2005 99.0 841

Data
df1 <- structure(list(Time = c("2004 Q1", "Q2", "Q3", "Q4", "2005 Q1",
"Q2"), LabourProductivity = c(96.6, 96.9, 96.9, 97.1, 97.6, 99
)), .Names = c("Time", "LabourProductivity"), class = "data.frame",
row.names = c("1", "2", "3", "4", "5", "6"))
df2 <- structure(list(Time = c(2004, 2004.25, 2004.5, 2004.75, 2005,
2005.25), DepressionCount = c(875L, 820L, 785L, 857L, 844L, 841L
)), .Names = c("Time", "DepressionCount"), class = "data.frame",
row.names = c("1", "2", "3", "4", "5", "6"))

https://stackoverflow.com/questions/28907680
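As a cross-check on the answer above, the same conversion can be sketched in base R without zoo, under the assumption that every quarter label ends in "Q" followed by a digit from 1 to 4. This is an illustrative alternative, not the answer's own method; `starts_year`, `years`, `quarter` and `merged` are hypothetical names.

```r
# Same data as above, rebuilt with data.frame() so the sketch is self-contained
df1 <- data.frame(Time = c("2004 Q1", "Q2", "Q3", "Q4", "2005 Q1", "Q2"),
                  LabourProductivity = c(96.6, 96.9, 96.9, 97.1, 97.6, 99),
                  stringsAsFactors = FALSE)
df2 <- data.frame(Time = c(2004, 2004.25, 2004.5, 2004.75, 2005, 2005.25),
                  DepressionCount = c(875L, 820L, 785L, 857L, 844L, 841L))

# Rows that start a new year, and the year carried by each of those rows
starts_year <- grepl("^\\d+", df1$Time)
years <- as.numeric(sub(" .*", "", df1$Time[starts_year]))

# Quarter number is the digit after "Q" (assumed to be 1 to 4)
quarter <- as.numeric(sub(".*Q", "", df1$Time))

# Same numeric encoding as df2: year + (quarter - 1) / 4
df1$Time <- years[cumsum(starts_year)] + (quarter - 1) / 4

# Merge on the numeric Time, then keep only the year
merged <- merge(df1, df2, by = "Time")
merged$Time <- trunc(merged$Time)
print(merged)
```

Because quarters map to multiples of 0.25, which are exactly representable as doubles, the numeric keys in the two data frames compare equal and merge matches all six rows.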