我有一份数据
I_Code Date_1 Date_2 C_B
4 01/09/2019 02/08/2019 7
4 01/09/2019 03/08/2019 1189
1 02/09/2019 03/08/2019 3
2 02/09/2019 03/08/2019 4
1 01/09/2019 04/08/2019 8
4 01/09/2019 04/08/2019 1650
1 02/09/2019 04/08/2019 138
2 02/09/2019 04/08/2019 8
1 03/09/2019 04/08/2019 6
2 03/09/2019 04/08/2019 2
1 01/09/2019 05/08/2019 9
4 01/09/2019 05/08/2019 1805
1 02/09/2019 05/08/2019 259
2 02/09/2019 05/08/2019 1062
1 03/09/2019 05/08/2019 4
2 03/09/2019 05/08/2019 161
2 04/09/2019 05/08/2019 7
5 04/09/2019 05/08/2019 12
1 01/09/2019 06/08/2019 63
2 01/09/2019 06/08/2019 2我想要计算一个列c_b1,它是为每个date_1计算的,而I_code是为每个date_1和I_Code计算的,因此我想为下一个条目减去C_b。例如-
I_Code Date_1 Date_2 C_B C_B1
4 01/09/2019 02/08/2019 7 7
4 01/09/2019 03/08/2019 1189 1189-7 (Since date_1 and I_Code are same in entry 1)
1 02/09/2019 03/08/2019 3 3
2 02/09/2019 03/08/2019 4 4
1 01/09/2019 04/08/2019 8 8
4 01/09/2019 04/08/2019 1650 1650-1189 (Since date_1 and I_Code are same in entry 2)
1 02/09/2019 04/08/2019 138 138-3
2 02/09/2019 04/08/2019 8 8-3有人能帮我实现同样的目标吗。
谢谢
发布于 2019-10-24 06:42:00
使用DataFrameGroupBy.shift,将缺失的值替换为0并减去:
df['C_B1'] = df['C_B'] - df.groupby(['I_Code','Date_1'])['C_B'].shift().fillna(0)Series.sub和fill_value=0参数的替代方案:
df['C_B1'] = df['C_B'].sub(df.groupby(['I_Code','Date_1'])['C_B'].shift(), fill_value=0)print (df)
I_Code Date_1 Date_2 C_B C_B1
0 4 01/09/2019 02/08/2019 7 7.0
1 4 01/09/2019 03/08/2019 1189 1182.0
2 1 02/09/2019 03/08/2019 3 3.0
3 2 02/09/2019 03/08/2019 4 4.0
4 1 01/09/2019 04/08/2019 8 8.0
5 4 01/09/2019 04/08/2019 1650 461.0
6 1 02/09/2019 04/08/2019 138 135.0
7 2 02/09/2019 04/08/2019 8 4.0
8 1 03/09/2019 04/08/2019 6 6.0
9 2 03/09/2019 04/08/2019 2 2.0
10 1 01/09/2019 05/08/2019 9 1.0
11 4 01/09/2019 05/08/2019 1805 155.0
12 1 02/09/2019 05/08/2019 259 121.0
13 2 02/09/2019 05/08/2019 1062 1054.0
14 1 03/09/2019 05/08/2019 4 -2.0
15 2 03/09/2019 05/08/2019 161 159.0
16 2 04/09/2019 05/08/2019 7 7.0
17 5 04/09/2019 05/08/2019 12 12.0
18 1 01/09/2019 06/08/2019 63 54.0
19 2 01/09/2019 06/08/2019 2 2.0发布于 2019-10-24 06:43:05
IIUC:
df["C_B_1"] = df.groupby(["Date_1","I_Code"])["C_B"].apply(lambda x: x-x.shift()).fillna(df['C_B'])
print (df)
#
I_Code Date_1 Date_2 C_B C_B_1
0 4 01/09/2019 02/08/2019 7 7.0
1 4 01/09/2019 03/08/2019 1189 1182.0
2 1 02/09/2019 03/08/2019 3 3.0
3 2 02/09/2019 03/08/2019 4 4.0
4 1 01/09/2019 04/08/2019 8 8.0
5 4 01/09/2019 04/08/2019 1650 461.0
6 1 02/09/2019 04/08/2019 138 135.0
7 2 02/09/2019 04/08/2019 8 4.0
8 1 03/09/2019 04/08/2019 6 6.0
9 2 03/09/2019 04/08/2019 2 2.0
10 1 01/09/2019 05/08/2019 9 1.0
11 4 01/09/2019 05/08/2019 1805 155.0
12 1 02/09/2019 05/08/2019 259 121.0
13 2 02/09/2019 05/08/2019 1062 1054.0
14 1 03/09/2019 05/08/2019 4 -2.0
15 2 03/09/2019 05/08/2019 161 159.0
16 2 04/09/2019 05/08/2019 7 7.0
17 5 04/09/2019 05/08/2019 12 12.0
18 1 01/09/2019 06/08/2019 63 54.0
19 2 01/09/2019 06/08/2019 2 2.0https://stackoverflow.com/questions/58535406
复制相似问题