我需要你的帮助。我已经盯着这个看了一整天了。我正在尝试找出这个月末和上个月末最后4个数字列中的每一个的差异,如果它们在级别1 (bucket)和级别2(薪水)上匹配,但级别0是我需要评估的日期。
Price Cost Value Gross
Date Bucket Salary
2021-10-31 30DPD 2 Missed 3.33 3.07 92,508.65 153,493.06
3 Missed 1.91 2.69 25,961.60 71,329.51
4-5 Missed 0.68 0.84 13,819.31 28,759.36
6-12 Missed 0.29 1.00 405.33 6,909.69
2021-11-30 30DPD 1 Missed 5.42 3.86 36,345.48 97,378.17
2 Missed 55.75 46.11 473,204.30 1,150,866.42
3 Missed 38.50 33.43 452,324.33 998,917.02
4-5 Missed 46.67 38.26 529,012.51 1,327,705.85
6-12 Missed 59.56 59.29 705,956.78 1,840,849.20不确定如何使用multiIndexes的全部功能来访问数据和添加逻辑过滤器。
我尝试了下面的方法,但没有效果。
summary['Price_diff'] = np.nan
for idx in summary.index.levels[0]:
summary.Price_diff[idx] = summary.Price[idx].diff()我怎样才能做到这一点?
发布于 2021-12-02 11:43:37
你的问题还不清楚,但你正在寻找这样的东西吗?
>>> df.groupby(['Bucket', 'Salary']).diff()
Price Cost Value Gross
Date Bucket Salary
2021-10-31 30DPD 2 Missed NaN NaN NaN NaN
3 Missed NaN NaN NaN NaN
4-5 Missed NaN NaN NaN NaN
6-12 Missed NaN NaN NaN NaN
2021-11-30 30DPD 1 Missed NaN NaN NaN NaN
2 Missed 52.42 43.04 380695.65 997373.36
3 Missed 36.59 30.74 426362.73 927587.51
4-5 Missed 45.99 37.42 515193.20 1298946.49
6-12 Missed 59.27 58.29 705551.45 1833939.51将MRE设置为可重现:
import pandas as pd
from pandas import Timestamp
data = {'Date': [Timestamp('2021-10-31 00:00:00'), Timestamp('2021-10-31 00:00:00'), Timestamp('2021-10-31 00:00:00'), Timestamp('2021-10-31 00:00:00'), Timestamp('2021-11-30 00:00:00'), Timestamp('2021-11-30 00:00:00'), Timestamp('2021-11-30 00:00:00'), Timestamp('2021-11-30 00:00:00'), Timestamp('2021-11-30 00:00:00')],
'Bucket': ['30DPD', '30DPD', '30DPD', '30DPD', '30DPD', '30DPD', '30DPD', '30DPD', '30DPD'],
'Salary': ['2 Missed', '3 Missed', '4-5 Missed', '6-12 Missed', '1 Missed', '2 Missed', '3 Missed', '4-5 Missed', '6-12 Missed'],
'Price': [3.33, 1.91, 0.68, 0.29, 5.42, 55.75, 38.5, 46.67, 59.56],
'Cost': [3.07, 2.69, 0.84, 1.0, 3.86, 46.11, 33.43, 38.26, 59.29],
'Value': [92508.65, 25961.6, 13819.31, 405.33, 36345.48, 473204.3, 452324.33, 529012.51, 705956.78],
'Gross': [153493.06, 71329.51, 28759.36, 6909.69, 97378.17, 1150866.42, 998917.02, 1327705.85, 1840849.2]}
df = pd.DataFrame(data).set_index(['Date', 'Bucket', 'Salary'])https://stackoverflow.com/questions/70197883
复制相似问题