Pandas: drop rows for IDs that are not present in every year

Stack Overflow user

Asked 2020-11-15 01:57:44

Answers: 1, Views: 109, Followers: 0, Votes: 0

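My dataset contains bank information from 2009 to 2019, but some banks were merged, acquired, or closed along the way, so I want to drop any bank that does not exist for the whole 2009 to 2019 period. For example, IDs 32 and 56 do not exist in 2019, so they should be removed. Here is what my data looks like: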

ID  Assets  Year
32    10    2009
45    5     2009
56    24    2009
78    9     2009
32    11    2010
45    6     2010
56    31    2010
78    14    2010
...   ...   ...
32    11    2018
45    13    2018
78    14    2018
45    13    2019
78    3     2019

Only IDs 45 and 78 exist in every year from 2009 to 2019, so all the others should be removed. This is what the result should look like:

ID  Assets  Year
45    5     2009
78    9     2009
45    6     2010
78    14    2010
...   ...   ...
45    13    2018
78    14    2018
45    13    2019
78    3     2019

python

pandas

1 Answer

Stack Overflow user

Accepted answer

Answered 2020-11-15 03:26:21

Assuming you have a list of the IDs of the closed banks:

closed = [56, 32]

df[~df['ID'].isin(closed)]

Conversely, assuming you have a list of the banks that still exist:

opened = [45, 78]

df[df['ID'].isin(opened)]
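
As a minimal sketch of how the two filters above behave, the snippet below builds a small frame from a subset of the question's sample rows and applies both masks. The names `kept_by_exclusion` and `kept_by_inclusion` are illustrative assumptions, not part of the original answer.

```python
import pandas as pd

# Illustrative subset of the question's sample data.
df = pd.DataFrame({
    "ID": [32, 45, 56, 78, 45, 78],
    "Assets": [10, 5, 24, 9, 13, 3],
    "Year": [2009, 2009, 2009, 2009, 2019, 2019],
})

closed = [56, 32]
opened = [45, 78]

# isin() returns a boolean Series; ~ negates it element-wise.
kept_by_exclusion = df[~df["ID"].isin(closed)]
kept_by_inclusion = df[df["ID"].isin(opened)]

print(kept_by_exclusion)
```

Both masks select the same rows here because every ID in the sample is in exactly one of the two lists; with real data, an ID missing from both lists would be kept by the first filter and dropped by the second.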

Edit based on the clarification of the question

Data:

import pandas as pd

df = pd.DataFrame({'Assets': {0: 10, 1: 5, 2: 24, 3: 9, 4: 11, 5: 6, 6: 31, 7: 14, 8: 11, 9: 13, 10: 14, 11: 13, 12: 3},
 'ID': {0: 32, 1: 45, 2: 56, 3: 78, 4: 32, 5: 45, 6: 56, 7: 78, 8: 32, 9: 45, 10: 78, 11: 45, 12: 78},
 'Year': {0: 2009, 1: 2009, 2: 2009, 3: 2009, 4: 2010, 5: 2010, 6: 2010, 7: 2010, 8: 2018, 9: 2018, 10: 2018, 11: 2019, 12: 2019}})

If you need to get the list of IDs that are present in every year:

## number of unique years each id needs to have:
year_count = df['Year'].nunique()

## get number of unique years that each id has:
id_year_count = df.groupby('ID')['Year'].nunique().reset_index()

## filter to get the list of ids that match the condition:
opened_every_year = id_year_count['ID'][id_year_count['Year']==year_count].tolist()

## pass the list to the df to filter as described before
df = df[df['ID'].isin(opened_every_year)]
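
The steps above can also be collapsed into a single boolean mask. The following is a sketch of one possible alternative, not the answer's own method: it uses `groupby(...).transform("nunique")` to attach each ID's distinct-year count to every row. The names `years_per_id` and `complete` are assumptions introduced for illustration.

```python
import pandas as pd

# Same sample data as in the answer.
df = pd.DataFrame({
    "ID":     [32, 45, 56, 78, 32, 45, 56, 78, 32, 45, 78, 45, 78],
    "Assets": [10, 5, 24, 9, 11, 6, 31, 14, 11, 13, 14, 13, 3],
    "Year":   [2009, 2009, 2009, 2009, 2010, 2010, 2010, 2010,
               2018, 2018, 2018, 2019, 2019],
})

# Number of distinct years in the whole frame.
year_count = df["Year"].nunique()

# For every row, the number of distinct years in which its ID appears.
years_per_id = df.groupby("ID")["Year"].transform("nunique")

# Keep only the rows whose ID appears in every year.
complete = df[years_per_id == year_count]
print(complete.sort_values(["ID", "Year"]))
```

Because `transform` returns a Series aligned with the original index, no intermediate list of IDs is needed.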

Check:

df.sort_values(['ID', 'Year'])

Output:

    Assets  ID  Year
1        5  45  2009
5        6  45  2010
9       13  45  2018
11      13  45  2019
3        9  78  2009
7       14  78  2010
10      14  78  2018
12       3  78  2019

(Note that the sample data only covers 4 years, so this selects the IDs that are present in all 4 of those years.)
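
The note above also points to a possible caveat: the required count is derived from whichever years happen to appear in the data, so a year in which no bank reported at all would not be required. One way to guard against that, sketched here under the assumption that the required span is known up front, is to test each ID's years against an explicit set. `required_years`, `years_by_id`, and `complete_ids` are hypothetical names, and the four-year set mirrors the sample rather than the full 2009 to 2019 range.

```python
import pandas as pd

# Same sample data as in the answer.
df = pd.DataFrame({
    "ID":     [32, 45, 56, 78, 32, 45, 56, 78, 32, 45, 78, 45, 78],
    "Assets": [10, 5, 24, 9, 11, 6, 31, 14, 11, 13, 14, 13, 3],
    "Year":   [2009, 2009, 2009, 2009, 2010, 2010, 2010, 2010,
               2018, 2018, 2018, 2019, 2019],
})

# Assumption: the years every bank must appear in. For the real data this
# might be set(range(2009, 2020)); the sample only contains these four.
required_years = {2009, 2010, 2018, 2019}

# Collect the set of years observed for each ID.
years_by_id = {bank_id: set(years) for bank_id, years in df.groupby("ID")["Year"]}

# An ID qualifies when the required years are a subset of its observed years.
complete_ids = [bank_id for bank_id, years in years_by_id.items()
                if required_years <= years]

filtered = df[df["ID"].isin(complete_ids)]
print(complete_ids)
```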

Votes: 0

Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/64840519
