I have a dictionary:
test = {
"A": [
{
"sourceName": "MongoDB",
"Date": "2020-11-10T00:00:00.000Z"
},
{
"sourceName": "Dynamo",
"Date": "2020-11-10T00:00:00.000Z"
},
{
"sourceName": "Dynamo",
"Date": "2020-12-09T00:00:00.000Z"
}
],
"B": [
{
"sourceName": "MongoDB",
"Date": "2020-11-10T00:00:00.000Z"
},
{
"sourceName": "SQL",
"Date": "2020-11-10T00:00:00.000Z"
},
{
"sourceName": "Dynamo",
"Date": "2020-11-10T00:00:00.000Z"
}
]
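};
I am trying to remove dictionaries from each list where the key 'sourceName' occurs more than once, dropping the entries whose 'Date' value is more recent.
For example, in the scenario above, key "A" has multiple Dynamo entries, so out of the dictionary elements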
{
"sourceName": "Dynamo",
"Date": "2020-11-10T00:00:00.000Z"
},
{
"sourceName": "Dynamo",
"Date": "2020-12-09T00:00:00.000Z"
}
the final result dictionary should contain only
{
"sourceName": "Dynamo",
"Date": "2020-11-10T00:00:00.000Z"
}
that is, one of the two. The final result should look like this:
{
"A": [
{
"sourceName": "MongoDB",
"Date": "2020-11-10T00:00:00.000Z"
},
{
"sourceName": "Dynamo",
"Date": "2020-11-10T00:00:00.000Z"
}
],
"B": [
{
"sourceName": "MongoDB",
"Date": "2020-11-10T00:00:00.000Z"
},
{
"sourceName": "SQL",
"Date": "2020-11-10T00:00:00.000Z"
},
{
"sourceName": "Dynamo",
"Date": "2020-11-10T00:00:00.000Z"
}
]
};
Posted on 2020-11-11 19:45:45
You can use pandas to sort and drop the duplicates, then convert the result back to a dictionary.
Example:
import pandas as pd
test_new = {}
for t in test:
    # Sort by date, keep the earliest row per sourceName, convert back to a list of dicts.
    test_new[t] = pd.DataFrame(test[t]).sort_values('Date').drop_duplicates('sourceName', keep="first").to_dict('records')
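print(test_new)
Explanation: for each category, convert the list of dictionaries to a DataFrame, sort it by date, then drop rows with a duplicate sourceName, keeping the one with the earliest date. Finally, convert the DataFrame back to a list of dictionaries.
Output: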
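If pulling in pandas for this feels heavy, the same rule can be sketched in plain Python. This is only an illustration, not part of the original answer: the helper name `keep_earliest` and the small `sample` dictionary are made up here, and the sketch assumes every 'Date' is an ISO 8601 string in the same format and timezone, so that comparing the strings directly matches chronological order.

```python
def keep_earliest(records):
    """Return one record per sourceName, keeping the one with the earliest Date."""
    earliest = {}  # sourceName -> record; dicts keep insertion order (Python 3.7+)
    for rec in records:
        name = rec["sourceName"]
        # Assumption: same-format ISO 8601 strings compare chronologically as text.
        if name not in earliest or rec["Date"] < earliest[name]["Date"]:
            earliest[name] = rec
    return list(earliest.values())


# Hypothetical sample data mirroring key "A" from the question.
sample = {
    "A": [
        {"sourceName": "MongoDB", "Date": "2020-11-10T00:00:00.000Z"},
        {"sourceName": "Dynamo", "Date": "2020-11-10T00:00:00.000Z"},
        {"sourceName": "Dynamo", "Date": "2020-12-09T00:00:00.000Z"},
    ],
}

result = {key: keep_earliest(records) for key, records in sample.items()}
print(result)
```

Each sourceName stays at the position where it first appeared in the list. For the data in the question, this sketch should give the same result as the pandas snippet.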
{'A': [{'sourceName': 'MongoDB', 'Date': '2020-11-10T00:00:00.000Z'},
{'sourceName': 'Dynamo', 'Date': '2020-11-10T00:00:00.000Z'}],
'B': [{'sourceName': 'MongoDB', 'Date': '2020-11-10T00:00:00.000Z'},
{'sourceName': 'SQL', 'Date': '2020-11-10T00:00:00.000Z'},
{'sourceName': 'Dynamo', 'Date': '2020-11-10T00:00:00.000Z'}]}
https://stackoverflow.com/questions/64792602