Q: Mapping a substring of column values

Stack Overflow user

Asked 2020-06-26 09:09:42

Answers: 2, Views: 47, Followers: 0, Votes: 0

I want to add a column "Region", where the first two digits of the zip code determine the region:

Zip_code: 11 = Region:A
Zip_code: 22 = Region:B
Zip_code: 44 = Region:C

Main table

Zip_code      product_id
110034        55454
114242        45445
113564        46454
223434        53533
224535        56455
223435        63535
444345        62435
443535        24353

Output table

Zip_code      product_id   Region
110034        55454        A
114242        45445        A 
113564        46454        A
223434        53533        B
224535        56455        B
223435        63535        B
444345        62435        C
443535        24353        C

python

pandas

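2 Answers

Stack Overflow user

Accepted answer

Posted 2020-06-26 09:13:52

You can slice the Zip_code values (after converting them to strings) and map the two-character prefix with a dictionary: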

df['Region'] = df.Zip_code.astype(str).str[:2].map({'11':'A', '22':'B', '44':'C'})

print(df)
   Zip_code  product_id Region
0    110034       55454      A
1    114242       45445      A
2    113564       46454      A
3    223434       53533      B
4    224535       56455      B
5    223435       63535      B
6    444345       62435      C
7    443535       24353      C
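For reference, here is a minimal self-contained sketch of the same slice-and-map idea. Note that `Series.map` with a dictionary yields NaN for any key it does not find, so the sketch adds a `fillna` step. The extra row with prefix 99, the `"Unknown"` label, and the names `region_by_prefix` and `df_extra` are assumptions made for illustration; they are not part of the original question.

```python
import pandas as pd

# Sample data copied from the question's main table.
df = pd.DataFrame({
    "Zip_code": [110034, 114242, 113564, 223434, 224535, 223435, 444345, 443535],
    "product_id": [55454, 45445, 46454, 53533, 56455, 63535, 62435, 24353],
})

# Prefix-to-region mapping from the question (string keys, since we slice strings).
region_by_prefix = {"11": "A", "22": "B", "44": "C"}

# Hypothetical extra row whose prefix (99) is not in the mapping.
extra = pd.DataFrame({"Zip_code": [990001], "product_id": [11111]})
df_extra = pd.concat([df, extra], ignore_index=True)

# Take the first two characters of each zip code, then look them up.
prefix = df_extra["Zip_code"].astype(str).str[:2]

# map() leaves NaN for unmapped prefixes; replace those with a placeholder label.
df_extra["Region"] = prefix.map(region_by_prefix).fillna("Unknown")

print(df_extra)
```

Whether to keep NaN or substitute a label such as `"Unknown"` depends on how the column is used downstream; keeping NaN makes unmapped rows easy to find with `isna()`.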

Votes: 3

Stack Overflow user

Posted 2020-06-26 09:14:01

You can do it like this:

import pandas as pd


# Map the first two digits of the zip code to a region
regions_map = {11: "A", 22: "B", 44: "C"}

df["Region"] = df["Zip_code"].apply(lambda x: regions_map[int(str(x)[:2])])

print(df)
#   Zip_code  product_id Region
#0    110034       55454      A
#1    114242       45445      A
#2    113564       46454      A
#3    223434       53533      B
#4    224535       56455      B
#5    223435       63535      B
#6    444345       62435      C
#7    443535       24353      C
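One caveat with the `apply` version above: indexing the dictionary directly raises a `KeyError` as soon as a zip code has a prefix that is not in `regions_map`. A possible variant, sketched below, uses `dict.get` with a fallback value instead. The helper name `region_for`, the `"Unknown"` default, and the sample row with prefix 99 are assumptions added for illustration.

```python
import pandas as pd

# A few rows from the question, plus one hypothetical row (990001)
# whose prefix is not covered by the mapping.
df = pd.DataFrame({
    "Zip_code": [110034, 223434, 444345, 990001],
    "product_id": [55454, 53533, 62435, 11111],
})

# Map the first two digits of the zip code to a region.
regions_map = {11: "A", 22: "B", 44: "C"}


def region_for(zip_code, default="Unknown"):
    """Return the region for a zip code, or `default` if its prefix is unmapped."""
    prefix = int(str(zip_code)[:2])
    return regions_map.get(prefix, default)


df["Region"] = df["Zip_code"].apply(region_for)

print(df)
```

For large frames, the vectorized `str[:2].map(...)` approach from the accepted answer is generally preferable, since `apply` calls the Python function once per row.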

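Votes: 1

The original content of this page is provided by Stack Overflow. Translation support is provided by Tencent Cloud's Xiaowei IT-domain engine.

Original link: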

https://stackoverflow.com/questions/62591549
