pandas groupby works on a Series but not when selecting from the full DataFrame

Stack Overflow user
Asked 2015-01-29 09:13:35
1 answer, 755 views, 0 followers, 1 vote

I'd like to understand this behavior.

I have a DataFrame, holdings, with various columns, for example

[u'date', u'portfolio', u'sector', u'industry', u'instrument', u'name', u'position', u'price', u'pct_chg', u'mv']

where mv is the market value.

When I do this

holdings['wt'] = holdings.groupby(['holdings.portfolio','holdings.date']).apply(lambda x: x['mv']/sum(x['mv']) )

I get this error

/usr/local/lib/python2.7/dist-packages/pandas/core/frame.pyc in reindexer(value)
   2234 
   2235                     # other
-> 2236                     raise TypeError('incompatible index of inserted column '
   2237                                     'with frame index')
   2238             return value

TypeError: incompatible index of inserted column with frame index

But when I do

holdings['wt'] = holdings['mv'].groupby([holdings['holdings.portfolio'],holdings['holdings.date']]).apply(lambda x: x/sum(x) )

it works fine.

To me the former looks neater. Is my code wrong, or is this expected? Thanks

A CSV dump of the data follows:

/*
* ',holdings.date,holdings.portfolio,static_data.sector,static_data.industry,holdings.instrument,static_data.name,holdings.position,prices.adjclose,pct_chg,mv\n0,2013-01-14 00:00:00,SP500,Health Care,Health Care Equipment & Services,A,Agilent Technologies Inc,333512000.0,30.61,0.0026203734032099746,10208802320.0\n20072,2013-01-14 00:00:00,SP500,Consumer Discretionary,"Apparel, Accessories & Luxury Goods",RL,Polo Ralph Lauren Corp.,87704000.0,163.35,0.002454740718011772,14326448400.0\n3432,2013-01-14 00:00:00,SP500,Information Technology,Semiconductors,BRCM,Broadcom Corporation,592000000.0,33.74,-0.005599764220453829,19974080000.0\n20020,2013-01-14 00:00:00,SP500,Energy,Oil & Gas Drilling,RIG,Transocean,362189000.0,49.65,-0.0028118096003213466,17982683850.0\n19968,2013-01-14 00:00:00,SP500,Information Technology,Systems Software,RHT,Red Hat Inc.,187822000.0,54.99,0.009917355371900749,10328331780.0\n3484,2013-01-14 00:00:00,usequity,Health Care,Health Care Equipment & Services,BSX,Boston Scientific,849000.0,6.32,-0.0062893081761006275,5365680.0\n19916,2013-01-14 00:00:00,usequity,Industrials,Industrial Conglomerates,RHI,Robert Half International,60000.0,32.28,0.011278195488721776,1936800.0\n3536,2013-01-14 00:00:00,SP500,Consumer Discretionary,Auto Parts & Equipment,BWA,BorgWarner,227373000.0,35.57,0.003668171557562161,8087657610.0\n19864,2013-01-14 00:00:00,SP500,Financials,Diversified Financial Services,RF,Regions Financial Corp.,1379000000.0,7.06,-0.007032348804500765,9735740000.0\n19812,2013-01-14 00:00:00,SP500,Health Care,Biotechnology,REGN,Regeneron,100390000.0,179.4,-0.00033433634236046395,18009966000.0\n3588,2013-01-14 00:00:00,SP500,Financials,REITs,BXP,Boston Properties,153099000.0,100.68,0.003388479170819192,15414007320.000002\n19760,2013-01-14 00:00:00,SP500,Consumer Staples,Tobacco,RAI,Reynolds American Inc.,531283000.0,39.13,0.0017921146953405742,20789103790.0\n19708,2013-01-14 00:00:00,SP500,Industrials,Industrial Conglomerates,R,Ryder 
System,53039000.0,51.47,0.0027274498344047604,2729917330.0\n3640,2013-01-14 00:00:00,SP500,Financials,Banks,C,Citigroup Inc.,3029500000.0,42.15,-0.002838892831795725,127693425000.0\n19656,2013-01-14 00:00:00,SP500,Energy,Oil & Gas Exploration & Production,QEP,QEP Resources,180091000.0,29.17,-0.004776526782667934,5253254470.0\n3692,2013-01-14 00:00:00,SP500,Information Technology,Systems Software,CA,"CA, Inc.",444906000.0,22.19,0.009554140127388644,9872464140.0\n19604,2013-01-14 00:00:00,SP500,Information Technology,Semiconductors,QCOM,QUALCOMM Inc.,1676023000.0,62.05,-0.010208964747168592,103997227150.0\n19552,2013-01-14 00:00:00,SP500,Energy,Oil & Gas Exploration & Production,PXD,Pioneer Natural Resources,143098000.0,111.63,-0.0009844281367460406,15974029740.0\n3744,2013-01-14 00:00:00,SP500,Consumer Staples,Packaged Foods & Meats,CAG,ConAgra Foods Inc.,424827000.0,29.21,0.0075888237323216146,12409196670.0\n19500,2013-01-14 00:00:00,SP500,Materials,Industrial Gases,PX,Praxair Inc.,291372000.0,110.15,0.0009086778736937529,32094625800.0\n19448,2013-01-14 00:00:00,SP500,Industrials,Industrial Conglomerates,PWR,Quanta Services Inc.,216795000.0,28.66,-0.012405237767057153,6213344700.0\n3796,2013-01-14 00:00:00,SP500,Health Care,Health Care Distributors & Services,CAH,Cardinal Health Inc.,336000000.0,41.62,0.003133285128946728,13984320000.0\n19396,2013-01-14 00:00:00,SP500,Consumer Discretionary,"Apparel, Accessories & Luxury Goods",PVH,PVH Corp.,82393000.0,117.49,0.002303361201160259,9680353570.0\n3848,2013-01-14 00:00:00,SP500,Energy,Oil & Gas Equipment & Services,CAM,Cameron International Corp.,198303000.0,57.44,-0.0019113814074717128,11390524320.0\n19344,2013-01-14 00:00:00,SP500,Energy,Oil & Gas Refining & Marketing & Transportation,PSX,Phillips 66,553513000.0,49.42,0.015409903431271799,27354612460.0\n19292,2013-01-14 00:00:00,SP500,Financials,REITs,PSA,Public Storage,172418000.0,139.16,-0.005005005005005114,23993688880.0\n20124,2013-01-14 
00:00:00,SP500,Industrials,Industrial Conglomerates,ROK,Rockwell Automation Inc.,137872000.0,82.65,-0.0018115942028984477,11395120800.0\n3900,2013-01-14 00:00:00,SP500,Industrials,Construction & Farm Machinery & Heavy Trucks,CAT,Caterpillar Inc.,611500000.0,90.32,-0.005943209333039934,55230679999.99999\n3380,2013-01-14 00:00:00,SP500,Health Care,Health Care Distributors & Services,BMY,Bristol-Myers Squibb,1658776000.0,32.49,0.00277777777777799,53893632240.0\n3328,2013-01-14 00:00:00,SP500,Materials,Paper Packaging,BMS,Bemis Company,99880000.0,33.34,0.008469449485783542,3329999200.0000005\n21008,2013-01-14 00:00:00,SP500,Consumer Discretionary,Broadcasting & Cable TV,SNI,Scripps Networks Interactive Inc.,140122000.0,58.2,-0.011381009002887632,8155100400.0\n2860,2013-01-14 00:00:00,SP500,Consumer Discretionary,Computer & Electronics Retail,BBY,Best Buy Co. Inc.,349615000.0,13.9,0.019061583577712593,4859648500.0\n20956,2013-01-14 00:00:00,SP500,Information Technology,Computer Storage & Peripherals,SNDK,SanDisk Corporation,222201000.0,46.04,0.008985316677624366,10230134040.0\n20904,2013-01-14 00:00:00,SP500,Consumer Discretionary,Household Appliances,SNA,Snap-On Inc.,58107000.0,77.47,0.0014219234746639664,4501549290.0\n2912,2013-01-14 00:00:00,SP500,Health Care,Health Care Equipment & Services,BCR,Bard (C.R.) 
Inc.,74898000.0,101.28,-0.004423473901503994,7585669440.0\n20852,2013-01-14 00:00:00,SP500,Energy,Oil & Gas Equipment & Services,SLB,Schlumberger Ltd.,1286793000.0,70.8,-0.01324041811846699,91104944400.0\n2964,2013-01-14 00:00:00,SP500,Health Care,Health Care Equipment & Services,BDX,Becton Dickinson,191835000.0,79.49,0.006584779030011312,15248964149.999998\n20800,2013-01-14 00:00:00,SP500,Consumer Staples,Packaged Foods & Meats,SJM,Smucker (J.M.),101817000.0,84.88,0.0009433962264151496,8642226960.0\n20748,2013-01-14 00:00:00,SP500,Materials,Diversified Chemicals,SIAL,Sigma-Aldrich,119085000.0,75.15,0.0009323388385722442,8949237750.0\n3016,2013-01-14 00:00:00,SP500,Financials,Diversified Financial Services,BEN,Franklin Resources,622900000.0,44.54,-0.0006730984967466824,27743966000.0\n20696,2013-01-14 00:00:00,SP500,Materials,Specialty Chemicals,SHW,Sherwin-Williams,95997000.0,158.08,-0.0006321911746112185,15175205760.000002\n20644,2013-01-14 00:00:00,SP500,Materials,Paper Packaging,SEE,Sealed Air Corp.(New),210399000.0,17.77,0.006228765571913986,3738790230.0\n3068,2013-01-14 00:00:00,SP500,Energy,Oil & Gas Equipment & Services,BHI,Baker Hughes Inc,432598000.0,41.06,-0.023078753271472685,17762473880.0\n20592,2013-01-14 00:00:00,SP500,Energy,Oil & Gas Refining & Marketing & Transportation,SE,Spectra Energy Corp.,670893000.0,25.99,0.0034749034749035346,17436509070.0\n3120,2013-01-14 00:00:00,SP500,Health Care,Biotechnology,BIIB,BIOGEN IDEC Inc.,236155000.0,143.88,0.0006259127894847616,33977981400.0\n20540,2013-01-14 00:00:00,SP500,Financials,Diversified Financial Services,SCHW,Charles Schwab Corporation,1303355000.0,14.95,-0.0073041168658699585,19485157250.0\n20488,2013-01-14 00:00:00,SP500,Utilities,Multi-Utilities & Unregulated Power,SCG,SCANA Corp,142052000.0,43.04,-0.003934274473501587,6113918080.0\n3172,2013-01-14 00:00:00,SP500,Financials,Banks,BK,The Bank of New York Mellon Corp.,1125709000.0,25.71,-0.002328288707799664,28941978390.0\n20436,2013-01-14 
00:00:00,SP500,Consumer Discretionary,Restaurants,SBUX,Starbucks Corp.,749500000.0,53.37,-0.006330292310556707,40000815000.0\n3224,2013-01-14 00:00:00,SP500,Financials,Diversified Financial Services,BLK,BlackRock,167610000.0,212.71,0.005340769448908267,35652323100.0\n'
*/
1 Answer

Stack Overflow user

Accepted answer

Posted 2015-01-29 09:57:12

OK, looking at what you tried:

holdings['wt'] = holdings.groupby(['holdings.portfolio','holdings.date']).apply(lambda x: x['mv']/sum(x['mv']) )

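This fails because the result of the grouped apply no longer has the same index as the original df (it typically comes back indexed by the group keys on top of the original row labels), so when you try to assign it back to the original df the index is no longer compatible.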
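The index mismatch can be inspected directly. The following is a minimal sketch on a made-up four-row frame (the column names portfolio, date and mv, the values, and the row labels are all assumptions for illustration, not the asker's data). It compares the number of index levels on the frame with those on the apply result, then drops the group-key levels so the result lines up with the frame again.

```python
import pandas as pd

# Toy stand-in for holdings; the values and row labels are made up.
holdings = pd.DataFrame(
    {
        "portfolio": ["A", "A", "B", "B"],
        "date": ["2013-01-14"] * 4,
        "mv": [1.0, 3.0, 2.0, 2.0],
    },
    index=[10, 20, 30, 40],
)

# Same shape of call as in the question: apply returns one Series per group.
res = holdings.groupby(["portfolio", "date"]).apply(
    lambda x: x["mv"] / x["mv"].sum()
)

print(holdings.index.nlevels)  # levels in the frame's index
print(res.index.nlevels)       # levels in the apply result's index

# Dropping the two group-key levels leaves the original row labels,
# which pandas can then align with the frame on assignment.
flat = res.reset_index(level=[0, 1], drop=True)
holdings["wt"] = flat
print(holdings)
```

Recent pandas releases may emit a deprecation warning about apply operating on the grouping columns; that does not affect what the sketch shows.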

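If you want to assign the result of a groupby operation back to the original df, you should call transform, which returns a result indexed like the original: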

In [174]:

holdings['wt'] = holdings.groupby(['holdings.portfolio','holdings.date'])['mv'].transform(lambda x: x/sum(x))
holdings['wt']

Out[174]:
0        0.009482
20072    0.013306
3432     0.018552
20020    0.016702
19968    0.009593
3484     0.734775
19916    0.265225
3536     0.007512
19864    0.009043
19812    0.016728
3588     0.014317
19760    0.019309
19708    0.002536
3640     0.118602
19656    0.004879
3692     0.009170
19604    0.096593
19552    0.014837
3744     0.011526
19500    0.029810
19448    0.005771
3796     0.012989
19396    0.008991
3848     0.010580
19344    0.025407
19292    0.022285
20124    0.010584
3900     0.051298
3380     0.050057
3328     0.003093
21008    0.007574
2860     0.004514
20956    0.009502
20904    0.004181
2912     0.007046
20852    0.084619
2964     0.014163
20800    0.008027
20748    0.008312
3016     0.025769
20696    0.014095
20644    0.003473
3068     0.016498
20592    0.016195
3120     0.031559
20540    0.018098
20488    0.005679
3172     0.026881
20436    0.037153
3224     0.033114
Name: wt, dtype: float64
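As a variation on the transform call, the group totals can be computed with the built-in "sum" aggregation and the division done on whole columns. This is only a sketch on a made-up frame (column names, values and row labels are assumptions), not a claim about which form is preferable for the real data.

```python
import pandas as pd

# Toy stand-in for holdings; the values and row labels are made up.
holdings = pd.DataFrame(
    {
        "portfolio": ["A", "A", "B", "B"],
        "date": ["2013-01-14"] * 4,
        "mv": [1.0, 3.0, 2.0, 2.0],
    },
    index=[10, 20, 30, 40],
)

# transform("sum") broadcasts each group's total back onto its rows,
# so the result has the same index as the original frame.
totals = holdings.groupby(["portfolio", "date"])["mv"].transform("sum")
holdings["wt"] = holdings["mv"] / totals

print(holdings)

# Within each (portfolio, date) group the weights should add up to 1.
print(holdings.groupby(["portfolio", "date"])["wt"].sum())
```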

The other thing you did is really a bit odd:

holdings['wt'] = holdings['mv'].groupby([holdings['holdings.portfolio'],holdings['holdings.date']]).apply(lambda x: x/sum(x) )

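Rather than passing the column names, you pass a list of two Series and call groupby on the column 'mv' (which is itself a Series). The values of mv are still grouped by those keys, but because the object being grouped is a Series rather than the whole df, the result here comes back as a Series whose index is compatible with the original df.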
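To see how pandas treats a list of key Series passed to Series.groupby, here is a small sketch on a made-up frame (column names, values and row labels are assumptions). It reports how many groups were formed and the per-group totals of mv.

```python
import pandas as pd

# Toy stand-in for holdings; the values and row labels are made up.
holdings = pd.DataFrame(
    {
        "portfolio": ["A", "A", "B", "B"],
        "date": ["2013-01-14"] * 4,
        "mv": [1.0, 3.0, 2.0, 2.0],
    },
    index=[10, 20, 30, 40],
)

# Group the mv Series by two key Series that share its index.
grouped = holdings["mv"].groupby([holdings["portfolio"], holdings["date"]])

print(grouped.ngroups)  # number of distinct (portfolio, date) pairs
print(grouped.sum())    # total mv per pair
```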

We can check that my transform approach gives the same result as your last approach:

In [178]:

holdings['wt'].equals(holdings['mv'].groupby([holdings['holdings.portfolio'],holdings['holdings.date']]).apply(lambda x: x/sum(x) ))
Out[178]:
True
3 votes
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/28210821
