I'd like to understand this behavior.
I have a DataFrame, holdings, with a variety of columns, for example
[u'date', u'portfolio', u'sector', u'industry', u'instrument', u'name', u'position', u'price', u'pct_chg', u'mv']
where mv is the market value.
When I do
holdings['wt'] = holdings.groupby(['holdings.portfolio','holdings.date']).apply(lambda x: x['mv']/sum(x['mv']) )
I get this error:
/usr/local/lib/python2.7/dist-packages/pandas/core/frame.pyc in reindexer(value)
2234
2235 # other
-> 2236 raise TypeError('incompatible index of inserted column '
2237 'with frame index')
2238 return value
TypeError: incompatible index of inserted column with frame index
But when I do
holdings['wt'] = holdings['mv'].groupby([holdings['holdings.portfolio'],holdings['holdings.date']]).apply(lambda x: x/sum(x) )
it works fine.
The former looks cleaner to me. Is my code wrong, or is this expected behavior? Thanks.
A CSV dump of the data follows:
/*
* ',holdings.date,holdings.portfolio,static_data.sector,static_data.industry,holdings.instrument,static_data.name,holdings.position,prices.adjclose,pct_chg,mv\n0,2013-01-14 00:00:00,SP500,Health Care,Health Care Equipment & Services,A,Agilent Technologies Inc,333512000.0,30.61,0.0026203734032099746,10208802320.0\n20072,2013-01-14 00:00:00,SP500,Consumer Discretionary,"Apparel, Accessories & Luxury Goods",RL,Polo Ralph Lauren Corp.,87704000.0,163.35,0.002454740718011772,14326448400.0\n3432,2013-01-14 00:00:00,SP500,Information Technology,Semiconductors,BRCM,Broadcom Corporation,592000000.0,33.74,-0.005599764220453829,19974080000.0\n20020,2013-01-14 00:00:00,SP500,Energy,Oil & Gas Drilling,RIG,Transocean,362189000.0,49.65,-0.0028118096003213466,17982683850.0\n19968,2013-01-14 00:00:00,SP500,Information Technology,Systems Software,RHT,Red Hat Inc.,187822000.0,54.99,0.009917355371900749,10328331780.0\n3484,2013-01-14 00:00:00,usequity,Health Care,Health Care Equipment & Services,BSX,Boston Scientific,849000.0,6.32,-0.0062893081761006275,5365680.0\n19916,2013-01-14 00:00:00,usequity,Industrials,Industrial Conglomerates,RHI,Robert Half International,60000.0,32.28,0.011278195488721776,1936800.0\n3536,2013-01-14 00:00:00,SP500,Consumer Discretionary,Auto Parts & Equipment,BWA,BorgWarner,227373000.0,35.57,0.003668171557562161,8087657610.0\n19864,2013-01-14 00:00:00,SP500,Financials,Diversified Financial Services,RF,Regions Financial Corp.,1379000000.0,7.06,-0.007032348804500765,9735740000.0\n19812,2013-01-14 00:00:00,SP500,Health Care,Biotechnology,REGN,Regeneron,100390000.0,179.4,-0.00033433634236046395,18009966000.0\n3588,2013-01-14 00:00:00,SP500,Financials,REITs,BXP,Boston Properties,153099000.0,100.68,0.003388479170819192,15414007320.000002\n19760,2013-01-14 00:00:00,SP500,Consumer Staples,Tobacco,RAI,Reynolds American Inc.,531283000.0,39.13,0.0017921146953405742,20789103790.0\n19708,2013-01-14 00:00:00,SP500,Industrials,Industrial Conglomerates,R,Ryder 
System,53039000.0,51.47,0.0027274498344047604,2729917330.0\n3640,2013-01-14 00:00:00,SP500,Financials,Banks,C,Citigroup Inc.,3029500000.0,42.15,-0.002838892831795725,127693425000.0\n19656,2013-01-14 00:00:00,SP500,Energy,Oil & Gas Exploration & Production,QEP,QEP Resources,180091000.0,29.17,-0.004776526782667934,5253254470.0\n3692,2013-01-14 00:00:00,SP500,Information Technology,Systems Software,CA,"CA, Inc.",444906000.0,22.19,0.009554140127388644,9872464140.0\n19604,2013-01-14 00:00:00,SP500,Information Technology,Semiconductors,QCOM,QUALCOMM Inc.,1676023000.0,62.05,-0.010208964747168592,103997227150.0\n19552,2013-01-14 00:00:00,SP500,Energy,Oil & Gas Exploration & Production,PXD,Pioneer Natural Resources,143098000.0,111.63,-0.0009844281367460406,15974029740.0\n3744,2013-01-14 00:00:00,SP500,Consumer Staples,Packaged Foods & Meats,CAG,ConAgra Foods Inc.,424827000.0,29.21,0.0075888237323216146,12409196670.0\n19500,2013-01-14 00:00:00,SP500,Materials,Industrial Gases,PX,Praxair Inc.,291372000.0,110.15,0.0009086778736937529,32094625800.0\n19448,2013-01-14 00:00:00,SP500,Industrials,Industrial Conglomerates,PWR,Quanta Services Inc.,216795000.0,28.66,-0.012405237767057153,6213344700.0\n3796,2013-01-14 00:00:00,SP500,Health Care,Health Care Distributors & Services,CAH,Cardinal Health Inc.,336000000.0,41.62,0.003133285128946728,13984320000.0\n19396,2013-01-14 00:00:00,SP500,Consumer Discretionary,"Apparel, Accessories & Luxury Goods",PVH,PVH Corp.,82393000.0,117.49,0.002303361201160259,9680353570.0\n3848,2013-01-14 00:00:00,SP500,Energy,Oil & Gas Equipment & Services,CAM,Cameron International Corp.,198303000.0,57.44,-0.0019113814074717128,11390524320.0\n19344,2013-01-14 00:00:00,SP500,Energy,Oil & Gas Refining & Marketing & Transportation,PSX,Phillips 66,553513000.0,49.42,0.015409903431271799,27354612460.0\n19292,2013-01-14 00:00:00,SP500,Financials,REITs,PSA,Public Storage,172418000.0,139.16,-0.005005005005005114,23993688880.0\n20124,2013-01-14 
00:00:00,SP500,Industrials,Industrial Conglomerates,ROK,Rockwell Automation Inc.,137872000.0,82.65,-0.0018115942028984477,11395120800.0\n3900,2013-01-14 00:00:00,SP500,Industrials,Construction & Farm Machinery & Heavy Trucks,CAT,Caterpillar Inc.,611500000.0,90.32,-0.005943209333039934,55230679999.99999\n3380,2013-01-14 00:00:00,SP500,Health Care,Health Care Distributors & Services,BMY,Bristol-Myers Squibb,1658776000.0,32.49,0.00277777777777799,53893632240.0\n3328,2013-01-14 00:00:00,SP500,Materials,Paper Packaging,BMS,Bemis Company,99880000.0,33.34,0.008469449485783542,3329999200.0000005\n21008,2013-01-14 00:00:00,SP500,Consumer Discretionary,Broadcasting & Cable TV,SNI,Scripps Networks Interactive Inc.,140122000.0,58.2,-0.011381009002887632,8155100400.0\n2860,2013-01-14 00:00:00,SP500,Consumer Discretionary,Computer & Electronics Retail,BBY,Best Buy Co. Inc.,349615000.0,13.9,0.019061583577712593,4859648500.0\n20956,2013-01-14 00:00:00,SP500,Information Technology,Computer Storage & Peripherals,SNDK,SanDisk Corporation,222201000.0,46.04,0.008985316677624366,10230134040.0\n20904,2013-01-14 00:00:00,SP500,Consumer Discretionary,Household Appliances,SNA,Snap-On Inc.,58107000.0,77.47,0.0014219234746639664,4501549290.0\n2912,2013-01-14 00:00:00,SP500,Health Care,Health Care Equipment & Services,BCR,Bard (C.R.) 
Inc.,74898000.0,101.28,-0.004423473901503994,7585669440.0\n20852,2013-01-14 00:00:00,SP500,Energy,Oil & Gas Equipment & Services,SLB,Schlumberger Ltd.,1286793000.0,70.8,-0.01324041811846699,91104944400.0\n2964,2013-01-14 00:00:00,SP500,Health Care,Health Care Equipment & Services,BDX,Becton Dickinson,191835000.0,79.49,0.006584779030011312,15248964149.999998\n20800,2013-01-14 00:00:00,SP500,Consumer Staples,Packaged Foods & Meats,SJM,Smucker (J.M.),101817000.0,84.88,0.0009433962264151496,8642226960.0\n20748,2013-01-14 00:00:00,SP500,Materials,Diversified Chemicals,SIAL,Sigma-Aldrich,119085000.0,75.15,0.0009323388385722442,8949237750.0\n3016,2013-01-14 00:00:00,SP500,Financials,Diversified Financial Services,BEN,Franklin Resources,622900000.0,44.54,-0.0006730984967466824,27743966000.0\n20696,2013-01-14 00:00:00,SP500,Materials,Specialty Chemicals,SHW,Sherwin-Williams,95997000.0,158.08,-0.0006321911746112185,15175205760.000002\n20644,2013-01-14 00:00:00,SP500,Materials,Paper Packaging,SEE,Sealed Air Corp.(New),210399000.0,17.77,0.006228765571913986,3738790230.0\n3068,2013-01-14 00:00:00,SP500,Energy,Oil & Gas Equipment & Services,BHI,Baker Hughes Inc,432598000.0,41.06,-0.023078753271472685,17762473880.0\n20592,2013-01-14 00:00:00,SP500,Energy,Oil & Gas Refining & Marketing & Transportation,SE,Spectra Energy Corp.,670893000.0,25.99,0.0034749034749035346,17436509070.0\n3120,2013-01-14 00:00:00,SP500,Health Care,Biotechnology,BIIB,BIOGEN IDEC Inc.,236155000.0,143.88,0.0006259127894847616,33977981400.0\n20540,2013-01-14 00:00:00,SP500,Financials,Diversified Financial Services,SCHW,Charles Schwab Corporation,1303355000.0,14.95,-0.0073041168658699585,19485157250.0\n20488,2013-01-14 00:00:00,SP500,Utilities,Multi-Utilities & Unregulated Power,SCG,SCANA Corp,142052000.0,43.04,-0.003934274473501587,6113918080.0\n3172,2013-01-14 00:00:00,SP500,Financials,Banks,BK,The Bank of New York Mellon Corp.,1125709000.0,25.71,-0.002328288707799664,28941978390.0\n20436,2013-01-14 
00:00:00,SP500,Consumer Discretionary,Restaurants,SBUX,Starbucks Corp.,749500000.0,53.37,-0.006330292310556707,40000815000.0\n3224,2013-01-14 00:00:00,SP500,Financials,Diversified Financial Services,BLK,BlackRock,167610000.0,212.71,0.005340769448908267,35652323100.0\n'
*/
Posted on 2015-01-29 09:57:12
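For anyone who wants to experiment with the dump, here is a minimal sketch of one way to load it, assuming the string is pasted into a variable (called csv_text here, a name I made up). Only two rows from the dump are reproduced to keep it short. Note that the column names in the dump carry prefixes such as holdings.date, which is what the groupby calls in the question refer to.

```python
import io

import pandas as pd

# Two rows copied from the dump above; the full string can be pasted in the same way.
csv_text = (
    ",holdings.date,holdings.portfolio,static_data.sector,static_data.industry,"
    "holdings.instrument,static_data.name,holdings.position,prices.adjclose,pct_chg,mv\n"
    "3484,2013-01-14 00:00:00,usequity,Health Care,Health Care Equipment & Services,"
    "BSX,Boston Scientific,849000.0,6.32,-0.0062893081761006275,5365680.0\n"
    "19916,2013-01-14 00:00:00,usequity,Industrials,Industrial Conglomerates,"
    "RHI,Robert Half International,60000.0,32.28,0.011278195488721776,1936800.0\n"
)

# The first, unnamed column holds the original row labels, so use it as the index.
holdings = pd.read_csv(
    io.StringIO(csv_text), index_col=0, parse_dates=["holdings.date"]
)
print(holdings[["holdings.portfolio", "holdings.date", "mv"]])
```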
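OK, looking at what you tried:
holdings['wt'] = holdings.groupby(['holdings.portfolio','holdings.date']).apply(lambda x: x['mv']/sum(x['mv']) )
This fails because the result of the grouped apply carries the group keys in its index, so when you try to assign it back to the original df the index is no longer compatible.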
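To make the index mismatch concrete, here is a minimal sketch on a hypothetical three-row frame (the values are made up; only the column names come from the question). It uses a plain sum rather than the lambda above, so it does not reproduce the exact traceback. It only shows that a grouped result is labelled by the group keys while the frame is labelled by its own row index.

```python
import pandas as pd

# Hypothetical three-row frame using the question's column names.
df = pd.DataFrame(
    {
        "holdings.portfolio": ["SP500", "SP500", "usequity"],
        "holdings.date": ["2013-01-14", "2013-01-14", "2013-01-14"],
        "mv": [30.0, 10.0, 5.0],
    },
    index=[0, 20072, 3484],
)

# A grouped reduction: one value per (portfolio, date) group.
totals = df.groupby(["holdings.portfolio", "holdings.date"])["mv"].sum()

print(df.index.tolist())      # row labels of the frame
print(totals.index.tolist())  # one (portfolio, date) label per group
```

The two sets of labels have nothing in common, which is why pandas cannot line the grouped values up with the rows of the frame.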
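If you want to assign the result of a groupby operation back to the original df, you should call transform, which returns a result indexed like the original: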
In [174]:
holdings['wt'] = holdings.groupby(['holdings.portfolio','holdings.date'])['mv'].transform(lambda x: x/sum(x))
holdings['wt']
Out[174]:
0 0.009482
20072 0.013306
3432 0.018552
20020 0.016702
19968 0.009593
3484 0.734775
19916 0.265225
3536 0.007512
19864 0.009043
19812 0.016728
3588 0.014317
19760 0.019309
19708 0.002536
3640 0.118602
19656 0.004879
3692 0.009170
19604 0.096593
19552 0.014837
3744 0.011526
19500 0.029810
19448 0.005771
3796 0.012989
19396 0.008991
3848 0.010580
19344 0.025407
19292 0.022285
20124 0.010584
3900 0.051298
3380 0.050057
3328 0.003093
21008 0.007574
2860 0.004514
20956 0.009502
20904 0.004181
2912 0.007046
20852 0.084619
2964 0.014163
20800 0.008027
20748 0.008312
3016 0.025769
20696 0.014095
20644 0.003473
3068 0.016498
20592 0.016195
3120 0.031559
20540 0.018098
20488 0.005679
3172 0.026881
20436 0.037153
3224 0.033114
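Name: wt, dtype: float64
The other thing you did is actually a little odd:
holdings['wt'] = holdings['mv'].groupby([holdings['holdings.portfolio'],holdings['holdings.date']]).apply(lambda x: x/sum(x) )
Instead of passing column names, you passed a list of two Series and called groupby on the column 'mv', which is itself a Series. The values are still grouped by those two key Series, but because the lambda returns a result with the same index as each group it receives, the output comes back as a Series whose index is compatible with your original df.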
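Here is a small sketch of both ways of spelling the keys, on a hypothetical four-row frame with made-up values. I use transform in both cases rather than apply, since what apply returns for this pattern may differ between pandas versions, whereas transform is documented to return a result indexed like its input.

```python
import pandas as pd

# Hypothetical frame: two portfolios on one date, with made-up market values.
df = pd.DataFrame(
    {
        "holdings.portfolio": ["SP500", "SP500", "usequity", "usequity"],
        "holdings.date": ["2013-01-14"] * 4,
        "mv": [30.0, 10.0, 6.0, 2.0],
    },
    index=[0, 20072, 3484, 19916],
)

keys = ["holdings.portfolio", "holdings.date"]

# Column names as keys, selecting 'mv' on the grouped frame.
by_name = df.groupby(keys)["mv"].transform(lambda x: x / x.sum())

# Series as keys, grouping the 'mv' Series directly.
by_series = df["mv"].groupby([df[k] for k in keys]).transform(lambda x: x / x.sum())

# Both results share the frame's index, so either can be assigned as a column.
df["wt"] = by_name
print(df[["holdings.portfolio", "mv", "wt"]])
```

In both spellings each weight is the row's mv divided by the total mv of its (portfolio, date) group, so the weights within a group sum to 1.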
We can test whether my transform method gives the same result as your last method:
In [178]:
holdings['wt'].equals(holdings['mv'].groupby([holdings['holdings.portfolio'],holdings['holdings.date']]).apply(lambda x: x/sum(x) ))
Out[178]:
True
https://stackoverflow.com/questions/28210821