python - Pandas: Get the only value of a series or nan if it does not exist -

i'm doing bit more complex operation on dataframe compare 2 rows can anywhere in frame.

here's example:

import pandas pd import numpy np  d = {'a':['a','a','c','e','e','b','b'],'b':['c','f','a','b','d','a','e']\  ,'aw':[1,2,3,4,5,6,7],'bw':[10,20,30,40,50,60,70]} p = pd.dataframe(d) p = p.sort_values(['a','b']) p['ab'] = p.a+'_'+p.b p['awbw'] = p.aw+p.bw

now doing here have pairings of strings in a , b, example a_c call ab. , have reverse pairing c_a well. sum on numbers aw , bw each pairing, called awbw.

now want subtract summed value of a_c value of c_a , same thing every string pairing both variants exist. other values should nan, result should this:

    aw  b  bw   ab  awbw  rowdelta 0    1  c  10  a_c    11     -22.0 1    2  f  20  a_f    22       nan 5  b   6   60  b_a    66       nan 6  b   7  e  70  b_e    77      33.0 2  c   3   30  c_a    33      22.0 3  e   4  b  40  e_b    44     -33.0 4  e   5  d  50  e_d    55       nan

i have solved way this, there's 1 problem left i'm stuck at.

here's solution far:

for i,row in p.iterrows():   p.ix[i,'rowdelta'] = row['awbw']\    - p[(p['a'] == row.ab[2]) & (p['b'] == row.ab[0])]['awbw'].get(0,np.nan)

the problem p[(p['a'] == row.ab[2]) & (p['b'] == row.ab[0])]['awbw'] returns series either empty or has 1 element index variable.

now series.get method solves problem of returning nan when series empty wants definitive index value, in case use 0, can not dynamic index there.

i can not example

t = p[(p['a'] == row.ab[2]) & (p['b'] == row.ab[0])]['awbw'] t.get(t.index[0],np.nan)

because there no index if series empty , leads error when doing t.index[0]. same goes attempts using iloc.

is there way dynamically unknown 1 index of series if has 1 element (and never more one) while @ same time handling case of empty series?

credit goes pirsquared pointing me right direction solution:

ab = p.ab.str.split('_', expand=true) ab = ab.merge(ab, left_on=[0, 1], right_on=[1, 0],how='inner')[[0,1]] ab = ab.merge(p,left_on=[0,1], right_on=['a','b'])[['a','aw','b','bw']] ab = ab.merge(p,left_on=['a','b'], right_on=['b','a'])[['aw_x','bw_x','aw_y','bw_y','ab']] ab['rowdelta'] = ab.aw_y+ab.bw_y-ab.aw_x-ab.bw_x p = p.merge(ab[['ab','rowdelta']],on='ab',how='outer')

maybe can made shorter or nicer, works sure.

Thr

Search This Blog

python - Pandas: Get the only value of a series or nan if it does not exist -

Comments

Post a Comment