i coming pandas sql background, have used partition function aggregate 2 columns different level.
here dataframe
teamname playerid playerlevel 1 beginner 2 beginner 3 intermediate 4 intermediate 5 intermediate 6 advanced b 7 beginner b 8 beginner b 9 advanced b 10 intermediate b 11 beginner b 12 advanced
i want count players fall each player level, can using
.groupby(['teamname', 'playerlevel'], as_index=false) \ .agg({'playerid': 'count'})
this gets me this
teamname playerlevel playerid beginner 2 intermediate 3 advanced 1 b beginner 3 b intermediate 1 b advanced 2
but want "denominator", total number of players each team. example data frame (with columns renamed) (the denominator happens 6 both teams in example).
teamname playerlevel numerator denominator beginner 2 6 intermediate 3 6 advanced 1 6 b beginner 3 6 b intermediate 1 6 b advanced 2 6
but can't figure out how multiple grouby aggregations play nicely together.
per @root's suggestion , motivated @jeff's comments
happens equivalent @maxu's answer if looks different.
df1 = df.groupby(['teamname', 'playerlevel']).size().to_frame('numerator') df1['denominator'] = df1.groupby(level='teamname').transform(sum) df1
old answer
numerator = df.groupby(['teamname', 'playerlevel']).size().rename('numerator') numerator teamname playerlevel advanced 1 beginner 2 intermediate 3 b advanced 2 beginner 3 intermediate 1 name: numerator, dtype: int64
denominator = df.groupby(['teamname']).size().rename('denominator') denominator teamname 6 b 6 name: denominator, dtype: int64
numerator.to_frame().merge(denominator.to_frame(), right_index=true, left_index=true)
df.groupby(['teamname', 'playerlevel']).size().unstack() \ .div(df.groupby(['teamname']).size(), axis=0)
Comments
Post a Comment