In SciPy, the CDF of a beta distribution on the interval [A, B] can be evaluated as follows:
import scipy.stats
p = scipy.stats.beta.cdf(x, alpha, beta, loc=A, scale=B-A)
Now, suppose I have a pandas DataFrame with the columns x, alpha, beta, A, B. How do I evaluate the beta CDF for each row and append the result as a new column?
I suspect that pandas apply just loops over the rows, and every call to a scipy.stats distribution method carries quite a bit of overhead. So rather than calling it once per row, I would pass whole columns to a single vectorized call:
>>> from scipy import stats
>>> df['p'] = stats.beta.cdf(df['x'], df['alpha'], df['beta'], loc=df['A'], scale=df['B']-df['A'])
>>> df
   A             B     alpha        beta          x         p
0  0  148000000000  1.501710  628.110247  640495496  0.858060
1  0  148000000000  1.501704  620.110000  640495440  0.853758
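For comparison, here is a small self-contained sketch that builds a DataFrame with the same column names and checks that the vectorized call gives the same values as a row-wise apply. The sample values are made up for illustration and are not the data from the question.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Hypothetical sample data (not the asker's); B must be greater than A.
df = pd.DataFrame({
    "x":     [0.2, 1.5, 3.0],
    "alpha": [2.0, 1.5, 0.5],
    "beta":  [5.0, 3.0, 0.5],
    "A":     [0.0, 1.0, 2.0],
    "B":     [1.0, 2.0, 4.0],
})

# Vectorized: one call, every argument is a whole column.
df["p"] = stats.beta.cdf(
    df["x"], df["alpha"], df["beta"],
    loc=df["A"], scale=df["B"] - df["A"],
)

# Row-wise equivalent: one scipy call per row, so it is expected to be slower.
p_rowwise = df.apply(
    lambda r: stats.beta.cdf(
        r["x"], r["alpha"], r["beta"],
        loc=r["A"], scale=r["B"] - r["A"],
    ),
    axis=1,
)

# Both approaches should agree up to floating-point error.
same = np.allclose(df["p"], p_rowwise)
```

Both versions compute the same column; the difference is only in how many times the scipy machinery is invoked, so it is worth timing them on your own data before deciding.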