Tags: python, machine-learning, scipy, distribution

Python Error: rv_generic.interval() missing 1 required positional argument: 'confidence'


I have been trying to run the code below to calculate the upper and lower bounds of a confidence interval using the t distribution, but it keeps throwing the error in the title. The code is:

def trans_threshold(Day):
    Tran_Cnt=Tran_Cnt_DF[['Sample',Day]].dropna()
    Tran_Cnt=Tran_Cnt.astype({'Sample':'str'})
    Tran_Cnt.dtypes
    #Finding outliers in Materiality via IQR
    X_Tran = Tran_Cnt.drop('Sample', axis=1)
    Tran_arr1 = X_Tran.values
    #Finding the first quartile
    Tran_q1= np.quantile(Tran_arr1, 0.25)
    # finding the 3rd quartile
    Tran_q3 = np.quantile(Tran_arr1, 0.75)
    # finding the iqr region
    Tran_iqr = Tran_q3-Tran_q1
    # finding upper and lower outliers
    Tran_upper_bound = Tran_q3+(1.5*Tran_iqr)
    Tran_lower_bound = Tran_q1-(1.5*Tran_iqr)
    # removing outliers
    Tran_arr2 = Tran_arr1[(Tran_arr1 >= Tran_lower_bound) & (Tran_arr1 <= Tran_upper_bound)]
    #Using t distribution for Materiality Limits
    Tran_Threshold_mat=st.t.interval(alpha=0.99999999999, df=len(Tran_arr2-1),
             loc=np.mean(Tran_arr2),
             scale=st.sem(Tran_arr2))
    return Tran_Threshold_mat



trn_lim_FullFeed_Mon = trans_threshold(Day) 

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[106], line 19
     17 Tran_arr2 = Tran_arr1[(Tran_arr1 >= Tran_lower_bound) & (Tran_arr1 <= Tran_upper_bound)]
     18     #Using t distribution for Materiality Limits
---> 19 Tran_Threshold_mat=st.t.interval(alpha=0.99999999999, df=len(Tran_arr2-1),
     20                                  loc=np.mean(Tran_arr2),
     21                                  scale=st.sem(Tran_arr2))

TypeError: rv_generic.interval() missing 1 required positional argument: 'confidence'

The issue seems to be with the piece of code below. I have provided all the parameters I believe are required to calculate a confidence interval, including the degrees of freedom, but it still gives this error. Where am I going wrong, and what needs to be changed?

Tran_Threshold_mat=st.t.interval(alpha=0.99999999999, df=len(Tran_arr2-1),
                                 loc=np.mean(Tran_arr2),
                                 scale=st.sem(Tran_arr2))

Also, the Tran_arr2 array looks like this:

array([12617., 12000.,  1123.,   537.,  8605.,  4365., 11292., 12231.,
        7640.,  9583.,  9257., 13864., 14682., 11744., 10501.,  8694.,
        5327., 10066., 13022., 11092.,  7444., 11658., 14920., 12849.,
       14681.,  5719., 11029.,  3814., 14703.,  5593.,  9772.,  8851.,
        9551., 15975.,  6532., 13827.,  8547.])

Hence, there is no issue up until the last statement of the function, which estimates the confidence interval using the t distribution.

I have used the following imports and settings:

import pandas as pd
import numpy as np
import scipy.stats as st
import matplotlib.pyplot as plt
import matplotlib.ticker as tkr
import matplotlib.scale as mscale
from matplotlib.ticker import FixedLocator, NullFormatter
pd.options.display.float_format = '{:.0f}'.format
pd.options.mode.chained_assignment = None

Solution

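  • Note that the signature of scipy.stats.t.interval is interval(confidence, df, loc=0, scale=1). Recent SciPy releases have no alpha keyword (the parameter was renamed from alpha to confidence), so the call fails because the required confidence argument is never supplied. Either pass the value positionally as the first argument or rename the keyword to confidence.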
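
A minimal sketch of the corrected call, using the Tran_arr2 values from the question. The lowercase variable names are illustrative only. Passing the confidence level positionally should work whether the installed SciPy names the parameter alpha or confidence. As a side observation that goes beyond the error itself: len(Tran_arr2-1) subtracts 1 from every element and then takes the length, so it equals len(Tran_arr2); the sketch assumes the intended degrees of freedom were len(Tran_arr2) - 1.

```python
import numpy as np
import scipy.stats as st

# Values copied from the Tran_arr2 array shown in the question.
tran_arr2 = np.array([
    12617., 12000.,  1123.,   537.,  8605.,  4365., 11292., 12231.,
     7640.,  9583.,  9257., 13864., 14682., 11744., 10501.,  8694.,
     5327., 10066., 13022., 11092.,  7444., 11658., 14920., 12849.,
    14681.,  5719., 11029.,  3814., 14703.,  5593.,  9772.,  8851.,
     9551., 15975.,  6532., 13827.,  8547.,
])

# Confidence level is the first positional argument, so no keyword name is needed.
# Assumption: the degrees of freedom were meant to be n - 1, not len(arr - 1).
lower, upper = st.t.interval(
    0.99999999999,
    df=len(tran_arr2) - 1,
    loc=np.mean(tran_arr2),
    scale=st.sem(tran_arr2),
)

print(lower, upper)
```

On a SciPy version that already uses the new name, writing confidence=0.99999999999 as a keyword is equivalent to the positional form above.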