Tags: python, pandas, skew, kurtosis

Calculation is done only on part of the table


I am trying to calculate the kurtosis and skewness of my data. I managed to create a table of results, but for some reason it covers only a few columns and not all of the fields.

For example, as you can see, I have many fields (columns): [image]

I calculate the skewness and kurtosis using the following code:

import pandas as pd

# data is the DataFrame shown above
sk = pd.DataFrame(data.skew())
kr = pd.DataFrame(data.kurtosis())
sk['kr'] = kr
sk.rename(columns={0: 'sk'}, inplace=True)

but then I get a result that contains only about half of the columns I have:

[image]

I have tried head(10), but that doesn't change the fact that some columns disappeared.

How can I calculate this for all the columns?


Solution

  • It is hard to reproduce the error because you did not share the original data. Most likely the columns that go missing contain non-numerical values, which would produce this behavior:

    import pandas as pd

    # Four identical numeric rows plus one row with a string in lg3
    dat = {"1": {'lg1': 0.12, 'lg2': 0.23, 'lg3': 0.34, 'lg4': 0.45},
           "2": {'lg1': 0.12, 'lg2': 0.23, 'lg3': 0.34, 'lg4': 0.45},
           "3": {'lg1': 0.12, 'lg2': 0.23, 'lg3': 0.34, 'lg4': 0.45},
           "4": {'lg1': 0.12, 'lg2': 0.23, 'lg3': 0.34, 'lg4': 0.45},
           "5": {'lg1': 0.12, 'lg2': 0.23, 'lg3': 'po', 'lg4': 0.45}}

    df = pd.DataFrame.from_dict(dat).T

    print(df)
        lg1   lg2   lg3   lg4
     1  0.12  0.23  0.34  0.45
     2  0.12  0.23  0.34  0.45
     3  0.12  0.23  0.34  0.45
     4  0.12  0.23  0.34  0.45
     5  0.12  0.23    po  0.45

    # lg3, the column containing the string, is missing from the result
    print(df.kurtosis())
    lg1    0
    lg2    0
    lg4    0
    

    The solution would be to preprocess the data so that every column is numeric before calling skew() and kurtosis().
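
    One possible way to do that preprocessing, as a minimal sketch: coerce each column with pd.to_numeric and errors='coerce', so entries that cannot be parsed become NaN and are skipped by the statistics. The toy frame below stands in for your real data, and the names clean and stats are only illustrative. Whether turning bad entries into NaN is acceptable depends on your data. As far as I know, recent pandas releases (2.0 and later) raise a TypeError on such columns instead of dropping them silently, so coercing first helps in either case.

```python
import pandas as pd

# Toy frame standing in for the real data: one stray string in column lg3.
dat = {str(i): {'lg1': 0.12, 'lg2': 0.23, 'lg3': 0.34, 'lg4': 0.45}
       for i in range(1, 6)}
dat["5"]['lg3'] = 'po'
df = pd.DataFrame.from_dict(dat).T

# Coerce every column to a numeric dtype; unparseable entries become NaN.
clean = df.apply(pd.to_numeric, errors='coerce')

# Build the combined table in one step; NaN entries are skipped by default.
stats = pd.DataFrame({'sk': clean.skew(), 'kr': clean.kurtosis()})
print(stats)
```

    With this, lg3 appears in the result alongside the other columns, computed from its four remaining numeric values.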

    One word of advice: check whether the error is consistent, i.e. are the same columns always missing from the result?
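
    A quick way to run that check, sketched under the assumption that the cause is stray non-numeric entries: list the columns pandas does not treat as numeric, then print the values in them that cannot be parsed as numbers. The frame data below is a hypothetical stand-in for yours, and offenders is just an illustrative name.

```python
import pandas as pd

# Hypothetical stand-in for the real `data`; column 'c' holds a stray string.
data = pd.DataFrame({'a': [1.0, 2.0, 3.0, 5.0],
                     'b': [2.0, 4.0, 4.0, 9.0],
                     'c': [1.0, 'po', 3.0, 4.0]})

# Columns without a numeric dtype are the candidates for going missing.
non_numeric = data.select_dtypes(exclude='number').columns.tolist()
print(non_numeric)

# For each candidate, collect the entries that cannot be parsed as numbers.
offenders = {}
for col in non_numeric:
    parsed = pd.to_numeric(data[col], errors='coerce')
    # Keep values that were present originally but turned into NaN when parsed.
    bad = data.loc[parsed.isna() & data[col].notna(), col]
    offenders[col] = bad.tolist()
print(offenders)
```

    If the columns listed here match the ones missing from your skew/kurtosis table, the non-numeric entries are very likely the cause.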