import pandas as pd
import statistics as st
def median_1(table):
    print(table.median())
def median_2(table):
    print(st.median(table))
# Reading the excel file and sorting the value according to the X column
file=pd.read_excel("C:\\Users\\hp\\Desktop\\alcohol.xls").sort_values("X")
#Forming the new index using list comprehension
index_row=[i+1 for i in range(len(file))]
#making the new index compatible
index_new=pd.Index(index_row)
#Extracting the column named X and converting it into dataframe
column_df=pd.DataFrame(file.loc[:,"X"])
#setting the new index
new=column_df.set_index(index_new)
median_1(new)
median_2(new)
median_1 prints the column name along with the median value, but it should print only the median value.
median_2 does not print the median value at all; it prints only the name of the column.
Output:
runfile('C:/Users/hp/Desktop/eg.py', wdir='C:/Users/hp/Desktop')
X 562.5
dtype: float64
X
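st.median() expects a sequence or iterable of numbers, not a DataFrame. Since new is a DataFrame, iterating over it yields the column labels rather than the values, so st.median() receives only the label X and returns it. You could specify the column when you pass the argument: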
median_2(new['X'])
# this will give you the median value without the column name
562.5
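The same works for df.median(), as used in your median_1 function. Calling .median() on a single column (a Series) returns a scalar instead of a Series labeled by column name: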
median_1(new['X'])
# this will also give you the median value without the column name
562.5
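
To see why the DataFrame version prints the column name, here is a minimal sketch. It assumes a small made-up DataFrame (the values 10, 20, 30, 40 are hypothetical and stand in for alcohol.xls, which is not available here), and the variable names are only for illustration.

```python
import statistics as st

import pandas as pd

# Hypothetical stand-in data; the original alcohol.xls is not available here.
df = pd.DataFrame({"X": [10, 20, 30, 40]})

# Iterating over a DataFrame yields its column labels, not its rows.
labels = list(df)

# st.median() therefore sees only the label "X" and returns it unchanged.
wrong = st.median(df)

# Passing the column (a Series) gives st.median() the actual numbers.
right = st.median(df["X"])

# Series.median() returns a scalar, while DataFrame.median() returns a
# Series indexed by column name, which is why "X" shows up in the printout.
scalar = df["X"].median()
per_column = df.median()

print(labels)
print(wrong)
print(right)
print(scalar)
print(per_column)
```

In both functions the fix is the same: pass new['X'] (a Series) rather than new (a DataFrame).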