I have a pandas Series that consists of zip codes such as '23123213', '212212', '54545', and some zip codes that look like '4344-4343', '7676432-878'. I wrote a function to convert each element of the series to a numeric format:
def numbers_zip_code(a):  # function for changing type of zip_code from object to number
    try:
        a = int(a)
    except ValueError:
        splitted_a = a.split('-')
        a = int(splitted_a[0]) + int(splitted_a[1])
    return a
I used this function:
X.zip_code = X.zip_code.map(lambda x: numbers_zip_code(x))
After that I could call the describe() method on this series:
X.zip_code.describe()
Results:
I also checked each element in the series using a for loop:
object_list = []  # create one list for object-type elements and another for numbers
number_list = []
for i in X.zip_code:
    if i == 'object':
        object_list.append(i)
    else:
        number_list.append(i)
len(object_list)
Result: 0
That looks fine, but when I check the whole DataFrame (the zip_code series is one of its columns), the type of this column (zip_code) is still 'object'! I really don't understand what I should do.
X.dtypes
Results:
Try:
X['zip_code'] = X['zip_code'].astype('int64')
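To illustrate, here is a minimal sketch of the whole round trip. The sample data is made up from the values in the question, and the column is forced to object dtype on purpose to mimic the situation described (that step is an assumption for illustration, not something the question shows). It prints the Python type of the stored elements, then converts the column with astype('int64'):

```python
import pandas as pd


def numbers_zip_code(a):
    """Convert a zip code string to int; 'A-B' becomes int(A) + int(B), as in the question."""
    try:
        return int(a)
    except ValueError:
        first, second = a.split('-')
        return int(first) + int(second)


# Hypothetical sample data built from the values mentioned in the question.
X = pd.DataFrame(
    {'zip_code': ['23123213', '212212', '54545', '4344-4343', '7676432-878']}
)

# Convert each element, then force object dtype to mimic the reported situation
# (assumption: the real column ended up as object for some reason not shown).
X['zip_code'] = X['zip_code'].map(numbers_zip_code).astype(object)
print(X.dtypes)

# The column dtype and the type of each stored element are two separate things.
# This shows which Python types the elements actually have.
print(X['zip_code'].map(type).unique())

# Explicitly convert the column to a numeric dtype.
X['zip_code'] = X['zip_code'].astype('int64')
print(X.dtypes)
```

Note that comparing each value to the string 'object', as the loop in the question does, does not test the element's type, so an empty object_list does not tell you anything about the column's dtype.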