I'm reading an Excel file into a DataFrame and need to strip whitespace from every string cell, leaving the other cells unchanged, in Python 3.5. For example:
from pandas import Series, DataFrame
import pandas as pd
import numpy as np
#read data from DataFrame
data_ThisYear_Period=[[' 序 号','北 京','上 海',' 广州'],\
[' 总计','11232',' 2334','3 4'],\
[' 温度','1223','23 23','2323'],\
['人 口','1232','21 321','1222'],\
['自行车', '1232', '21321', '12 22']]
data_LastYear_Period=DataFrame(data_ThisYear_Period)
print(type(data_LastYear_Period))
data_ThisYear_Period.apply(data_ThisYear_Period.str.strip(),axis=1)
Traceback (most recent call last):
  File "C:/test/temp.py", line 17, in <module>
    data_ThisYear_Period.apply(data_ThisYear_Period.str.strip(),axis=1)
AttributeError: 'list' object has no attribute 'apply'
How can I strip the whitespace from the DataFrame in this example?
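Use applymap on the DataFrame; it applies a function to each cell. In the lambda, split the string (split() with no arguments discards all whitespace) and then join the pieces back together. If a cell holds an int or another non-string value, an if/else in the lambda passes it through unchanged. Note that this removes whitespace inside the text as well, not only at the ends. (In pandas 2.1 and later, applymap is deprecated in favor of DataFrame.map.)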
from pandas import Series, DataFrame
import pandas as pd
import numpy as np
#read data from DataFrame
data_ThisYear_Period=[[' 序 号','北 京','上 海',' 广州'],\
[' 总计','11232',' 2334','3 4'],\
[' 温度','1223','23 23','2323'],\
['人 口',1232,'21 321','1222'],\
['自行车', '1232', '21321', '12 22']]
data_LastYear_Period=DataFrame(data_ThisYear_Period)
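print(data_LastYear_Period)
# Remove all whitespace from string cells; leave non-string cells (such as the int 1232) unchanged
data_LastYear_Period = data_LastYear_Period.applymap(lambda x: "".join(x.split()) if isinstance(x, str) else x)
print(data_LastYear_Period)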
This prints the DataFrame before and after the cleanup:
0 1 2 3
0 序 号 北 京 上 海 广州
1 总计 11232 2334 3 4
2 温度 1223 23 23 2323
3 人 口 1232 21 321 1222
4 自行车 1232 21321 12 22
0 1 2 3
0 序号 北京 上海 广州
1 总计 11232 2334 34
2 温度 1223 2323 2323
3 人口 1232 21321 1222
4 自行车 1232 21321 1222
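The split-and-join approach removes every whitespace character, including the ones inside a cell, whereas str.strip() only trims the two ends. Which one is wanted depends on the data, so here is a minimal sketch of both variants side by side. The helper names strip_edges and remove_all_whitespace and the small sample frame are made up for illustration; the sketch maps each column with Series.map so that it does not depend on applymap.

```python
import pandas as pd


def strip_edges(value):
    """Trim leading/trailing whitespace from strings; pass other values through."""
    return value.strip() if isinstance(value, str) else value


def remove_all_whitespace(value):
    """Drop every whitespace character from strings; pass other values through."""
    return "".join(value.split()) if isinstance(value, str) else value


# Small sample with one non-string cell (the int 1232).
df = pd.DataFrame([[" 序 号", "北 京"], ["人 口", 1232]])

# Apply the helper to every cell, one column (Series) at a time.
edges_only = df.apply(lambda col: col.map(strip_edges))
compact = df.apply(lambda col: col.map(remove_all_whitespace))

print(edges_only)
print(compact)
```

With strip_edges the inner space in a cell such as " 序 号" is kept and only the leading space goes; with remove_all_whitespace both are removed.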
On a side note, you are getting this particular error from the line
data_ThisYear_Period.apply(data_ThisYear_Period.str.strip(),axis=1)
because data_ThisYear_Period is a list and not a pandas DataFrame (the DataFrame is data_LastYear_Period).
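Calling the method on the DataFrame instead of the list is not quite enough on its own, because .str is an accessor on a Series (a single column), not on a whole DataFrame. A minimal sketch of what the original line was probably aiming for, assuming every cell is a string (the names rows, frame and stripped are placeholders):

```python
import pandas as pd

# Two rows borrowed from the question's sample data; all cells are strings.
rows = [[" 总计", "11232", " 2334", "3 4"],
        ["自行车", "1232", "21321", "12 22"]]

frame = pd.DataFrame(rows)  # .apply lives on the DataFrame, not on the list

# .str is available per column, so strip each column in turn.
stripped = frame.apply(lambda col: col.str.strip())

print(stripped)
```

This trims only leading and trailing whitespace. If a column mixes strings with other types, .str.strip() turns the non-string cells into NaN, which is why the per-cell type check in the answer above is the safer choice for mixed data.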