When I try to fit scikit-learn's StandardScaler to my pandas DataFrame, I get the following error:
TypeError: Feature names are only supported if all input features have string names, but your input has ['str', 'str_'] as feature name / column name types. If you want feature names to be stored and validated, you must convert them all to strings, by using X.columns = X.columns.astype(str) for example. Otherwise you can remove feature / column names from your input data, or convert them all to a non-string data type.
This error occurs in this part of my code:
scaler.fit(data[map_keys])
Here data is a DataFrame and map_keys is a list containing only string values. Here is a sample from the data:
>> data[map_keys].head()
outputs:
      loss  revenue  visit_number  ...
1964   1.0      0.0           1.0  ...
1402   2.0      0.0           1.0  ...
2539   2.0      0.0           1.0  ...
86     2.0      0.0           1.0  ...
808    2.0      0.0           2.0  ...
To fix this, I tried converting all elements of map_keys to the built-in str type with:
map_keys = [str(k) for k in map_keys]
as some of the elements in the list were of type np.str_ when I first encountered the issue. But the error still persists...
Note that the scikit-learn version I use in this code is 1.2.1.
For me, the astype(str) solution did not work, so I worked around it with:
X = X.rename(str, axis="columns")
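To check whether mixed label types are really the cause, it can help to list the type names of the column labels, which is roughly what the error message reports. Below is a minimal sketch of that check. Note that label_type_names is a hypothetical helper (not part of pandas or scikit-learn), and the sample labels are made up to mimic the columns in the question.

```python
import numpy as np


def label_type_names(labels):
    """Return the sorted, de-duplicated type names of the given labels."""
    return sorted({type(label).__qualname__ for label in labels})


# Made-up labels (assumption): one built-in str and two NumPy strings (np.str_).
labels = ["loss", np.str_("revenue"), np.str_("visit_number")]
mixed = label_type_names(labels)

# Calling str() on each label yields built-in str objects only.
converted = label_type_names([str(label) for label in labels])

print(mixed)
print(converted)
```

On a real DataFrame, the helper could be called as label_type_names(X.columns) before and after the rename. The check concerns the labels stored in the DataFrame's columns, so converting only the selection list (map_keys) would presumably leave those labels unchanged, which may be why the error persisted in the question.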