I am experimenting with the pandas loc indexer, using boolean arrays as arguments.
I created a small DataFrame to play with:
  col1  col2  col3  col4
0    a     1     2     3
1    b   NaN   NaN     6
2    c   NaN     8     9
3    d   NaN    11    12
4    e    13    14    15
5    f    17    18    19
6    g    21     2    23
And a boolean array to use on axis 1 to subset a number of columns:
a1 = pd.Series([True, False, True, False])
I then tried:
df.loc[: , a1]
I got an error message:
IndexingError: Unalignable boolean Series key provided
How can I apply the boolean array to select a subset of columns with loc?
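The boolean Series has a default integer index (0 to 3), which cannot be aligned with the column labels of df, hence the error. You need to convert the Series to a NumPy array with .values, so that the mask is applied by position:
print(df.loc[:, a1.values])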
  col1  col3
0    a   2.0
1    b   NaN
2    c   8.0
3    d  11.0
4    e  14.0
5    f  18.0
6    g   2.0
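A minimal, self-contained sketch of the positional approach, in case it helps to reproduce it. It assumes a reduced frame built from the first three rows of the question's data, and it uses .to_numpy(), which should behave the same as .values for this purpose. The variable names error_name and by_position are only illustrative.

```python
import numpy as np
import pandas as pd

# Reduced frame: the first three rows of the data in the question.
df = pd.DataFrame({
    "col1": ["a", "b", "c"],
    "col2": [1, np.nan, np.nan],
    "col3": [2, np.nan, 8],
    "col4": [3, 6, 9],
})

# Boolean Series with the default integer index 0..3.
a1 = pd.Series([True, False, True, False])

# The index 0..3 shares no labels with df.columns, so loc cannot align the mask.
try:
    df.loc[:, a1]
    error_name = None
except Exception as exc:
    error_name = type(exc).__name__
print(error_name)

# A plain NumPy array carries no index, so the mask is applied by position.
by_position = df.loc[:, a1.to_numpy()]
print(by_position.columns.tolist())
```

Because the array is matched purely by position, its length has to equal the number of columns.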
Alternatively, give the Series an index of df.columns, so that the index of the Series aligns with the columns of the DataFrame:
a1 = pd.Series([True, False, True, False], index=df.columns)
print(df.loc[:, a1])
  col1  col3
0    a   2.0
1    b   NaN
2    c   8.0
3    d  11.0
4    e  14.0
5    f  18.0
6    g   2.0
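A short sketch of what the alignment means in practice, assuming a two-row frame taken from rows 0 and 4 of the question's data. The names mask, shuffled, aligned and reordered are hypothetical. As far as I can tell, once the Series is labelled with the column names, loc matches the mask by label, so the order of the entries in the Series should not matter.

```python
import pandas as pd

# Reduced frame: rows 0 and 4 of the data in the question.
df = pd.DataFrame({
    "col1": ["a", "e"],
    "col2": [1, 13],
    "col3": [2, 14],
    "col4": [3, 15],
})

# Mask labelled with the column names, as in the answer above.
mask = pd.Series([True, False, True, False], index=df.columns)
aligned = df.loc[:, mask]

# Same labels in a different order: the selection follows labels, not positions.
shuffled = pd.Series({"col4": False, "col3": True, "col2": False, "col1": True})
reordered = df.loc[:, shuffled]

print(aligned.columns.tolist())
print(reordered.columns.tolist())
```

This is the main practical difference from the NumPy array version: a labelled mask keeps selecting the intended columns even if it was built in another order.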