Tags: python, arrays, pandas, boolean, pandas-loc

Pandas loc() method with boolean array on axis 1


I am experimenting with the Pandas loc() method, used with boolean arrays as arguments.

I created a small dataframe to play with:

    col1  col2  col3  col4
 0  a        1     2     3
 1  b      NaN   NaN     6
 2  c      NaN     8     9
 3  d      NaN    11    12
 4  e       13    14    15
 5  f       17    18    19
 6  g       21     2    23

And a boolean array to use on axis 1 to subset a number of columns:

 a1 = pd.Series([True, False, True, False])

I then tried:

 df.loc[: , a1]

I got an error message:

IndexingError: Unalignable boolean Series key provided

How can I apply the boolean array to subset a number of columns with loc()?


Solution

  • You need to convert the Series to a NumPy array with .values, so that loc applies the mask by position instead of trying to align it on the index:

    print (df.loc[: , a1.values])
      col1  col3
    0    a   2.0
    1    b   NaN
    2    c   8.0
    3    d  11.0
    4    e  14.0
    5    f  18.0
    6    g   2.0
    

    Or set the index of the Series to df.columns, so that the Series aligns with the columns of the DataFrame:

    a1 = pd.Series([True, False, True, False], index=df.columns)
    print (df.loc[: , a1])
      col1  col3
    0    a   2.0
    1    b   NaN
    2    c   8.0
    3    d  11.0
    4    e  14.0
    5    f  18.0
    6    g   2.0
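
For anyone who wants to reproduce this end to end, here is a minimal sketch that rebuilds the frame, triggers the error, and applies both fixes. The values are read off the table in the question (row 6 is taken as 21, 2, 23 to match the printed output), and the names mask, by_position and by_label are only illustrative. It uses to_numpy() in place of .values; both return the underlying NumPy array.

```python
import numpy as np
import pandas as pd

# Rebuild the small frame from the question (row 6 assumed to be 21, 2, 23).
df = pd.DataFrame({
    "col1": list("abcdefg"),
    "col2": [1, np.nan, np.nan, np.nan, 13, 17, 21],
    "col3": [2, np.nan, 8, 11, 14, 18, 2],
    "col4": [3, 6, 9, 12, 15, 19, 23],
})

# A boolean Series built without an index gets the default labels 0..3.
mask = pd.Series([True, False, True, False])

# Those labels do not match the column names, so loc cannot align the mask.
raised = False
try:
    df.loc[:, mask]
except Exception as exc:  # pandas raises IndexingError here
    raised = True
    print(type(exc).__name__, exc)

# Fix 1: drop the index so the mask is applied purely by position.
by_position = df.loc[:, mask.to_numpy()]

# Fix 2: label the mask with the column names so alignment succeeds.
aligned_mask = pd.Series(mask.to_numpy(), index=df.columns)
by_label = df.loc[:, aligned_mask]

print(by_position.columns.tolist())
print(by_label.equals(by_position))
```

The positional version is shorter, but it silently depends on the mask being in the same order as the columns. The labelled version is more explicit about which column each True/False belongs to.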